The company I work for is a web hosting company, and its 'shared-hosting' environment is one of the more unusual, and in theory one of the most powerful, around (though I admit it has many pitfalls/shortcomings). It's a clustered environment w/ delineated storage segments. Each storage segment runs Solaris w/ ZFS for the file system. ZFS is a fancy-schmancy file system that lets you grow the size of the FS "dynamically" just by adding new disks when you need more space. It also has built-in support for redundancy, though I'm not using redundancy in my personal installation because the data stored in my ZFS array isn't that important.
Without redundancy, ZFS behaves like RAID 0: it's a striped array of disk drives, and the striping improves the array's throughput. The downside is that if you don't have redundancy enabled, you stand to lose a lot of data when one drive fails. I've had one drive fail to start in my external RAID enclosure; it cost me about 4GB of data, which sounds bad, but with about 2TB of data in the array that's a loss of well under 1%. So, not really that bad in a system where the data isn't critical.
Anyway, I built my ZFS array to get a better understanding of the file system and its use. I built it w/ tax-return money; at the time NewEgg had a "special" on 2TB drives (~$70 each), plus I used drives I already had available. So building it didn't set me back as much as you'd normally expect.
One thing I've noticed with ZFS in a Linux environment: because Sun licensed ZFS under the CDDL, which is generally considered incompatible with the kernel's GPL, the Linux kernel source doesn't include any module capable of reading this type of file system. So the most common workaround for end users is to run ZFS on top of FUSE. In my experience, this tends to load up the OS/CPU during large file transfers into the array; I hit loads of 2+ on a quad-core system w/ 8GB of RAM installed during large file transfers. The other option is a 3rd-party kernel module for ZFS, which means building the module against your kernel...
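If you're curious which route a given Linux box is using, a quick (hedged) sketch follows; it just greps the loaded-module list and the process table, so it's a rough probe rather than a definitive check:

```shell
#!/bin/sh
# Rough check: native ZFS shows up as a loaded kernel module,
# while ZFS-on-FUSE runs as a userspace daemon process.
if lsmod 2>/dev/null | grep -q '^zfs'; then
    echo "native kernel module"
elif ps -e 2>/dev/null | grep -q 'zfs-fuse'; then
    echo "zfs-fuse userspace daemon"
else
    echo "no ZFS support detected"
fi
```

On a box with neither installed it simply reports that no ZFS support was detected.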
The following is ~not~ the full output of my 'df' command. I've narrowed the output to the point of interest :wink:
<div class="ubbcode-block"><div class="ubbcode-header">Code:</div><div class="ubbcode-body ubbcode-pre" ><pre>max@home:~$ df -h
Filesystem Size Used Avail Use% Mounted on
storage/files 9.9T 2.5T 7.4T 25% /storage/files</pre></div></div>
<div class="ubbcode-block"><div class="ubbcode-header">Code:</div><div class="ubbcode-body ubbcode-pre" ><pre>max@home:~$ sudo zpool status
pool: storage
state: ONLINE
scrub: none requested
config:
NAME STATE READ WRITE CKSUM
storage ONLINE 0 0 0
disk/by-id/scsi-SATA_ST32000542AS_5XW13E49 ONLINE 0 0 0
disk/by-id/scsi-SATA_ST32000542AS_5XW168HH ONLINE 0 0 0
disk/by-id/scsi-SATA_ST32000542AS_5XW19132 ONLINE 0 0 0
disk/by-id/scsi-SATA_ST32000542AS_5XW1960W ONLINE 0 0 0
disk/by-id/scsi-SATA_WDC_WD20EADS-00_WD-WCAVY0657013 ONLINE 0 0 0
disk/by-id/scsi-SATA_WDC_WD10EACS-00_WD-WCAU41040824 ONLINE 0 0 0
errors: No known data errors</pre></div></div>
Yeah, 6 drives in the array. It started out w/ just 5, but after I wiped the 1TB drive (the SATA_WDC_WD10EACS-00_WD-WCAU41040824 drive) clean, I added it into the pool (array). ZFS let me add the drive to the pool w/o remounting the file system and w/o rebooting the computer.
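For reference, growing a stripe like that is a one-liner. A hedged sketch, assuming the pool is named `storage` as above (note `zpool add` needs root, requires a live ZFS install, and destroys whatever was on the new disk, so the real commands are shown as a dry run):

```shell
#!/bin/sh
# Sketch of growing a striped pool online. The real commands would be:
#
#   sudo zpool add storage disk/by-id/scsi-SATA_WDC_WD10EACS-00_WD-WCAU41040824
#   sudo zpool list storage   # the extra capacity shows up immediately
#
# No export/import, no remount, no reboot. Echoed here so the sketch
# is safe to execute on a machine without ZFS:
echo "zpool add storage disk/by-id/scsi-SATA_WDC_WD10EACS-00_WD-WCAU41040824"
```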
It looks like quite a bit of overhead, though: ~10% of the total drive space has disappeared. If you add up the space of the drives, I should have around 11TB, but the space available for use is 9.9TB. Most of that gap is actually units rather than overhead: drive makers count in decimal TB, while 'df -h' reports binary TiB, and 11 decimal TB works out to about 10 TiB. The rest goes to ZFS metadata and reservations; there's no parity to pay for here, since this pool has no redundancy.
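The units part of the "missing" space is easy to check with shell arithmetic. Sketch below, assuming the advertised capacities (5 x 2TB + 1 x 1TB = 11 decimal TB):

```shell
#!/bin/sh
# Convert the marketing (decimal) terabytes into the binary TiB that
# df -h actually reports: 1 TB = 10^12 bytes, 1 TiB = 2^40 bytes.
raw_tb=11
raw_bytes=$((raw_tb * 1000 * 1000 * 1000 * 1000))
tib=$((raw_bytes / (1024 * 1024 * 1024 * 1024)))
echo "$raw_tb marketing TB = ~$tib TiB"
```

That lands at ~10 TiB before ZFS touches anything, so the 9.9T figure from 'df' leaves only a sliver to the file system itself.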