ZFS - Building, Testing, and Benchmarking
by Matt Breitbach on October 5, 2010 4:33 PM EST
Posted in: IT Computing, Linux, NAS, Nexenta, ZFS
Other Cool ZFS Features
There are many features that we have not touched on in this article, and they are worth mentioning here because they are enterprise features available with both OpenSolaris and Nexenta. These are features that the Promise M610i cannot compete with in any way.
Block Level Deduplication - ZFS can employ block level deduplication, meaning it can detect identical blocks and keep just one copy of the data. This can significantly reduce storage costs, and possibly improve performance when the circumstances allow. One group that recently deployed a Nexenta instance had originally configured the system for 2TB of storage. They were using 1.4TB at the time and wanted to have room to grow. By enabling deduplication they were able to shrink the actual used space on the drives to just under 800GB. This also has implications when randomly accessing data. If you have multiple copies of the same data spread out all over a hard drive, the drive has to seek to find that data. If the data is actually stored in only one place, you can potentially reduce the number of seeks that your drives have to do to retrieve it.
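The idea behind block level deduplication can be sketched in a few lines of Python. This is only a toy model, not how ZFS implements it: the 4096-byte block size, the in-memory dictionary, and the function names dedup_store, rebuild, and stored_bytes are all assumptions made for illustration. Each block is identified by its SHA-256 checksum, and a block whose checksum has already been seen is not stored a second time.

```python
import hashlib

BLOCK_SIZE = 4096  # illustrative block size (assumption, not a ZFS setting)


def dedup_store(data: bytes, block_size: int = BLOCK_SIZE):
    """Split data into fixed-size blocks and keep one copy per unique block."""
    store = {}      # checksum -> block contents, stored only once
    block_map = []  # ordered list of checksums needed to rebuild the data
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)  # only the first copy is kept
        block_map.append(digest)
    return store, block_map


def rebuild(store, block_map) -> bytes:
    """Reassemble the original data from the block map."""
    return b"".join(store[digest] for digest in block_map)


def stored_bytes(store) -> int:
    """Space actually consumed by the unique blocks."""
    return sum(len(block) for block in store.values())


if __name__ == "__main__":
    # Ten identical blocks followed by one different block.
    data = (b"A" * BLOCK_SIZE) * 10 + b"B" * BLOCK_SIZE
    store, block_map = dedup_store(data)
    print("logical bytes:", len(data))
    print("stored bytes: ", stored_bytes(store))
    print("round trip ok:", rebuild(store, block_map) == data)
```

In this model the savings depend entirely on how many blocks are identical, which is why workloads with many near-identical copies of the same data tend to benefit the most.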
Compression - ZFS also offers native compression similar to gzip compression. This allows you to save space at the expense of CPU and memory usage. For a system that is simply used for archiving data, this could be a great money and space saver. For a system that is being actively used as a database server, compression may not be the best idea.
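The space-versus-CPU trade-off can be illustrated with Python's zlib module, which implements DEFLATE, the algorithm behind gzip. This is a rough sketch rather than a measurement of ZFS itself: the sample data, the compression level, and the helper name compression_ratio are assumptions chosen for the example. Repetitive, log-like data shrinks dramatically, while random data (a stand-in for content that is already compressed or encrypted) costs CPU time and saves nothing.

```python
import os
import zlib


def compression_ratio(data: bytes, level: int = 6) -> float:
    """Return original size divided by compressed size for the given data."""
    compressed = zlib.compress(data, level)
    # Confirm the round trip is lossless before reporting a ratio.
    assert zlib.decompress(compressed) == data
    return len(data) / len(compressed)


# Highly repetitive data, similar in spirit to archived log files.
text_like = b"2010-10-05 16:33:00 INFO request served in 12 ms\n" * 2000

# Random bytes, standing in for already-compressed or encrypted data.
random_like = os.urandom(len(text_like))

if __name__ == "__main__":
    print("repetitive data ratio: %.1f" % compression_ratio(text_like))
    print("random data ratio:     %.3f" % compression_ratio(random_like))
```

A ratio near 1.0 means the CPU cycles bought no space, which is the situation to watch for before enabling compression on a busy system.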
Snapshot Shipping - OpenSolaris and Nexenta also offer snapshot shipping. This allows you to snapshot the entire storage array and back it up via SSH to a remote server. Once you ship the initial snapshot, only incremental data changes are shipped, so you can conserve bandwidth while still replicating your data to a remote location. Keep in mind that this is not continuous block level replication but a point in time snapshot, so any data written after the snapshot is taken is not shipped to the remote system as part of that snapshot.
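The difference between the initial full shipment and later incremental ones can be modeled with a small Python sketch. On a real system this is typically done with zfs send and zfs receive piped over SSH; the code below is only a toy model in which a storage pool is a dictionary of block id to contents, and the names take_snapshot, incremental, and receive are hypothetical helpers invented for this example.

```python
def take_snapshot(live: dict) -> dict:
    """Freeze a point-in-time copy of the live data (block id -> contents)."""
    return dict(live)


def incremental(old_snap: dict, new_snap: dict):
    """Compute what must be shipped to turn old_snap into new_snap."""
    changed = {k: v for k, v in new_snap.items() if old_snap.get(k) != v}
    deleted = [k for k in old_snap if k not in new_snap]
    return changed, deleted


def receive(replica: dict, changed: dict, deleted: list) -> None:
    """Apply a shipped stream to the remote replica."""
    replica.update(changed)
    for k in deleted:
        replica.pop(k, None)


if __name__ == "__main__":
    live = {0: b"boot", 1: b"data-v1", 2: b"logs"}
    replica = {}

    # Initial shipment: everything is sent because the replica is empty.
    snap1 = take_snapshot(live)
    receive(replica, *incremental({}, snap1))

    # Later changes: one block modified, one added, one removed.
    live[1] = b"data-v2"
    live[3] = b"new"
    del live[2]
    snap2 = take_snapshot(live)
    changed, deleted = incremental(snap1, snap2)
    print("blocks shipped:", sorted(changed), "blocks deleted:", deleted)
    receive(replica, changed, deleted)

    # Data written after snap2 is not on the replica until the next shipment.
    live[4] = b"written after the snapshot"
    print("replica matches snap2:", replica == snap2)
    print("replica has block 4:  ", 4 in replica)
```

The last two lines of the demo illustrate the caveat in the paragraph above: the replica is only ever as current as the most recent snapshot that was shipped.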
102 Comments
diamondsw2 - Tuesday, October 5, 2010 - link
You're not doing your readers any favors by conflating the terms NAS and SAN. NAS devices (such as what you've described here) are Network Attached Storage, accessed over Ethernet, and usually via fileshares (NFS, CIFS, even AFP) with file-level access. SAN is Storage Area Network, nearly always implemented with Fibre Channel, and offers block-level access. About the only gray area is that iSCSI allows block-level access to a NAS, but that doesn't magically turn it into a SAN with a storage fabric.
Honestly, given the problems I've seen with NAS devices and the burden a well-designed one will put on a switch backplane, I just don't see the point for anything outside the smallest installations where the storage is tied to a handful of servers. By the time you have a NAS set up *well* you're inevitably going to start taxing your switches, which leads to setting up dedicated storage switches, which means... you might as well have set up a real SAN with 8Gbps Fibre Channel and been done with it.
NAS is great for home use - no special hardware and cabling, and options as cheap as you want to go - but it's a pretty poor way to handle centralized storage in the datacenter.
cdillon - Tuesday, October 5, 2010 - link
The terms NAS and SAN have become rightfully mixed, because modern storage appliances can do the jobs of both. Add some FC HBAs to the above ZFS storage system and create some FC Targets using Comstar in OpenSolaris or Nexenta and guess what? You've got a "SAN" box. Nexenta can even do active/active failover and everything else that makes it worthy of being called a true "Enterprise SAN" solution.
I like our FC SAN here, but holy cow is it expensive, and it's not getting any cheaper as time goes on. I foresee iSCSI via plain 10G Ethernet and also FCoE (which is 10G Ethernet + FC sharing the same physical HBA and data link) completely taking over the Fibre Channel market within the next decade, which will only serve to completely erase the line between "NAS" and "SAN".
mbreitba - Tuesday, October 5, 2010 - link
The systems as configured in this article are block level storage devices accessed over a gigabit network using iSCSI. I would strongly consider that a SAN device over a NAS device. Also, the storage network is segregated onto a separate network already, isolated from the primary network.
We also backed this device with 20Gbps InfiniBand, but had issues getting the IB network stable, so we did not include it in the article.
Maveric007 - Tuesday, October 5, 2010 - link
I find iSCSI is closer to a NAS than a SAN, to be honest. The performance difference between iSCSI and SAN is much bigger than the one between iSCSI and NAS.
Mattbreitbach - Tuesday, October 5, 2010 - link
iSCSI is block based storage, NAS is file based. The transport used is irrelevant. We could use iSCSI over 10GbE, or over InfiniBand, which would increase the performance significantly, and probably exceed what is available on the most expensive 8Gb FC available.
mino - Tuesday, October 5, 2010 - link
You are confusing the NAS vs. SAN terminology with the interconnects terminology and vice versa.
SAN, NAS, DAS ... are abstract methods by which a data client accesses the stored data.
--Network Attached Storage (NAS), per definition, is a file/entity-based data storage solution.
- - - It is _usually_but_not_necessarily_ connected to a general-purpose data network
--Storage Area Network (SAN), per definition, is a block-access-based data storage solution.
- - - It is _usually_but_not_necessarily_THE_ dedicated data network.
Ethernet, FC, Infiniband, ... are physical data conduits; they are the ones that define which PERFORMANCE class a solution belongs in
iSCSI, SAS, FC, NFS, CIFS ... are logical conduits; they are the ones that define which FEATURE CLASS a solution belongs in
Today, most storage appliances allow for multiple ways to access the data, many of them simultaneously.
Therefore, presently:
Calling a storage appliance, of whatever type, a "SAN" is pure jargon.
- It has nothing to do with the device "being" a SAN per se
Calling an appliance, of whatever type, a "NAS" means it is/will be used in the NAS role.
- It has nothing to do with the device "being" a NAS per se.
mkruer - Tuesday, October 5, 2010 - link
I think there needs to be a new term called SANNAS or snaz, short for snazzy.
mmrezaie - Wednesday, October 6, 2010 - link
Thanks, I learned a lot.
signal-lost - Friday, October 8, 2010 - link
Depends on the hardware, sir.
My iSCSI Datacore SAN pushes 20k IOPS for the same reason that their ZFS does it (RAM caching).
Fibre Channel SANs will always outperform iSCSI run over crappy switching.
Currently Fibre Channel maxes out at 8Gbps in most arrays. Even with MPIO, you're better off with an iSCSI system and 10/40Gbps Ethernet if you do it right. Much cheaper, and you don't have to learn an entirely new networking model (Fibre Channel or Infiniband).
MGSsancho - Tuesday, October 5, 2010 - link
While it is technically a SAN, you can easily make it a NAS with a simple zfs set sharesmb=on, as I am sure you are aware.