Enterprise NVMe Round-Up 2: SK Hynix, Samsung, DapuStor and DERA
by Billy Tallis on February 14, 2020 11:15 AM EST
QD1 Random Read Performance
Drive throughput with a queue depth of one is usually not advertised, but almost every latency or consistency metric reported on a spec sheet is measured at QD1 and usually for 4kB transfers. When the drive only has one command to work on at a time, there's nothing to get in the way of it offering its best-case access latency. Performance at such light loads is absolutely not what most of these drives are made for, but they have to make it through the easy tests before we move on to the more realistic challenges.
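Because only one command is in flight at QD1, the IOPS figure and the mean latency are two views of the same measurement: each is the reciprocal of the other. A minimal sketch in Python, where the function names and the 100µs example latency are ours and chosen purely for illustration (they are not measurements from any drive in this review), and where 4kB is assumed to mean 4096 bytes:

```python
def qd1_iops(mean_latency_us: float) -> float:
    """IOPS sustained at queue depth 1 for a given mean latency in microseconds."""
    if mean_latency_us <= 0:
        raise ValueError("latency must be positive")
    # With one command outstanding, each IO takes one full latency period.
    return 1_000_000 / mean_latency_us


def qd1_throughput_mb_s(mean_latency_us: float, block_bytes: int = 4096) -> float:
    """Throughput in MB/s (decimal megabytes) implied by the QD1 latency."""
    return qd1_iops(mean_latency_us) * block_bytes / 1_000_000


# Hypothetical drive with a 100 us mean 4kB read latency.
example_iops = qd1_iops(100.0)
example_mb_s = qd1_throughput_mb_s(100.0)
print(example_iops)            # 10000.0
print(round(example_mb_s, 2))  # 40.96
```

This is why QD1 throughput is rarely advertised: even a respectable latency translates into only tens of MB/s for 4kB transfers.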
Random read performance at QD1 is mostly determined by the inherent latency of the underlying storage medium. Since most of these SSDs are using 64+ layer 3D TLC, they're all on a fairly even footing. The SK hynix PE6011 is the slowest of the new NVMe drives while the Dapu Haishen3 and Samsung PM1725a are among the fastest.
Chart: Power Efficiency in kIOPS/W, Average Power in W
The drives with big 16-channel controllers have the worst power efficiency at low load, because they idle in the 6-7W range. The DERA SSDs, Samsung PM1725a and Memblaze PBlaze5 are all in the same ballpark. The SK hynix PE6011 is a clear step up, and the Dapu Haishen3s are the most efficient of the new drives on this test. The SATA drives and Samsung's low-power 983 DCT still have higher efficiency ratings because they're under 2W during this test.
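The efficiency metric is simply performance divided by average power, so at QD1 the denominator does most of the work. A rough sketch, assuming two hypothetical drives that both deliver 10,000 IOPS (an invented figure) while drawing 6.5W and 1.9W, values picked only to fall inside the power ranges described above:

```python
def kiops_per_watt(iops: float, avg_power_w: float) -> float:
    """Power efficiency in kIOPS/W: thousands of IOPS per watt of average draw."""
    if avg_power_w <= 0:
        raise ValueError("average power must be positive")
    return iops / 1000 / avg_power_w


# Hypothetical drives with identical QD1 performance but different power draw.
big_controller = kiops_per_watt(10_000, 6.5)  # 16-channel class, assumed 6.5 W
low_power = kiops_per_watt(10_000, 1.9)       # low-power class, assumed 1.9 W

print(round(big_controller, 2))  # 1.54
print(round(low_power, 2))       # 5.26
```

With equal performance, the drive that draws less than a third of the power scores more than three times higher, which is the pattern the low-power drives show here.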
Flipping the numbers around to look at latency instead of IOPS, we see that the DERA drives seem to have surprisingly high tail latencies comparable to the old Micron SATA drive and far in excess of any of the other NVMe drives. The rest of the new NVMe drives in our collection have great QoS out to four 9s.
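"Four 9s" refers to the 99.99th percentile: the latency that all but one in ten thousand IOs stay under. The sketch below shows one common way to compute such a figure, the nearest-rank method. That choice is our assumption, since this section doesn't describe how the percentiles were actually calculated, and the sample data is synthetic:

```python
import statistics


def latency_at_nines(samples_us, nines: int):
    """Nearest-rank percentile for 'N nines' (2 -> 99th, 4 -> 99.99th)."""
    if not samples_us:
        raise ValueError("need at least one sample")
    if nines < 1:
        raise ValueError("nines must be >= 1")
    ordered = sorted(samples_us)
    n = len(ordered)
    # rank = ceil(n * (1 - 10**-nines)), done in integers to avoid float error
    rank = n - n // 10**nines
    return ordered[rank - 1]


# Synthetic trace: 10,000 reads, almost all fast, with a handful of slow outliers.
trace = [100] * 9900 + [300] * 98 + [5000] * 2

print(statistics.mean(trace))      # 102.94
print(latency_at_nines(trace, 2))  # 100
print(latency_at_nines(trace, 4))  # 5000
```

Two slow IOs out of ten thousand barely move the mean, but they set the 99.99th percentile outright, which is why a drive can look fine on average latency and still post poor tail latency.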
There aren't any surprises when looking at random read performance across a range of block sizes. All of the new NVMe drives have constant IOPS for block sizes of 4kB and smaller, and for larger block sizes IOPS decreases but throughput grows significantly.
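That shape falls out of a very simple cost model: a fixed access latency that is paid in full for anything up to 4kB, plus a transfer time proportional to the extra bytes beyond that. The sketch below is a toy model only. The 80µs base latency and 2000MB/s transfer rate are invented parameters, not figures for any drive tested here:

```python
def model_qd1_read(block_bytes: int,
                   base_latency_us: float = 80.0,   # assumed fixed access cost
                   transfer_mb_s: float = 2000.0,   # assumed transfer rate
                   min_unit: int = 4096):
    """Return (IOPS, MB/s) for a toy QD1 read model at the given block size."""
    # Reads smaller than the minimum unit cost the same as a full unit.
    extra_bytes = max(block_bytes, min_unit) - min_unit
    # bytes / (MB/s) yields microseconds when MB means 10**6 bytes.
    latency_us = base_latency_us + extra_bytes / transfer_mb_s
    iops = 1_000_000 / latency_us
    return iops, iops * block_bytes / 1_000_000


for size in (512, 4096, 131072):
    iops, mb_s = model_qd1_read(size)
    print(f"{size:>7} B: {iops:8.0f} IOPS, {mb_s:7.1f} MB/s")
```

In this model the 512-byte and 4kB rows show the same IOPS, while the 128kB row shows lower IOPS and far higher throughput, matching the general trend in the block size results.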
QD1 Random Write Performance
Random write performance at QD1 is mostly a matter of delivering the data into the SSD's cache and getting back an acknowledgement; the performance of the NAND flash itself doesn't factor in much until the drive is busier with a higher queue depth. The exception here is the Optane SSD, which doesn't have or need a DRAM cache. Between the fastest and slowest flash-based NVMe SSDs here we're only looking at about a 30% difference. The SK hynix PE6011 and Samsung PM1725a are a bit on the slow side, while the DERA SSDs are among the fastest.
Chart: Power Efficiency in kIOPS/W, Average Power in W
Power draw during this test is generally higher than for the QD1 random read test, but the pattern of bigger SSD controllers being less efficient still mostly holds true. The Dapu Haishen3 and SK hynix PE6011 are the most efficient of our new NVMe drives, and are also helped some by their lower capacity: the 2TB models don't have to spend as much power on keeping DRAM and NAND chips awake.
All the drives start to show elevated tail latency when we go out to four 9s, but the SK hynix PE6011 and Samsung PM1725a also have issues at the 99th percentile level (as does the Intel P4510). The Dapu Haishen3 drives have the best QoS scores on this test even though their average latency is a few microseconds slower than the fastest flash-based SSDs in this batch.
Looking at random write performance for different block sizes reveals major differences between drives. Everything has obviously been optimized to offer peak IOPS with 4kB block size (except for the Optane SSD). However, several drives do so at the expense of vastly lower performance on sub-4kB block sizes. The DapuStor Haishen3 and DERA SSDs join the Memblaze PBlaze5 on the list of drives that maybe shouldn't even offer the option of operating with 512-byte sector sizes. For those drives, IOPS falls by a factor of 4-5x and they seem to be bottlenecked by doing a read-modify-write cycle in order to support small block writes.
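To illustrate why a read-modify-write cycle would hurt, here is a toy drive in Python that tracks data in 4096-byte units. The 4096-byte granularity and the class itself are assumptions made for illustration, not a description of any vendor's firmware. A write that covers a whole unit can simply replace it, but a smaller write first has to fetch the existing unit so the untouched bytes survive:

```python
UNIT = 4096  # assumed internal mapping granularity, in bytes


class ToyDrive:
    """Hypothetical drive that can only store whole UNIT-sized chunks."""

    def __init__(self, units: int):
        self.media = [bytes(UNIT) for _ in range(units)]
        self.unit_reads = 0   # internal reads triggered by partial writes
        self.unit_writes = 0  # internal unit-sized writes

    def write(self, offset: int, data: bytes) -> None:
        pos, end = offset, offset + len(data)
        while pos < end:
            idx, start = divmod(pos, UNIT)
            length = min(UNIT - start, end - pos)
            chunk = data[pos - offset:pos - offset + length]
            if length == UNIT:
                # Full-unit write: no need to look at the old contents.
                self.media[idx] = bytes(chunk)
            else:
                # Partial write: read the old unit, patch it, write it back.
                merged = bytearray(self.media[idx])
                self.unit_reads += 1
                merged[start:start + length] = chunk
                self.media[idx] = bytes(merged)
            self.unit_writes += 1
            pos += length


drive = ToyDrive(units=4)
drive.write(0, b"\xff" * 4096)         # aligned 4kB write
print(drive.unit_reads, drive.unit_writes)  # 0 1
drive.write(4096 + 512, b"\xaa" * 512)  # 512-byte write inside unit 1
print(drive.unit_reads, drive.unit_writes)  # 1 2
```

The 512-byte write moves an eighth of the data but costs an extra internal read on top of a full-unit write, so it's plausible for small-block IOPS to land well below the 4kB figure.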
QD1 Sequential Read Performance
When performing sequential reads of 128kB blocks, QD1 isn't enough for any of these drives to really stretch their legs. Unlike consumer SSDs, most of these drives seem to be doing little or no readahead caching, which is probably a reasonable decision for heavily multi-user environments where IO is less predictable. It does lead to lackluster performance numbers, with none of our new drives breaking 1GB/s. The DERA SSDs are fastest of the new bunch, but are only half as fast on this test as the Intel P4510 or Samsung 983 DCT.
Chart: Power Efficiency in MB/s/W, Average Power in W
Even though we're starting to get up to non-trivial throughput with this test, the power efficiency scores are still dominated by the baseline idle power draw of these SSDs. The 16-channel drives are mostly in the 8-9W range (DERA, Samsung PM1725a) while the 8-channel drives are around half that. The DapuStor Haishen3 drives are the most efficient of our new drives, but are still clearly a ways behind the Intel P4510 and Samsung 983 DCT that are much faster on this test.
All of the new NVMe drives in our collection are still showing a lot of performance growth by the time the block size test reaches 1MB reads. At that point, they've all at least caught up with the handful of other drives that performed very well on the QD1/128kB sequential read test, but it's clear that they need either a higher queue depth or even larger block sizes in order to make the most of their theoretical throughput.
QD1 Sequential Write Performance
A few different effects are at play during our QD1 sequential write test. The drives were preconditioned with a few full drive writes before the test, so they're at or near steady-state when this test begins. This leads to the general pattern of larger drives or drives with more overprovisioning performing better, because they can more easily free up a block to accept new writes. However, at QD1 the drives are getting a bit of idle time when waiting on the host system to deliver the next write command, and that results in poor link utilization and fairly low top speeds. It also compresses the spread of scores slightly compared to what the spec sheets indicate we'll see at high queue depths.
The DapuStor Haishen3 drives stand out as the best performers in the 2TB class; they break the pattern of better performance from bigger drives and are performing on par with the 8TB class drives with comparable overprovisioning ratios.
Chart: Power Efficiency in MB/s/W, Average Power in W
The 1.6TB DapuStor Haishen3 H3100 stands out as the most efficient flash-based NVMe SSD on this test, by a fairly wide margin. Its QD1 sequential write performance is similar to the 8TB drives with 16-channel controllers, but the Haishen3 H3100 is also tied for lowest power consumption among the NVMe drives: just under 7W compared to a maximum of over 18W for the 8TB DERA D5437. The Haishen3 H3000's efficiency score is more in line with the rest of the competition, because its lower overprovisioning ratio forces it to spend quite a bit more power on background flash management even at this low queue depth.
In contrast to our random write block size test, for sequential writes extremely poor small-block write performance seems to be the norm rather than the exception; most of these drives don't take kindly to sub-4kB writes. Increasing block sizes past 128kB up to at least 1MB doesn't help the sequential write performance of these drives; in order to hit the speeds advertised on the spec sheets, we need to go beyond QD1.
33 Comments
PaulHoule - Friday, February 14, 2020 - link
"The Samsung PM1725a is strictly speaking outdated, having been succeeded by a PM1725b with newer 3D NAND and a PM1735 with PCIe 4.0. But it's still a flagship model from the top SSD manufacturer, and we don't get to test those very often."
Why? If you've got so much ink for DRAMless and other attempts to produce a drive with HDD costs and SSD performance (hopefully warning people away?) why can't you find some for flagship products from major manufacturers?
Billy Tallis - Friday, February 14, 2020 - link
The division of Samsung that manages the PM17xx products doesn't really do PR. We only got this drive to play with because MyDigitalDiscount wanted an independent review of the drive they're selling a few thousand of.
The Samsung 983 DCT is managed by a different division than the PM983, and that's why we got to review the 983 DCT, 983 ZET, 883 DCT, and so on. But that division hasn't done a channel/retail version of Samsung's top of the line enterprise drive.
romrunning - Friday, February 14, 2020 - link
Too bad you don't get more samples of the enterprise stuff. I mean, you have both influencers, recommenders, and straight-up buyers of enterprise storage who read Anandtech.
Billy Tallis - Friday, February 14, 2020 - link
Some of it is just that I haven't tried very hard to get more enterprise stuff. It worked okay for my schedule to spend 5 weeks straight testing enterprise drives because we didn't have many consumer drives launch over the winter. But during other times of the year, it's tough to justify the time investment of updating a test suite and re-testing a lot of drives. That's part of why this is a 4-vendor roundup instead of 4 separate reviews.
Since this new test suite seems to be working out okay so far, I'll probably do a few more enterprise drives over the next few months. Kingston already sent me a server boot drive after CES, without even asking me. Kioxia has expressed interest in sampling me some stuff. A few vendors have said they expect to have XL-NAND drives real soon, so I need to hit up Samsung for some Z-NAND drives to retest and hopefully keep this time.
And I'll probably run some of these drives through the consumer test suite for kicks, and upload the results to Bench like I did for one of the PBlaze5s and some of the Samsung DCTs.
PandaBear - Friday, February 14, 2020 - link
ESSD firmware engineer here (and yes I have worked in one of the company above). Enterprise business are mostly selling to large system builder so Anandtech is not really "influence" or "recommend" for enterprise business. There are way more requirements than just 99.99 latency and throughput, and buyers tend to focus on the worst case scenarios than the peak best cases. Oh, pricing matters a lot. You need to be cheap enough to make it to the top 3-4 or else you lose a lot of businesses, even if you are qualified.
RobJoy - Tuesday, February 18, 2020 - link
Well these are Intel owners here.
Anything PCIe 4.0 has not even crossed their minds, and are patiently waiting for Intel to move their ass.
No chance in hell they dare going AMD Rome way even if it performs better and costs less.
romrunning - Friday, February 14, 2020 - link
This article makes my love of the P4800X even stronger! :) If only they could get the capacity higher and the pricing lower - true of all storage, though especially desired for Optane-based drives.
curufinwewins - Friday, February 14, 2020 - link
100% agreed, it's such a paradigm shifter by comparison.
eek2121 - Friday, February 14, 2020 - link
Next gen Optane is supposed to significantly raise both capacity and performance. Hopefully Intel is smart and prices their SSD based Optane solutions at a competitive price point.
curufinwewins - Friday, February 14, 2020 - link
Ok, great stuff Billy! I know it wasn't really the focus of this review, but dang, I actually came out ludicrously impressed with how very small quantities of first gen optane on relatively low channel installments have such a radically different (and almost always in a good way) behavior to flash. Definitely looking forward to the next generation of this product.