The Memblaze PBlaze5 C916 Enterprise SSD Review: High Performance and High Capacities
by Billy Tallis on March 13, 2019 9:05 AM EST

The biggest players in the enterprise SSD market are familiar names to anyone following the consumer side of the business: Samsung, Intel, Micron, and the other vertically-integrated NAND flash manufacturers lead both segments. The second-tier brands, by contrast, show very little overlap between the consumer and enterprise markets, even though their business models are quite similar. In both markets, these second-tier fabless SSD manufacturers buy NAND and SSD controllers from the larger suppliers and build their own drives around those components. With drives based on commodity hardware, they rely on custom firmware to differentiate their products.
A prime example of one of these fabless companies – and the subject of today's review – is Beijing-based Memblaze. The company has made a name for itself in the enterprise space over several generations of high-end NVMe SSDs, starting in 2015 with their PBlaze4. In 2017 they released the first round of PBlaze5 SSDs, which moved to Micron's first-generation 3D TLC NAND.
Most recently, late last year a new generation of PBlaze5 SSDs with Micron's 64-layer 3D TLC began to arrive. Today we're looking at the flagship from this latest generation, the PBlaze5 C916 6.4TB SSD. In addition to its large capacity, this drive features a PCIe 3.0 x8 interface that allows for speeds in excess of 4GB/s, and a high write endurance rating that makes it suitable for a broad range of workloads. The only way to get significantly higher performance or endurance from a single SSD is to switch to something that uses specialized low-latency memory like Intel's 3D XPoint, but their Optane SSDs only offer a fraction of the capacity.
Memblaze PBlaze5 AIC Series Comparison
|            | C916       | C910         | C900   | C700    |
| Controller | Microsemi Flashtec PM8607 NVMe2016              |
| NAND Flash | Micron 64L 3D TLC         | Micron 32L 3D TLC  |
| Capacities | 3.2-6.4 TB | 3.84-7.68 TB | 2-8 TB | 2-11 TB |
| Endurance  | 3 DWPD     | 1 DWPD       | 3 DWPD | 1 DWPD  |
| Warranty   | Five years                                      |
The PBlaze5 family of enterprise SSDs are all relatively high-end, but the product line has broadened to include quite a few different models. The models that start with C are PCIe add-in cards, with PCIe 3.0 x8 interfaces that allow for higher throughput than the PCIe x4 links that most NVMe SSDs are limited to. The models that start with D are U.2 drives that support operation as either a PCIe x4 device or dual-port x2+x2 for high availability configurations. Memblaze offers models in two endurance tiers: 1 or 3 drive writes per day, reflecting the trend away from 5+ DWPD models as capacities have grown and alternatives like 3D XPoint and Z-NAND have arrived to serve the most write-intensive workloads.
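For a sense of what those endurance tiers mean in absolute terms, a DWPD rating converts directly into total bytes written over the warranty period. Below is a quick back-of-the-envelope sketch for the 6.4TB, 3 DWPD C916 reviewed here; it is an illustration based on the figures in the tables in this article, not an official Memblaze TBW rating.

```python
# Convert a DWPD endurance rating into total writes over the warranty period.
# Capacity, DWPD, and warranty length are taken from the spec tables in this
# review; the resulting figure is illustrative, not an official TBW rating.
capacity_tb = 6.4        # usable capacity of the PBlaze5 C916 sample
dwpd = 3                 # rated drive writes per day
warranty_years = 5

total_writes_tb = capacity_tb * dwpd * 365 * warranty_years
print(f"{total_writes_tb:,.0f} TB written over the warranty "
      f"(~{total_writes_tb / 1000:.0f} PB)")
# -> roughly 35,000 TB, or about 35 PB
```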
The add-in card models are more performance-focused, while the U.2 lineup includes both the highest capacities (currently 15.36 TB), as well as some models designed for lower power and capacities so that a thinner 7mm U.2 case can be used.
Most of the PBlaze5 family uses the Microsemi (formerly PMC-Sierra) Flashtec NVMe2016 controller, one of the most powerful SSD controllers currently on the market. The 16-channel NVMe2016 and the even larger 32-channel NVMe2032 face little competition from the usual suppliers of SSD controllers for the consumer market, though in the past year both Silicon Motion and Marvell have announced 16-channel controller solutions derived from the combination of two of their 8-channel controllers. Instead, the competition for the NVMe2016 comes from the largest in-house controllers developed by companies like Samsung, Intel and Toshiba, as well as Xilinx FPGAs that are used to implement custom controller architectures for other vendors. All of these controller solutions are strictly for the enterprise/datacenter market, and are unsuitable for consumer SSDs: the pin count necessary for 16 or more NAND channels makes these controllers too big to fit on M.2 cards, and they are too power-hungry for notebooks.
Micron's 64-layer 3D TLC NAND has consistently proven to offer higher performance than their first-generation 32L TLC, but Memblaze isn't advertising any big performance increases over the earlier PBlaze5 SSDs. Instead, they have brought the overprovisioning ratios back down to fairly normal levels after the 32L PBlaze5 drives. Those drives were rated for 3 DWPD, and as a result kept almost 40% of their raw flash capacity as spare area. The PBlaze5 C916 with 64L TLC, on the other hand, reserves only about 27% of the flash as spare and suffers only a slight penalty to steady-state write speeds, and no penalty to rated endurance. (For comparison, consumer SSDs generally reserve 7-12% of their raw capacity for metadata and spare area, and are usually rated for no more than about 1 DWPD.)
Our 6.4TB PBlaze5 C916 sample features a total of 8TiB of NAND flash in 32 packages each containing four 512Gb dies. This makes for a fairly full PCB, with 16 packages each on the front and back. There is also 9GB of DDR4 DRAM on board, providing the usual 1GB per TB, plus ECC protection for the DRAM.
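The overprovisioning figure quoted above is easy to verify from the raw and usable capacities. Here is a minimal sketch of that arithmetic; the DRAM breakdown is an inference from the usual 1GB-per-TB rule of thumb rather than something Memblaze documents explicitly.

```python
# Overprovisioning arithmetic for the 6.4TB PBlaze5 C916 sample:
# 8 TiB of raw flash (32 packages x 4 dies x 512Gb) vs. 6.4 TB usable.
raw_bytes = 8 * 2**40          # 8 TiB of NAND on board
usable_bytes = 6.4 * 10**12    # 6.4 TB advertised capacity

spare_fraction = 1 - usable_bytes / raw_bytes
print(f"spare area: {spare_fraction:.1%}")   # ~27.2%, matching the quoted figure

# DRAM sizing follows the usual ~1GB of DRAM per TB of flash; the ninth
# gigabyte presumably provides the ECC protection mentioned above
# (an inference, not a documented breakdown).
dram_gb = 8 + 1
print(f"DRAM: {dram_gb} GB for {raw_bytes / 2**40:.0f} TiB of raw flash")
```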
Memblaze PBlaze5 C916 Series Specifications
|                           | PBlaze5 C916            | PBlaze5 C900                 |
| Form Factor               | HHHL AIC                | HHHL AIC                     |
| Interface                 | PCIe 3.0 x8             | PCIe 3.0 x8                  |
| Controller                | Microsemi Flashtec PM8607 NVMe2016                     |
| Protocol                  | NVMe 1.2a                                              |
| DRAM                      | Micron DDR4-2400                                       |
| NAND Flash                | Micron 512Gb 64L 3D TLC | Micron 384Gb 32L 3D TLC      |
| Capacities (TB)           | 3.2 / 6.4               | 2 / 3.2 / 4 / 8              |
| Sequential Read (GB/s)    | 5.5 / 5.9               | 5.3 / 6.0 / 5.9 / 5.5        |
| Sequential Write (GB/s)   | 3.1 / 3.8               | 2.2 / 3.2 / 3.8 / 3.8        |
| Random Read (4kB) IOPS    | 850k / 1000k            | 823k / 1005k / 1010k / 1001k |
| Random Write (4kB) IOPS   | 210k / 303k             | 235k / 288k / 335k / 348k    |
| Read Latency (4kB)        | 87 µs                   | 93 µs                        |
| Write Latency (4kB)       | 11 µs                   | 15 µs                        |
| Power (Idle / Operating)  | 7 W / 25 W                                             |
| Endurance                 | 3 DWPD                                                 |
| Warranty                  | Five years                                             |
Diving into the performance specs for the PBlaze5 C916 compared to its immediate predecessor, we see that the 6.4TB C916 should mostly match the fastest 4TB C900 model, but steady-state random write performance is rated to be about 10% slower. The smaller 3.2TB C916 shows more significant performance drops compared to the 3.2TB C900, but in terms of cost it is better viewed as a replacement for the old 2TB model. Random read and write latencies are rated to be a few microseconds faster on the C916 with 64L TLC than the C900 with 32L TLC.
The C916 is rated for the same 7W idle and 25W maximum power draw as the earlier PBlaze5 SSDs. However, Memblaze has made a few changes to the power management features. The C900 series included power states to limit the drive to 20W or 15W, but the C916 can be throttled all the way down to 10W and provides a total of 16 power states, allowing the limit to be tuned in 1W increments between 10W and 25W. We've never encountered an NVMe SSD with this many power states before, and it seems a bit excessive.
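Assuming these limits are exposed through the standard NVMe Power Management feature (feature ID 0x02) rather than a vendor-specific mechanism, they can be inspected and selected from Linux with nvme-cli. The sketch below shows roughly what that looks like; the device path and the mapping of state index to wattage are placeholders, since the actual numbering has to be read from the drive's identify data.

```python
# A sketch of listing and selecting NVMe power states from Linux, assuming the
# drive's 1W-granularity limits are exposed as standard power states and are
# selectable through nvme-cli. /dev/nvme0 and the chosen index are placeholders.
import json
import subprocess

DEV = "/dev/nvme0"

# List the power state descriptors the controller advertises in its
# identify-controller data (raw values; units depend on the reported scale).
id_ctrl = json.loads(
    subprocess.check_output(["nvme", "id-ctrl", DEV, "-o", "json"]))
for i, ps in enumerate(id_ctrl.get("psds", [])):
    print(f"power state {i}: max_power field = {ps.get('max_power')}")

# Select a lower power state. Index 15 would correspond to the 10W limit if
# the C916 numbers its 16 states from 25W down to 10W -- an assumption here.
subprocess.run(["nvme", "set-feature", DEV, "-f", "0x02", "-v", "15"],
               check=True)

# Read the feature back to confirm which state is now active.
subprocess.run(["nvme", "get-feature", DEV, "-f", "0x02", "-H"], check=True)
```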
Ultimately the lower power states don't make much sense for the C916 because most PCIe x8 slots have no trouble delivering 25W and enough airflow to cool the drive. However, the D916 in the U.2 form factor is harder to cool, and the configurable power limit may come in handy for some systems. So for this review, the C916 was run through the test suite twice: once with the default 25W power state, and once in the lowest 10W limit state to see what workloads are affected and how the drive's QoS holds up during throttling.
In addition to its flexible power management, the PBlaze5 supports several of the more advanced NVMe features that are often left out on entry-level enterprise drives. The drive supports 128 NVMe queues, so all but the largest servers will be able to assign one queue to each CPU core, allowing IO to be performed without core-to-core locking or synchronization. Many older enterprise NVMe SSDs we have tested are limited to 32 queues, which is less than optimal for our 36-core testbed. To complement the dual-port capability of the U.2 version, the firmware supports multipath IO, multiple namespaces, and reservations to coordinate access to namespaces between different hosts connected to the same PCIe fabric. The PBlaze5 C916 does not yet support features from NVMe 1.3 or the upcoming 1.4 specification.
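The queue negotiation itself happens in the NVMe driver, but it is easy to check from userspace how many queues a controller will grant and whether that covers the host's core count. Below is a small sketch along the same lines as above, again with a placeholder device path.

```python
# Check how many I/O queues a controller reports versus the host's CPU count --
# the "one queue per core" condition discussed above. Feature 0x07 is the
# standard NVMe Number of Queues feature; /dev/nvme0 is a placeholder.
import os
import subprocess

DEV = "/dev/nvme0"

out = subprocess.check_output(
    ["nvme", "get-feature", DEV, "-f", "0x07"], text=True)
print(out.strip())   # reports the granted submission/completion queue counts

cpus = os.cpu_count()
print(f"host has {cpus} CPUs; a drive with 128 queues covers them "
      f"{'fully' if cpus <= 128 else 'only partially'}")
```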
13 Comments
Samus - Wednesday, March 13, 2019 - link
That. Capacitor.

Billy Tallis - Wednesday, March 13, 2019 - link
Yes, sometimes "power loss protection capacitor" doesn't need to be plural. 1800µF 35V Nichicon, BTW, since my photos didn't catch the label.

willis936 - Wednesday, March 13, 2019 - link
That's 3.78W for one minute if they're running at the maximum voltage rating (which they shouldn't and probably don't), if anyone's curious.

DominionSeraph - Wednesday, March 13, 2019 - link
It's cute, isn't it?
https://www.amazon.com/BOSS-Audio-CPBK2-2-Capacito...

takeshi7 - Wednesday, March 13, 2019 - link
I wish companies made consumer PCIe x8 SSDs. It would be good since many motherboards can split the PCIe lanes x8/x8 and SLI is falling out of favor anyways.

surt - Wednesday, March 13, 2019 - link
I bet 90% of motherboard buyers would prefer 2 x16 slots vs any other configuration so they can run 1 GPU and 1 very fast SSD. I really don't understand why the market hasn't moved in this direction.

MFinn3333 - Wednesday, March 13, 2019 - link
Because SSD's have a hard time saturating 4x PCIe slots, 16x would just take up space for no real purpose.

Midwayman - Wednesday, March 13, 2019 - link
Maybe, but it sucks that your GPU gets moved to 8x. 16/4 would be an easier split to live with.

bananaforscale - Thursday, March 14, 2019 - link
Not really, GPUs are typically bottlenecked by local memory (VRAM), not PCIe.

Opencg - Wednesday, March 13, 2019 - link
performance would not be very noticeable. and even in the few cases it would be, it would require more expensive cpus and mobos thus mitigating the attractiveness to very few consumers. and fewer consumers means even higher prices. we will get higher throughput but its much more likely with pci 4.0/5.0 than 2 16x