Today Marvell is announcing its next-generation OCTEON Fusion CNF95XX baseband processors, as well as introducing a new generation of OCTEON TX2 infrastructure processors. Together, the two product lines target 5G infrastructure equipment providers and are designed to be paired with each other, greatly improving upon the capabilities of Marvell’s product stack.

OCTEON Fusion CNF95XX: A 5G Baseband Processor

We don’t tend to cover such announcements very often, so the first order of business is to define what a baseband processor actually is.

5G’s O-RAN (Open RAN or Open Radio Access Network) architecture is defined at different layers representing different functions of the protocol stack. The stack is divided into 3 layers, with the first layer being the physical layer (PHY), essentially handling the encoding, modulation and RF processing involved in transforming digital signals for the analog RF system. In essence, this is the “modem” part of the baseband system, and it’s here that Marvell’s new CNF95XX baseband processor falls in. We’ll be getting to layers 2 and 3 a bit later in the article.

The new OCTEON Fusion family is Marvell’s first baseband processor line-up designed with support for 5G capability, following several years of successful 3G and 4G products. It’s worth mentioning that Marvell is still a relative newcomer to the macro cell business, with its first products, the CNF75XX, only being released back in 2015. Marvell is also one of the rare “merchant” providers of such solutions, as most other vendors in the industry largely rely on in-house designs.

In a market of big network equipment providers such as Huawei, Nokia, Ericsson and ZTE, Marvell’s big win here is being the silicon provider for Samsung’s 5G base station equipment; Marvell says the products announced today are already in volume production.

Enabling the cellular 5G PHY capabilities of the chip are 40+2 DSP cores which handle the software-defined architecture of the subsystem. Alongside the custom VLIW DSP cores, we see a plethora of fixed function 4G/5G specific hardware accelerator blocks related to encoding and decoding of signals. The various blocks are interconnected by a high-speed fabric and the memory subsystem here is backed by a large 24MB shared memory cache. Interfacing to the radio units (RF) is enabled by 6x 25G SerDes links, enabling an industry standard RoE (Radio over Ethernet) / eCPRI connection.

On the control side of the chip, we find Marvell employing the in-house microarchitecture it inherited from Cavium in the form of TX2 Arm v8.2 CPU cores. We don’t have too much information on the cores, but they do seem to differ a bit from the IP used in the ThunderX2 server chip, as the cache configuration is listed as a 66KB instruction cache and a 41KB L1D. Those are odd numbers; the L1D probably ends up being 5-way associative, but I can’t figure out the instruction cache configuration, and we haven’t had time to reach out to Marvell on this peculiarity. Another possibility is that they’re counting parity and ECC bits to end up at the current figures.
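
The associativity guess can be sanity-checked with a little arithmetic. The sketch below is not based on any Marvell disclosure: it assumes 64-byte cache lines and a power-of-two number of sets (both common, neither confirmed for these cores), and the helper name `possible_ways` is our own.

```python
def possible_ways(size_kb, line_bytes=64):
    """Return the associativities that give a power-of-two number of sets.

    line_bytes=64 is an assumption; the actual line size is not disclosed.
    """
    lines, remainder = divmod(size_kb * 1024, line_bytes)
    if remainder:
        return []  # capacity is not a whole number of cache lines
    ways = []
    sets = 1
    while sets <= lines:
        if lines % sets == 0:
            ways.append(lines // sets)
        sets *= 2
    return sorted(ways)


# Compare the quoted figures (41KB, 66KB) with the nearest "round" sizes.
for size_kb in (40, 41, 64, 66):
    print(f"{size_kb}KB: {possible_ways(size_kb)}")
```

Under these assumptions, a 5-way organisation only shows up for a 40KB array, while a true 41KB array would need at least 41 ways. That is consistent with the idea that the quoted figures include parity or ECC bits on top of 40KB and 64KB of data, though it does not confirm it.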

There are 6 of these CPU cores, clocked at 2.6GHz. On the CPU cluster side we see a shared 1.25MB MLC, essentially acting as an L2 to the CPUs, and the CPU complex is connected through a low-latency crossbar interconnect to the memory subsystem. Here we see an up to 3.5MB last-level cache in front of triple-channel PC-level 72b DDR4-2666 memory controllers (we have yet to confirm with Marvell the discrepancy between 2x and 3x channels in the slide deck).
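
To put the memory configuration into perspective, here is a rough estimate of theoretical peak DRAM bandwidth for both channel counts mentioned in the slide deck. It assumes that 64 of the 72 bits per channel carry data and the remaining 8 carry ECC, which is the usual arrangement but not something Marvell has confirmed; the function name is our own.

```python
def ddr_peak_gb_per_s(mega_transfers_per_s, channels, data_bits=64):
    """Theoretical peak bandwidth in GB/s (10^9 bytes per second).

    data_bits=64 assumes the other 8 bits of each 72b channel are ECC.
    """
    bytes_per_transfer = data_bits // 8
    return mega_transfers_per_s * 1e6 * bytes_per_transfer * channels / 1e9


# DDR4-2666 with 2 or 3 channels, the two figures found in the slides.
for channels in (2, 3):
    peak = ddr_peak_gb_per_s(2666, channels)
    print(f"{channels} channels: {peak:.1f} GB/s")
```

That works out to roughly 43GB/s for two channels and 64GB/s for three. These are upper bounds; sustained bandwidth in practice will be lower.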

Connecting the CNF95XX to the lower layers and networking are 4x 25G SerDes links, enabling Ethernet support for up to 100G, and there’s of course packet acceleration support.

Alongside offering complete merchant silicon solutions such as the CNF95XX, Marvell has an interesting business model in which it offers various levels of customisation, enabling customers to differentiate their products and tailor them to their exact needs. A semi-custom approach would be to have Marvell integrate a customer’s IP into a custom solution, but what’s even more interesting and disruptive here is that Marvell is open to licensing its OCTEON Fusion IP for customers to design full custom chips of their own. We don’t know if the likes of Samsung are doing this, but for Marvell to actually offer such a business model, there must have been interest in the industry.

Putting things together, a baseband station as you would see it in the wild on cell towers would look like the above setup, with the CNF95XX sitting at the centre enabling the radio units, and paired with a layer 2 & 3 processor such as the TX2 CN92XX.

OCTEON TX2: An Infrastructure Processor For Networking

Sitting behind the baseband processor in 5G deployments, but also designed for use in various other infrastructure and network applications, is the new OCTEON TX2. The aim for such designs is relatively simple, but hard to achieve: move around and process enormous amounts of data as fast as possible.

At the heart of the designs we see up to 5x 100G media access controllers, enabling up to 200Gbps of data path throughput with the help of TX2 custom processors, scaling up from 4 cores to 36 cores in the biggest design.

The products aren’t just a fit for 5G deployments, but cover various other networking, compute and data-centre markets, as their design is inherently flexible in the kinds of workloads it can be employed for.

We see specialised packet processors and hardware accelerators implemented in the design, helping power efficiency and performance of the solution. Here Marvell claims the chipsets are manufactured on a “leading-edge” process technology, but the company wouldn’t comment on the specifics of what this actually meant.

The new SKUs announced today span 4 SKU families, all differing in their processing power and throughput capabilities.

The lower-end CN913X is quite a small system that targets SOHO (small office / home office) as well as SMB (small and medium business) deployments. It comprises 4 Cortex-A72 CPU cores, but still manages to deliver significant network throughput of up to 3x 10G Ethernet plus an additional 6x 1 / 2.5G connections, all in a small power envelope of only 9-14W.

Today’s more interesting products are the new CN92XX, CN96XX and CN98XX line-up, offering significant processing and throughput capabilities.

The chip designs here scale up to 36 processor cores, again using Marvell’s own TX2 microarchitecture Arm v8.2 CPUs. The shared MLC inside the CPU cluster goes up to 8MB in the 30-36 core CN98XX and up to 5MB in the CN92XX and CN96XX models. The memory subsystem also scales with the core count across the SKUs: the last-level cache comes in at 8MB, 14MB and 21MB, while the memory controller count increases from 2 channels on the lower model to 3 channels and up to 6 channels on the biggest design, all offering DDR-3200 capability.

Marvell claims to have the highest SPECint_rate SoC in its class with the new design.

The most important part of the new products is their network throughput and acceleration capabilities. The CN96XX and higher offer 100G integrated Ethernet capability, in configurations of 3x 100G or 12x 25G, up to 5x 100G or 20x 25G.
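
The two options listed at each tier represent the same aggregate port capacity split across a different number of links. A minimal sketch of that arithmetic follows; the configuration labels are ours, and the figures are raw port capacity rather than measured throughput.

```python
def aggregate_gbps(ports, gbps_per_port):
    """Total raw Ethernet port capacity in Gbps."""
    return ports * gbps_per_port


# Port configurations quoted above, as (port count, Gbps per port).
configs = {
    "3x 100G": (3, 100),
    "12x 25G": (12, 25),
    "5x 100G": (5, 100),
    "20x 25G": (20, 25),
}

for label, (ports, speed) in configs.items():
    print(f"{label}: {aggregate_gbps(ports, speed)} Gbps")
```

Both lower-tier options add up to 300Gbps and both top-tier options to 500Gbps of port capacity, which sits above the up to 200Gbps of data path throughput Marvell quotes for the biggest design.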

Marvell also boasts a lot of packet processing throughput with IPSEC handling of up to 200Gbps in the highest tier.

The Competition

Marvell’s competition in this space consists of vendors such as Intel, which offer similar products. It should be noted that the comparisons in the presentation are a bit outdated, as they’re positioned against Intel’s previous-generation solutions, which included combinations such as Xeon-D processors paired with dedicated NICs as well as FPGAs for packet processing. Such solutions have now been superseded by the recently announced “Snow Ridge” / Atom P5900 processors, which essentially compete in the same target market as Marvell’s processor line-up.

However, Intel didn’t talk in detail about availability of its new Snow Ridge platform, while Marvell’s chips are in mass production and being deployed right now. In terms of performance comparisons to Intel’s previous gen solution, it’s a very one-sided battle in favour of Marvell, at significantly lower power. As a note, Intel still hasn’t published any figures on Snow Ridge TDPs.

Whilst the comparison here will again be outdated for deployments later in the year, it shows Marvell’s advantage on the connectivity side: integrating packet acceleration as well as networking in a single SoC massively reduces the power and total cost of the system.

Overall, Marvell is confident that it’s able to hold a leadership position in the market, and the CN98XX, upcoming in Q2, does seemingly scale beyond what competitors are able to offer. With 5G and cloud data-centres exploding over the coming years, we’ll be sure to hear a lot more of such products, and it’s definitely an interesting part of the industry.

8 Comments

  • SarahKerrigan - Monday, March 2, 2020

    Based on core count, no mention of multithreading, and cache sizes... I think these may actually be based on the original ThunderX2 developed by Cavium, not the Vulcan design acquired from Marvell.

    -66+41K L1 lines up better with the original 64+40K announced for Cavium-TX2 than with Vulcan's 32+32 caches.

    -Vulcan goes to 32 cores @ 2.2GHz in 180W, so 36 cores @ 120ish with a bunch of extra peripherals seems unlikely unless there's a shrink. Cavium's inhouse TX2 design targeted higher core counts than Vulcan.

    -No multithreading is mentioned. This is something of a headline feature for Vulcan.

    I think we're looking at a microarchitecture that was widely assumed dead.
  • SarahKerrigan - Monday, March 2, 2020

    And, one more addition: Cavium TX2 (the original design, not Vulcan) was supposed to have a 32MB LLC for 54 cores. That scales exactly with what we're seeing for Octeon TX2 - 21MB LLC at 36 cores.

    At this point I think I'd put money on this being based on Cavium's original TX2 microarchitecture rather than Vulcan.
  • SarahKerrigan - Monday, March 2, 2020

    Ugh, "Acquired from Broadcom", not Marvell. An edit button would be nice!
  • dotjaz - Thursday, March 5, 2020

    That's pretty much confirmed. Based on GCC commit not the specs. Cavium is explicitly mentioned.

    /* Cavium ('C') cores. */
    AARCH64_CORE("octeontx2", octeontx2, cortexa57, 8_2A, AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_CRYPTO | AARCH64_FL_PROFILE, cortexa57, 0x43, 0x0b0, -1)
    https://github.com/gcc-mirror/gcc/blob/master/gcc/...
  • RallJ - Monday, March 2, 2020

    > However, Intel didn’t talk in detail about availability of its new Snow Ridge platform, while Marvell’s chips are in mass production and being deployed right now.

    Marvell website says:

    > Marvell’s OCTEON TX2 CN9130, CN92xx and CN96xx are available now with reference designs and development kits. Marvell’s CN98xx will begin sampling in the second quarter of 2020.

    Clearly they won't be in mass production until 2021 timeframe at the earliest.
  • anonomouse - Tuesday, March 3, 2020

    Any chance you’ll be able to reach out to Marvell and get some more details on the uarch of the cores used? As SarahKerrigan points out it seems likely to be a derivative of the original ThunderX uarch, which would be interesting as with all of the other Arm vendors seemingly going to all stock core IP Marvell is in the unusual position of fielding two different lines of custom uarch.

    And I’m guessing based on their reticence to talk about node that it’s not 7nm, probably Samsung 14nm or TSMC 16/12.
  • songhai.wang - Thursday, March 12, 2020

    How about the capability of the CNF95xx of supporting MassiveMIMO as the major product in 5G?
  • saju_n - Wednesday, June 15, 2022

    I have a question related to the architectural deployment of the Octeon Fusion.
    1. Small cell w/ Split 6
    Does Octeon Fusion support nFAPI termination, so it could interwork with any (ie non Octeon TX2) vendor's L2+ solution ?

    2. Macro cell w/ Split 7.2
    In this case, the Octeon Fusion + Octeon TX2 (any alternate) would be co-located, in the O-RAN defined O-DU. In this case, what is the connection between the Fusion and TX2 - Is it Ethernet, OR, does it support any alternate, like a FAPI-over-PCIE (as in competing products) ?
