CPU Year In Review 2021: Efficient Cores Are The New Bling
by Dr. Ian Cutress on December 30, 2021 8:00 AM EST
As far as most years go, 2021 has been an up-and-down year when it comes to the desktop CPU market. At the beginning of the year, the best CPUs on the market were almost impossible to find, and when they were in stock, it was only at prices above the suggested pricing. Now at the end of the year, processors are plentiful, but the needle has swung in the other direction when it comes to performance. Here’s a rundown of the fun year that 2021 has been.
Brief Recap: End of Year 2020
The rush at the end of Q4 in 2020 was quite substantial. In the space of those final three months, AMD went on a spree and launched the widest range of its 7nm products into the market. We’re talking graphics, mobile processors, ramping up enterprise CPUs, and for the purposes of this article, Zen 3-based Ryzen 5000 desktop processors. Ryzen 5000 took the crown as the best processor on the market, up to 16 cores, with a new high-performance microarchitecture. The problem was that they were hard to get hold of, for a long time.
AMD wasn’t the only one in this situation though. The global semiconductor shortages, most of which were tied to packaging issues relating to high-performance films, meant that Intel Comet Lake 10th Gen CPUs, AMD Ryzen 5000 CPUs, and all the GPUs were gold dust. Looking for a CPU at the time meant being lucky at a store, or having watcher bots online.
Thankfully, the situation got better through 2021.
The Early Interviews from Intel and AMD
First up, the initial 2021 news for us at AnandTech was a pair of roundtable interviews. Between AMD CEO Dr. Lisa Su and then-CEO of Intel Bob Swan, we had the opportunity to probe both about the future of each company as well as the situation of the semiconductor supply shortage.
Dr. Su reiterated AMD’s commitment to x86, along with investing in its own supply chain to ensure the future availability of all the 7nm products it had just released. There was also mention of the Xilinx deal, the large amount of design wins on Ryzen Mobile for notebooks and laptops, and also the enterprise EPYC platform.
With Swan, it was a different sort of interview – with hindsight Bob had a foot out of the door, but we didn’t know that at the time. He focused on Intel keeping its fabs, outsourcing chips, and sub-licensing process technology – basically an early version of new CEO Pat Gelsinger’s IDM 2.0 strategy.
While neither of those interviews are directly CPU news, it’s worth highlighting them.
2021 Q1: January through March
Sitting here at the end of 2021, with Intel’s 12th Generation sitting on my desk, it’s hard to believe that 12 months ago, Intel was still serving us their 10th Generation platform. The first CPU review out of the gate in the New Year was our review of Intel’s Core i9-10850K. This was a 10-core Comet Lake processor, built on the Skylake++++ platform, that Intel created because its Core i9-10900K was specified at too high a frequency to bin enough hardware to provide the number of units demanded by the ecosystem (or at least that’s the common theory). The 10850K was similar to the 10900K but slightly lower frequency, for around the same power, and slightly cheaper. It became the preferred Intel CPU for a while, just because it was more widely available and Intel could make more of them. It still trailed behind AMD’s latest, but both were hard to acquire.
On the mobile side, we reviewed AMD’s Ryzen 9 5980HS. This is the flagship 35 W eight-core processor for the current generation of AMD’s thin-and-light but gaming-focused notebooks. We tested the ROG Flow X13, a 13-inch design paired with a GTX 1650 discrete graphics inside with a 3840x2400 display at 120 Hz and the premium of everything for that class of device. One of the criticisms of AMD’s mobile platform to this point was the lack of premium experiences for thin-and-light but powerful designs, and this sort of design was a strong design win for AMD. The question later down the line is if/when AMD could pair its own mobile CPUs to its own mobile GPUs.
As we crossed from February to March, more AMD news came by virtue of the Threadripper Pro 3000 family, with up to 64 Zen 2 cores coming to the workstation market. While it launched in mid-2020, Lenovo had the exclusive, and it was around this time that the processors finally made it onto shelves for everyone to use. We tested the Lenovo hardware with the big 64-core 3995WX, but more end-user hardware for the 32-core and 16-core parts. TR Pro offered the performance of Threadripper, but with a full 128 lanes of PCIe, eight channels of memory, and ECC support.
Also in February, Pat Gelsinger became Intel's CEO. In March, he quickly hosted an event that outlined Intel's future.
Deeper into March, Intel launched its new 11th Generation Rocket Lake processors. These were interesting parts for the industry, as Intel had retrofitted its 10nm processor cores for its latest generation 14nm, as the latter at the time was offering better frequency characteristics for a high-end platform. The response was lukewarm, given these processors were themselves quite power-hungry, and it was clear that in order to be competitive, Intel had pushed the frequency and power so far outside the optimization point that the chips ran hot. The introduction of AVX-512 to the platform, a technology that requires lots of power, did not help the situation. Not only this, but Intel was still behind AMD’s latest generation in performance.
To round out March, AMD launched its third generation EPYC enterprise processors. Featuring the same Zen 3 cores as the Ryzen 5000 family, the main plus point here was the new core design coupled with a slight uptick in frequency at the high-end and a re-arranging of the range of products to better fit AMD’s new market delineations. AMD went after the per-core performance with the F series in a big way. When we were able to test on retail systems, we saw an additional +10% performance, showcasing optimizations within the lifespan of the product. With the third generation EPYC, purely from a CPU performance perspective, AMD has been able to dictate where its processors go, rather than fit the mold of the market leaders.
2021 Q2: April through June
After a busy Q1, we saw the market quieten down in Q2. A lot of major processors had been launched, and the grip of the semiconductor shortage was taking hold. CPUs were one of the first product categories to get back to some normality in availability, at least for end-users, but it wouldn’t be until Q3 that everything was more or less available at any time. The issue here is that users rarely buy a CPU in isolation – it comes with a graphics card, memory, storage, a motherboard, and perhaps a display. Some of these markets, particularly graphics and displays, have had bigger issues than CPUs.
Nonetheless, Intel lifted the lid on its next-generation Xeon platform in April. The 3rd Generation Xeon Scalable platform was Intel’s first high-powered CPU product built on 10nm, offering up to 40 cores. The Ice Lake architecture itself was almost two years old, having been used in laptops, but here it was in big beefy silicon with lots of memory and available in systems with up to 2 sockets. Intel’s goal here with Xeon is to offer complete solutions – CPU, storage, networking, memory. Overall, we found the hardware generationally impressive, an uplift from 2nd Gen, with an increase in performance per watt and raw performance. The positives mirrored those we saw with Ice Lake laptops in August 2019. However, the downsides were also there – the chips are power-hungry, even if performance per watt is higher. Beyond that, they also fall behind the competition in raw performance metrics – in order to push an Intel system ahead, software has to be optimized for it, or use dedicated onboard Intel accelerators. Intel piled this on, making sure lots of industry-standard software was optimized, but in Intel’s recent financial calls, the demand for the high-end 40-core parts wasn’t really there, with customers preferring to stay around the 28-core mark. There are suggestions that this might be because Intel’s yields on the 40-core are low, but it will be interesting to see how strongly Intel pitches its 2022 platform, 4th Gen Sapphire Rapids, as a direct replacement for those sticking with 1st/2nd Gen hardware.
2021 Q3: July through September
Moving into the second half of the year, movement in the CPU space was slow. What was perhaps more active were rumors of upcoming processors. Big on the horizon were Intel’s Alder Lake desktop processors, teased in January and set to arrive at the end of the year. Just before Q3, AMD had previewed the performance of its upcoming V-Cache processors, promising more gaming performance. Then there was also talk of what Apple would be doing next beyond its new M1 silicon. The only real launch in Q3 was from AMD.
Following on from previous generations of Ryzen Mobile, the launch of the desktop-focused APUs came around nine months later. The Ryzen 5000G family of desktop processors were simply the mobile parts converted into the AM4 form factor, offering up to 8 cores and Vega 8 graphics. Aside from adjustments in frequency given that desktops are better suited to 65 W chips, the hardware inside is identical. A common thought is that AMD’s margins on desktop APUs are lower than on most of its other products, and so there wasn’t a major rush to make these widely available – AMD didn’t even make the Ryzen 3 version for retail, offering it only to system integrators. It wasn’t until a few weeks after launch that we were able to find the Ryzen 5 and Ryzen 7 easily for purchase at the regular retailers, but the 5700G offered a good starting point for users building a new system without a graphics card yet, given that GPU hardware has been in severe demand for the last 18 months.
2021 Q4: October through December
At the tail end of the year, we ramped up the launches again, with the rush to get something out into the market ahead of the holiday season. Trying to get a CPU launch to synchronize with a holiday season is witchcraft, let’s be clear.
In October, Apple launched new MacBook Pro laptops with its M1 Pro and M1 Max custom silicon. Building on its first M1 chip from 2020, the M1 Pro and M1 Max extended the idea with more cores, more cache, more memory, more graphics, and higher power budgets for 14-inch and 16-inch laptops. Apple discontinued its relationship with Intel for this line of products to bring in its own silicon, giving it control of performance and acceleration features that Intel could not deliver in a timely manner. Losing the contract has been a blow to Intel, but for Apple customers that can use the features under the hood, the M1 Pro and M1 Max are blisteringly fast. In some cases, you have to bring out server hardware to beat them.
In our performance tests, the M1 Pro and M1 Max change the narrative completely regarding custom silicon and vertical integration with the OS. This hardware was built with power users in mind, and the updated MBP with lots of ports reflects that Apple understands it has a market here that it can own. For those that have the workflow, Apple’s hardware works – the biggest barrier though is (a) the price, and (b) a number of enthusiasts are reluctant to be sucked into the Apple ecosystem, especially if their workflow doesn’t benefit. Gaming is also an afterthought really, until Apple designs its own discrete GPUs.
At the end of October, we participated in a group roundtable interview of Intel CEO, Pat Gelsinger.
Back in the world of desktop silicon, Intel launched its 12th Generation Alder Lake processors in November, marking the first time in a long while that Intel has launched two generations in the same year. Alder Lake marked several significant shifts in Intel’s strategy: the use of its Intel 7 process node (renamed from 10ESF), but also a hybrid architecture combining performance cores and efficiency cores, much like smartphone chips and those from Apple in the M1 family. The idea here is to offer performance when needed, and high efficiency for background operations. This requires strong collaboration with Microsoft as well, as this hybrid design is tightly coupled to the operating system – this is a collaboration that requires long-term management and expertise.
Intel has only launched the high-performance parts so far, the i9-K, i7-K, and i5-K, with the rest to come in early 2022. Alder Lake also brings Intel to DDR5, PCIe 5.0, and the new technologies that will dominate desktop computing for the rest of the decade. The hardware pulled ahead of AMD’s Zen 3 processors on single-threaded and gaming tasks, although lagged behind in throughput. Alder Lake is also suffering from a limited DDR5 market, making it difficult for end-users to buy the latest memory, instead having to rely on the DDR4 market given the platform can support both.
Alder Lake was seen as a win for new Intel CEO Pat Gelsinger, offering key performance in more realistic scenarios than AMD, at a similar or better price point – at least for the CPU. The only thorn in Intel’s side is that the 12th Gen platform is more expensive than AMD’s, and some reviews pointed to prosumer workload power as something that Intel needs to work on. The big excitement is going to come when Intel showcases the rest of the Alder Lake family in the New Year.
2021 Q5: Next Generations
Aka 2022 Q1
As we look into the New Year, we start with the annual CES show, where AMD, Intel, NVIDIA, and Qualcomm have keynote presentations, all on January 4th. We know somewhat of what we should be expecting.
AMD: The first question on everyone’s lips is asking where the Zen 3 processors with V-Cache are. AMD already announced EPYC with V-Cache to come later in 2022, but it was always expected that the desktop processors would come first, so that’s what everyone wants to hear about. As Intel has taken the gaming crown, enthusiasts are interested to see if AMD can take it back with this new stacking technology.
Beyond V-Cache, CEO Dr. Lisa Su has already stated that we should be expecting Zen 4 in 2022, although most people are putting that towards the end of the year, coupled with the EPYC Zen 4 release which has already been announced as being in the second half. We’re not sure if AMD will address anything regarding Zen 4, but fingers crossed.
The other piece of the puzzle is what AMD is doing with Threadripper. We’re six months from when we thought a Zen 3 based Threadripper would have been launched, so we’re hoping that something is mentioned. Even if it’s to say that Threadripper has been put to pasture and Threadripper Pro is the future, then a Zen 3 based Threadripper Pro announcement would be welcome.
Intel: As previously mentioned, we should see the rest of Intel’s Alder Lake desktop platform be discussed, perhaps announced, and we get to see the cheaper offerings for the 12th Gen platform. Beyond the desktop, we’re also looking forward to the notebook processors, to date discussed in unconfirmed leaks as Alder Lake-P. Intel has already discussed the silicon that will go into the laptops from a high-level core count overview, but exact speeds, feeds, core counts, graphics, and pricing are still waiting to be filled in.
CEO Pat Gelsinger with a Ponte Vecchio HPC Accelerator
While not CPU-related, Intel has promised to bring its own GPUs to market in Q1. In a sufficient quantity, and at the right price, this could accelerate the sales of the CPUs currently in the market. There have been suggestions that Intel has lots of quantity, despite manufacturing on TSMC N6, but there is a question mark over how the drivers will be on day one, along with exactly when they will launch. Here’s hoping for more detail at CES.
Longer-term in 2022, next-gen Intel desktop CPUs have been rumored for the end of the year – Raptor Lake as a minor update to Alder Lake, before we finally move onto Intel 4 and an EUV manufacturing process in 2023. I’m not sure if Intel will mention any of these at CES though.
88 Comments
mode_13h - Thursday, December 30, 2021
Something about Amazon's Graviton 3 and its use of Neoverse V1 cores deserves a mention.
...and speaking of ARM and efficient cores, the absence of a successor to the Cortex-A35 springs to mind. In fact, given how the X-series cores seem to have demoted the A700-series to mid-tier, should the A510 really be seen as something more than the A35's de facto successor?
nandnandnand - Thursday, December 30, 2021
Pretty sure I already made the 5 tier joke in reply to you.
Yeah, who knows. I think phones could definitely add an A35 successor for standby alongside Cortex-X2, A710 and A510, but I don't remember seeing any recent SoCs that had any A35 cores anyway. Maybe they don't get reported on as much. I do see that the Snapdragon Wear 5100 uses 4x Cortex-A53 for some strange reason.
mode_13h - Friday, December 31, 2021
> Pretty sure I already made the 5 tier joke in reply to you.
Yeah, and it kinda reminds me of the way automobiles were adding ever more speeds to their transmissions, so that the engine could always operate near the optimal efficiency-point.
The most tiers of cores in a single SoC that seem justified would probably be 3. The top & mid tiers should handle latency-sensitive workloads, while the bottom tier is useful for background tasks and idle-mode operation.
Kangal - Saturday, January 1, 2022
I agree.
But the efficiency and performance of the Cortex-A53 is pretty disappointing by today's standards, with the newer A55 and the A510 being not much better. The Cortex-A35 was pretty awesome in its heyday, as it offered a new 64-bit platform with high efficiency for those low-power needs.
So I can actually see something like a miniaturised A510 on an 8nm, being a spiritual successor to the old 16nm A35 standing. What we really need is a much better "small core". Something that's OoO, small, and highly efficient. For instance, a Cortex-A73, updated for ARMv9 protocol, and miniaturised on 4nm for phones. Only then can Android devices begin competing properly against iOS devices. Since the lack of need to flip through both core types will save a lot of energy and latency times.
We also need a revolution on the large cores, since Apple's A13-P cores are still ahead of the latest X2 cores. I think we'll pass that threshold with the European X3 cores (second-gen ARMv9). I'd also add that the 4+4 design isn't that efficient, as the third and fourth large cores use up lots of power without necessarily contributing much computing. And we should skip the (1+3)+4 design too. We should just go with a 3+5 design instead, and either use three X3 cores if they're efficient enough, or use its derivative (eg Cortex-A730) cores but with more cache and thermal headroom.
mode_13h - Saturday, January 1, 2022
You start by saying you agree, but then you seem to argue for the opposite.
I think the lowest tier cores should be optimized for energy-efficiency, period. They only need to be fast enough and supplied in quantities big enough to complete most of the background processing in a reasonable amount of time. This is purely about battery life.
I'd speculate that Apple is only using 2-tiers because they lack bandwidth in their design team to do a 3rd tier.
Kangal - Sunday, January 2, 2022
I agree that the A510 is the (spiritual) successor to the Cortex-A35, which was the successor to the Cortex-A7.
The original Cortex-A53 was great for its time in 2014, but was quite obsolete and lacklustre in 2017 or beyond. For all practical purposes, we didn't actually get a true successor. The A55 is a joke and so is the A510, if we are talking about using them as "small cores" in phones (excluding cheap budget/entry level).
We don't need more "speed tiers". Our software isn't advanced enough, and our use cases don't warrant it. It is far more ideal to reduce latency, and having a more simplified system. That's the route Apple takes, and they've been leading the market since the iPhone 5S in 2013. The Hypothetical SoC that I listed above is far better than your Hypothetical SoC with three or four different cpu types. For instance:
(3+5) X1-A73.... versus.... (1+3+3+1) X1-A78-A55-A35
(3+5) A78-A73.... versus... (4+2+2) A78-A55-A35
mode_13h - Sunday, January 2, 2022
> We don't need more "speed tiers". Our software isn't advanced enough,
> and our use cases don't warrant it.
That's nonsense. The OS knows which app(s) is/are running in the foreground. That's enough to prioritize their threads (at least, when the screen is unlocked) to run on the faster cores.
> That's the route Apple takes
Just because apple went with 2 tiers doesn't make it the right thing to do, you know? They could have other reasons than what's technically best.
I know you wish ARM would just design Apple-grade cores and you think that would be good enough. I'm trying to think about the problem in the abstract, and beyond simply wishing that ARM was more like Apple.
> your Hypothetical SoC with three or four different cpu types.
WTF? I clearly said, in 2 different posts, that I thought 3 tier was ideal.
Kangal - Monday, January 3, 2022
Well, it's not nonsense and you're wrong. There is latency when shuffling processes from one core or complex to another. Some Apps don't require much performance, but then you run it, and later they require it. Running Apps always on the fast cores is not the solution, and nor is the opposite. So what we have right now, is the scheduler usually goes into performance mode when you interact with the screen or start up an App. Then it understands which category it needs to go into. Then it moves the process into that complex. And after a minute, it might migrate that thread back into the large cores. This behaviour is very common in something like your Web Browser.
Why do you think the Helio X30 was such a flop?
Apple's philosophy is totally different. Remember these guys invented/popularised the Personal Computer. They have their own language, their own software, their own user interface, and their own hardware. And decades of experience there, but without the baggage of legacy code in iOS. So there's a big difference compared to Google.
I didn't state that ARM should just get Apple's cores. When you think about it, there's no magic in their cores. Just take the 16nm Cortex-A73, then miniaturise it in a 4nm node, modernise its feature-set without making it more complex, make it run from low-to-mid voltage, voila. Sure, it might use more power than Apple's E-cores AND it might run slower too. But it will be more inline with the current software requirements and performance demands of the main/small cores on a phone. And that translates to both a better experience than the current A53 cores, and overall better efficiency. And for the large cores, grab the Cortex-A78, add more cache and extra branch predictions, there you have an X1 core. Not as good as Apple's P-cores, but it is acceptable for its time.
Large X1 - Mini 73 (3+5) design, will run circles around the QSD 888's design of X1, A78, A55 (1+3+4) processor. It might not show it on the synthetic benchmarks, but in real-world use, it will be faster AND use less energy. And the idea of adding more "speed tiers" is just going to exacerbate that.
" WTF? I clearly said, in 2 different posts, that I thought 3 tier was ideal. "
sigh. Now I know you're discussing this in bad-faith. Since you can clearly read that I said THREE or four. I did clearly read your comments, and others. Case in point, you have not commented on the comparison I made:
(3+5) A78-A73.... versus... (4+2+2) A78-A55-A35
mode_13h - Tuesday, January 4, 2022
> There is latency when shuffling processes from one core or complex to another.
Maybe a microsecond or so. As long as it's fairly infrequent, it's negligible.
What's costly, in terms of power, is to take a core out of sleep. And running background jobs means repeatedly doing exactly that. So, it's not only the energy used during execution that you need to think about. Therefore, background and I/O-limited threads should run on the smallest and most-efficient core possible.
> Some Apps don't require much performance, but then you run it,
> and later they require it. Running Apps always on the fast cores is not the solution,
> and nor is the opposite.
That's too simplistic, of course. It's more like this: when a thread is using its entire timeslice and it's part of a foreground app, then boost the clock speed of the core or move the thread to a more powerful core. If the thread is part of a backgrounded app or service, demote its priority, so that it'll tend to stay on the most efficient cores, even if it's compute-bound. Then, if it holds a resource blocked on by a higher-priority thread, give it the priority of the blocked thread.
Even that is a bit over-simplified, but the point is that this isn't an intractable or even a new problem. Boosting the priority of threads belonging to foreground apps is a technique that's been used for decades, on various operating systems.
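As an illustration only, the heuristic described above can be sketched as a toy placement function in Python. This is not how any real OS scheduler is implemented. The three tier names, the `Thread` fields, the numeric priorities, and the `pick_tier` helper are all hypothetical, and the sketch assumes there are no circular waits between threads.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Thread:
    name: str
    foreground: bool            # belongs to the app the user is interacting with
    used_full_timeslice: bool   # was compute-bound during its last slice
    blocks: Optional["Thread"] = None  # thread waiting on a resource this one holds


def base_priority(t: Thread) -> int:
    """Hypothetical scale: foreground threads get 2, background threads get 0."""
    return 2 if t.foreground else 0


def effective_priority(t: Thread) -> int:
    """Priority inheritance: a thread holding a resource that a
    higher-priority thread is blocked on borrows that thread's priority."""
    prio = base_priority(t)
    if t.blocks is not None:
        prio = max(prio, effective_priority(t.blocks))
    return prio


def pick_tier(t: Thread) -> str:
    """Map a thread to one of three assumed core tiers."""
    if effective_priority(t) == 0:
        # Background work stays on the efficient cores, even if compute-bound.
        return "little"
    if t.used_full_timeslice:
        # Foreground (or boosted) thread that filled its timeslice: promote it.
        return "big"
    return "mid"


if __name__ == "__main__":
    ui = Thread("ui", foreground=True, used_full_timeslice=True)
    sync = Thread("sync", foreground=False, used_full_timeslice=True)
    indexer = Thread("indexer", foreground=False, used_full_timeslice=False, blocks=ui)
    for t in (ui, sync, indexer):
        print(t.name, "->", pick_tier(t))
```

A real scheduler would also weigh clock-speed boosts against migration, core wake-up cost, and cache state, none of which this sketch models.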
> Why do you think the Helio X30 was such a flop?
I don't know anything about it, but a product can have a whole range of different issues that undermine it, such as thermal issues, inappropriate cache configuration, or even memory controller bugs that hamper performance.
> Apple's philosophy is totally different.
You don't actually know why they do what they do. To interpret their decisions without that knowledge is a bit like looking at chicken entrails. Nobody knows exactly what they mean, so they're completely open to interpretation.
> Remember these guys invented/popularised the Personal Computer.
No, they aren't. These are different people working for a company which has undergone 4+ decades of evolution. Different people; different company (effectively); very different product.
> They have their own language, their own software, their own user interface,
> and their own hardware.
So does Microsoft. And Google, even.
> but without the baggage of legacy code in iOS.
Oh, I bet there is!
> I didn't state that ARM should just get Apple's cores.
But you want to blindly follow Apple's strategy. I'm arguing that 3-tier is better, and you haven't given a good argument why it's not (except that Apple doesn't do it).
> Now I know you're discussing this in bad-faith.
> Since you can clearly read that I said THREE or four.
I never said four. That's not bad faith on *my* part. Don't make it sound like I'm saying something I'm not, and then we won't have a problem.
> Case in point, you have not commented on the comparison I made:
What do you want me to say? I'm not obligated to weigh in on a point I don't have data or the confidence to take a position on. If you want to talk about *specific* core configurations, that is a very data-driven argument, and also depends on cache + DRAM configuration.
All I'm saying is that I'd have probably 2 cores that are strictly optimized for energy-efficiency and nothing else.
movax2 - Thursday, December 30, 2021
As nandnandnand says, there aren't many SoCs based on the A35 (and I did laugh out loud at his 5 tier joke x). Who knows, maybe we will see 4 or 5 tiers.
Mediatek has tried to push the A35, but SoCs based on it weren't popular.
It seems like the A35 made sense at the 28nm node, where the A53 wasn't power efficient enough.
On 10 nm and newer, battery impact on A35 and A53 should be very similar.