Arm Announces The Mali-G78 GPU: Evolution to 24 Cores
by Andrei Frumusanu on May 26, 2020 9:00 AM EST
Today as part of Arm’s 2020 TechDay announcements, alongside the release of the brand-new Cortex-A78 and Cortex-X1 CPUs, Arm is also revealing its brand-new Mali-G78 and Mali-G68 GPU IPs.
Last year, Arm unveiled the Mali-G77, the company’s newest GPU design at the time, based on a brand-new compute architecture called Valhall. The design promised major improvements for the company’s GPU IP, shedding some of the disadvantages of past iterations and adapting the architecture to more modern workloads. It was a big change in the design, with implementations seen in chips such as the Samsung Exynos 990 or the MediaTek Dimensity 1000.
The new Mali-G78, in comparison, is more of an iterative update to the microarchitecture. It makes some key improvements to the scalability of the configuration and to the balance of the design for workloads, along with some more radical changes such as a complete redesign of its FMA units.
On the scalability side, the new Mali-G78 now goes up to 24 cores in an implementation, which is a 50% increase in core count compared to the maximum MP16 configuration of the Mali-G77. To date, the biggest configuration of the G77 we’ve seen in the wild was the MP11 setup of the Exynos 990, with MediaTek employing an MP9 setup.
In a projected end-device solution comparison between 2020 and 2021 devices, Arm is projecting the new Mali-G78 to achieve 25% better performance, which includes both microarchitectural as well as process node improvements. That’s generally a reasonable target that vendors are able to achieve on newer generation IPs, but it’s also going to depend strongly on the exact process node improvements that are projected here – as GPUs generally scale better with improved process density rather than just frequency and power improvements of the silicon.
At an ISO-process node under similar implementation area conditions, the Mali-G78 is claimed to improve performance density by 15%. This refers to either performing 15% better at the same area, or shaving off 15% area for the same performance, given that this can be done linearly by just adjusting the number of GPU cores implemented.
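As a rough illustration of the performance-density arithmetic, here is a minimal Python sketch. It assumes that "performance density" means performance per unit of area and that both scale linearly with core count; the function names are hypothetical and not anything Arm publishes. Read strictly under those assumptions, a 15% density gain works out to roughly 13% less area for the same performance, so the 15% area figure is best treated as a rounded one.

```python
def performance_for_same_area(density_gain: float) -> float:
    """Relative performance at unchanged area, given a fractional perf/area gain."""
    return 1.0 + density_gain


def area_for_same_performance(density_gain: float) -> float:
    """Relative area needed to match the old performance, given the same gain."""
    return 1.0 / (1.0 + density_gain)


if __name__ == "__main__":
    gain = 0.15  # the claimed 15% performance-density improvement
    perf = performance_for_same_area(gain)
    area = area_for_same_performance(gain)
    print(f"Same area: {perf:.2f}x performance")
    print(f"Same performance: {area:.3f}x area ({(1 - area) * 100:.1f}% smaller)")
```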
Power efficiency sees a more meagre 10% improvement, which honestly isn’t too fantastic and not that big of a leap over the Mali-G77. ML performance is also said to be improved by 15% thanks to some new microarchitectural tweaks.
At first glance, the Mali-G78 doesn’t look like too much of an upgrade compared to the vast redesign we saw last year with the G77 – and in a sense, that does seem somewhat reasonable. Still, the G78 makes some interesting changes to its microarchitecture, so let’s delve a bit deeper into what’s changed…
36 Comments
Iketh - Tuesday, May 26, 2020 - link
reading that was pulling teeth
here's your paragraph:
"The async feature from an energy efficiency perspective is proclaimed to be around 6-13% depending on the workload. This is actually a bit of a more complex figure in my view. The main problem in my view is that to achieve this, the SoC vendor needs to actually go ahead and employ a second voltage rail for the GPU to gain the most benefit of the asynchronous frequencies. The efficiency benefit here is small enough, that it begs the question if it’s not just cheaper to add in a few more extra cores and lock them lower, rather than incurring the cost of the extra PMIC rail, inductors and capacitors. It’s an easy efficiency gain for flagship SoCs, but I’m really wondering what vendors will be deploying in the mid-range and lower."
here's the same paragraph cleaned up a bit:
For energy efficiency, the async feature claims to improve 6-13% depending on the workload. This seems difficult to implement in my opinion. The main problem is the SoC vendor needs to employ a second voltage rail for the GPU to see the biggest benefit of asynchronous frequencies. The efficiency benefit is small enough that it begs the question if it’s cheaper to simply add more cores and clock them lower rather than incurring the cost of the extra PMIC rail, inductors, and capacitors. It’s an easy efficiency gain for flagship SoCs, but I wonder what vendors will deploy in the mid and low range.
psyclist80 - Tuesday, May 26, 2020 - link
Thanks Teach, here's your apple! now 'eff off...find someone else to pick on to satisfy your ego
Cellar Door - Tuesday, May 26, 2020 - link
Actually - it is your reading comprehension, that is the issue here.
Please, refrain from blaming others for it.
jjpaq - Tuesday, May 26, 2020 - link
Are you arguing that it's pointless to ever edit text for concision and interest?
The extra commas sprinkled throughout your reply seem to make his point perfectly.
Spunjji - Thursday, May 28, 2020 - link
Are you familiar with the concept of a straw man? 🤦♂️
dotjaz - Tuesday, May 26, 2020 - link
Nope, I can understand the original article effortlessly but it doesn't mean it's pleasant to read.
mkozakewich - Wednesday, May 27, 2020 - link
No, he seemed noticeably more flustered. Must have been a deadline.
Alistair - Tuesday, May 26, 2020 - link
Nice. It is nice to see simple and direct language.
Spunjji - Thursday, May 28, 2020 - link
I agree that the original paragraph could have been cleaned up, but I actually found yours not a lot better.