AMD Kaveri Docs Reference Quad-Channel Memory Interface, GDDR5 Option
by Anand Lal Shimpi on January 16, 2014 10:51 PM EST
Our own Ryan Smith pointed me at an excellent thread on Beyond3D where forum member yuri ran across a reference to additional memory controllers in AMD's recently released Kaveri APU. AMD's latest BIOS and Kernel Developer's Guide (BKDG 3.00) for Kaveri includes a reference to four DRAM controllers (DCT0 - 3) with only two in use (DCT0 and DCT3). The same guide also references a Gddr5Mode option for each DRAM controller.
Let me be very clear here: there's no chance that the recently launched Kaveri will be capable of GDDR5 or 4 x 64-bit memory operation (Socket-FM2+ pin-out alone would be an obvious limitation), but it's very possible that there were plans for one (or both) of those things in an AMD APU. Memory bandwidth can be a huge limit to scaling processor graphics performance, especially since the GPU has to share its limited bandwidth to main memory with a handful of CPU cores. Intel's workaround with Haswell was to pair it with 128MB of on-package eDRAM. AMD has typically shied away from more exotic solutions, leaving the launched Kaveri looking pretty normal on the memory bandwidth front.
In our Kaveri review, we asked the question whether or not any of you would be interested in a big Kaveri option with 12 - 20 CUs (768 - 1280 SPs) enabled, basically a high-end variant of the Xbox One or PS4 SoC. AMD would need a substantial increase in memory bandwidth to make such a thing feasible, but based on AMD's own docs it looks like that may not be too difficult to get.
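To put rough numbers on that, here is a back-of-the-envelope sketch. The CU-to-SP conversion follows the figures above (64 SPs per GCN CU). The baseline is an assumption that is not from the BKDG: the launched 8-CU Kaveri on dual-channel DDR3-2133. The function names are purely illustrative.

```python
# Rough sizing for a hypothetical "big Kaveri".
# Assumptions (labeled, not from AMD's docs):
#   - 64 stream processors per GCN compute unit
#   - baseline = launched Kaveri with 8 CUs on dual-channel (2 x 64-bit) DDR3-2133

SPS_PER_CU = 64
BASELINE_CUS = 8
# 2 channels x 8 bytes per transfer x 2133 MT/s, converted from MB/s to GB/s
BASELINE_BANDWIDTH_GBS = 2 * 8 * 2133 / 1000


def stream_processors(cus: int) -> int:
    """Number of stream processors for a given CU count."""
    return cus * SPS_PER_CU


def bandwidth_to_match_baseline(cus: int) -> float:
    """Peak bandwidth (GB/s) needed to keep per-CU bandwidth equal to the baseline."""
    per_cu = BASELINE_BANDWIDTH_GBS / BASELINE_CUS
    return per_cu * cus


if __name__ == "__main__":
    for cus in (12, 20):
        print(
            f"{cus} CUs = {stream_processors(cus)} SPs, "
            f"needs ~{bandwidth_to_match_baseline(cus):.1f} GB/s "
            f"to match baseline bandwidth per CU"
        )
```

This only holds bandwidth per CU constant; it says nothing about how much of that bandwidth the CPU cores would claim, so treat it as a floor rather than a target.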
There were rumors a while back of Kaveri using GDDR5 on a stick but it looks like nothing ever came of that. The options for a higher end Kaveri APU would have to be:
1) 256-bit wide DDR3 interface with standard DIMM slots, or
2) 256-bit wide GDDR5 interface with memory soldered down on the motherboard
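For a sense of scale, the two options can be compared on peak theoretical bandwidth, which is simply bus width in bytes times transfer rate. The sketch below is illustrative only: the data rates (DDR3-2133 and 5.5 Gbps GDDR5, the rate used by the PS4) are assumptions chosen for comparison and do not come from AMD's documentation.

```python
# Peak theoretical memory bandwidth = (bus width in bytes) x (transfers per second).
# The data rates below are illustrative assumptions, not figures from the BKDG.

def peak_bandwidth_gbs(bus_width_bits: int, transfers_mt_s: float) -> float:
    """Return peak bandwidth in GB/s for a bus width (bits) and data rate (MT/s)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfers_mt_s / 1000  # MB/s -> GB/s


CONFIGS = {
    "128-bit DDR3-2133 (dual channel, as launched)": (128, 2133),
    "256-bit DDR3-2133 (option 1)": (256, 2133),
    "256-bit GDDR5 at 5.5 Gbps (option 2)": (256, 5500),
}

if __name__ == "__main__":
    for name, (width, rate) in CONFIGS.items():
        print(f"{name}: {peak_bandwidth_gbs(width, rate):.1f} GB/s")
```

Doubling the DDR3 interface doubles peak bandwidth, while GDDR5 at these assumed rates lands at several times the dual-channel figure, which is why the soldered-down option keeps coming up.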
I do wonder if AMD would consider the first option combined with some high-speed memory on-die (similar to the Xbox One SoC).
All of this is an interesting academic exercise though, which brings me back to our original question from the Kaveri review. If you had the building blocks AMD has (Steamroller cores and GCN CUs) and the potential for a wider memory interface, would you try building a high-end APU for the desktop? If so, what would you build and why?
I know I'd be interested in a 2-module Steamroller core + 20 CUs with a 256-bit wide DDR3 interface, assuming AMD could stick some high-bandwidth memory on-die as well. More or less a high-end version of the Xbox One SoC. Such a thing would interest me but I'm not sure if anyone would buy it. Leave your thoughts in the comments below, I'm sure some important folks will get to read them :)
127 Comments
Madpacket - Thursday, January 16, 2014 - link
Yes, being able to use 4 DIMMs to drive a 256-bit wide interface would be attractive for HTPC/Steam boxes. I think it could even help the current 12 CU model quite a bit as it appears severely bandwidth constrained. Another thing to account for would be that you wouldn't have to spring for super fast 2100+ memory, you could likely get away with four DDR3 1600 or 1866 DIMMs and still see substantial gains. Of course with this configuration you would have to sacrifice ITX form factor motherboards and go with M-ATX and I'm sure the power consumption would hurt a little.
Computer Bottleneck - Friday, January 17, 2014 - link
On Mini-ITX four SO-DIMMs should fit.
P.S. Going by current prices on Newegg there is only a $14 price difference if a person buys 4 x 2GB DDR3-1600 SO-DIMMs ($84) compared to 2 x 4GB DDR3-1600 SO-DIMMs ($70). I'd say that is worth it for those of us who want 8GB RAM.
Aisalem - Friday, January 17, 2014 - link
I guess the best option will be a Mini-ITX board with soldered 8GB GDDR5 on it (let's say opposite the CPU socket, covered with a backplate). That should be enough for an HTPC for the next few years.
just4U - Friday, January 17, 2014 - link
I am guessing something like that would add 80-120 onto the price tag but hell.. I'd pull the trigger and buy it. "take note" Asus/Gigabyte/Msi ..
ImSpartacus - Saturday, January 18, 2014 - link
I'd rather see custom solutions.
An APU already has a measure of integration (and therefore less upgradeability) with the combined CPU & GPU, so I think AMD should just keep going with equally un-upgradeable GDDR5.
Then they need to exploit that integration potential! No one whines that the new Mac Pro should have upgradeable GPUs because it's obvious that the Mac Pro form factor would then be impossible.
AMD needs to put a ~16+ CU, GDDR5-packing Kaveri in a form factor so awesome that NO ONE would want to upgrade the RAM, CPU or GPU.
Make my next gaming machine, AMD.
lmcd - Saturday, January 18, 2014 - link
Or they could make a BRIX out of a 4-channel Kaveri with RAM soldered on.
just4U - Sunday, January 19, 2014 - link
not sold on Brix due to heat issues.. so I'd personally prefer solutions that the end-user decides on case/cooling options.
Computer Bottleneck - Sunday, January 19, 2014 - link
Agreed.
Brix size (even if it is the Pro Model) is just too small.
For DIY: Mini-ITX or Micro ATX.
For prebuilt: Something the size of a Gateway SX, Lenovo H530s, Dell Inspiron 660s SFF desktop would be great. (Make it higher performance and a good value for the money)
Computer Bottleneck - Sunday, January 19, 2014 - link
Added Note: The current Lenovo H530s SFF desktop supports up to an 84 watt Core i7-4770. The old H520s (same chassis) supported up to a 95 watt Core i7 Sandy Bridge according to documentation.
So if AMD is planning on larger desktop APUs I would like to see 95 watts for the top OEM APU (up from the current level of 65 watts) and a TDP higher than that (maybe 125 watts?) for the top enthusiast level K model APU.
This, of course, provided AMD can increase memory bandwidth sufficiently.
StevoLincolnite - Saturday, January 25, 2014 - link
I think a better option would be to stick with the same dual-channel DDR3 set-up from a price perspective, but then re-introduce "Display Port" where board manufacturers can then bundle a chunk of faster memory on the motherboard. (Like 512Mb-1024Mb GDDR5.)
Once the GDDR5 is filled it falls back to DDR3.
Should help with more size-constrained scenarios too as a couple of memory chips should take up far less board space than 4x DDR memory slots especially in ITX...