Depression Sets in but the Team Goes On

The entire RV770 design took around three years, which means that while we were beating ATI up over the failure that was R600, those very engineers had to go into work and be positive about RV770. And it was tough; after all, ATI had just completely lost the crown with R600, and now Carrell, Rick Bergman and others were asking the team to ignore what happened with R600, ignore the fact that they had lost the halo, and try to build a GPU aimed at a lower market segment.

Through all of my interviews, the one thing that kept coming up was how impressed ATI was with the 770 team - never once did the team fall apart. Despite disagreements, despite a shaky direction, the team powered through.

The decision not to go for the king-of-the-hill part made a lot of sense within ATI, but there was so much history about what would happen if you didn’t get the halo part; it took very strong discipline to cast that history aside and do what the leads felt was right, but the team did it without question.

The discipline required wasn’t just to ignore history, but also to fight the natural tendency for chips to grow without limits during their design phase. What ATI achieved with RV770 reminded me a lot of Intel’s Atom design team: each member of that team had strict limits on how big their blocks could be, and those limits didn’t waver.

Adversity tends to bring the best out of people. The best stories I’ve been told in this industry are about exactly that: the Intel folks who made Banias and the ATIers responsible for RV770 put their hearts and souls into their work despite being beaten down. Passion has a funny way of being a person’s strongest ally.

The Power Paradigm

We were all guilty of partaking in the free lunch. Intel designed nearly five years of processors without any concern for power consumption, and the GPU guys were no different.

In the R300 and R420 days ATI almost entirely ignored power; estimates of how much power the parts would use were so far off from the final product that the team just didn’t care. Power was such a non-issue back then that ATI didn’t even have a good way to estimate it if it wanted to, and designing for a specific TDP was impossible. Today ATI’s tools are a lot better; targeting a specific TDP is no different from aiming for a specific clock speed or die size, it’s just another variable that can now be controlled.

These days power doesn’t change much; the thermal envelopes that were carved out over the past couple of years are pretty much stationary (ever wonder why high end CPUs always fall around 130W?). Everyone designs up to their power envelope and stays there. What matters now is increasing performance every year or two while staying within the same power budget. Our processors, both CPUs and GPUs, are getting more athletic rather than just putting on pounds to be able to lift more weight.

One of the more interesting things about architecting for power is that simply moving data around these ~1 billion transistor chips takes up a lot of power. Carrell told me that by the time ATI is at 45nm and 32nm, it will take as much power to move the data to the FPU as it does to do the multiply.
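To make that concrete, here’s a back-of-the-envelope sketch. The energy figures below are invented placeholders, not anything ATI disclosed; the point is simply that once fetching an operand costs as much as the multiply itself, most of the energy of a “math” operation is actually spent moving bits around the chip.

```python
# Hypothetical energy-per-operation model (illustrative numbers only).
# The claim in the text: at 45nm/32nm, moving data to the FPU costs
# roughly as much energy as the multiply itself.

PJ_PER_MULTIPLY = 10.0      # assumed energy of one FP multiply, in picojoules
PJ_PER_OPERAND_MOVE = 10.0  # assumed energy to bring one operand to the FPU

def energy_per_mul(operands_moved: int) -> float:
    """Total energy (pJ) for one multiply, counting operand movement."""
    return PJ_PER_MULTIPLY + operands_moved * PJ_PER_OPERAND_MOVE

# If both inputs have to travel to the FPU, roughly two thirds of the
# energy goes to moving bits rather than doing math:
total = energy_per_mul(operands_moved=2)
moved = total - PJ_PER_MULTIPLY
print(f"compute: {PJ_PER_MULTIPLY:.0f} pJ, movement: {moved:.0f} pJ "
      f"({moved / total:.0%} of the total)")
```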

Given that data movement is an increasingly power hungry task, a big focus going forward is going to be keeping data local whenever possible and minimizing how far it has to travel, even to registers and on-chip caches. We may see more local register files and more multi-tiered memory hierarchies. As chips get more complex, keeping the register file in one central location becomes a problem.
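As a rough illustration of why locality helps, here’s a toy two-level cost model; the per-access energies are assumptions made up for the example, not ATI’s numbers. It compares a naive multiply-accumulate sweep that pulls every operand from a large, distant structure against a blocked version that stages tiles in a small local store and reuses them:

```python
# Toy model: count accesses to a small local store ("near") versus a large,
# distant shared structure ("far") for an N x N multiply-accumulate sweep.
# The per-access energies are made-up placeholders, purely for illustration.

E_NEAR = 1.0   # assumed pJ per access to a small local register file
E_FAR = 10.0   # assumed pJ per access to a big, centralized structure

def naive_energy(n: int) -> float:
    # Every operand of every multiply-accumulate is fetched from far away:
    # n^3 MACs, two operand fetches each.
    return 2 * n**3 * E_FAR

def blocked_energy(n: int, tile: int) -> float:
    # Tiles of the two inputs are pulled in from far storage once per tile
    # pair, then every MAC reads its operands from the local store.
    tiles = n // tile
    far_accesses = 2 * tiles**3 * tile * tile   # staging the tiles
    near_accesses = 2 * n**3                    # operand reads during the MACs
    return far_accesses * E_FAR + near_accesses * E_NEAR

n, tile = 512, 16
print(f"naive:   {naive_energy(n) / 1e6:.0f} uJ")
print(f"blocked: {blocked_energy(n, tile) / 1e6:.0f} uJ")
```

The blocked version does the same math but trades most of the expensive far accesses for cheap local ones, which is exactly the kind of win a more tiered register file and cache hierarchy is after.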

ATI admitted to making a key manufacturing mistake with R600. The transistor technology selected for R600 was performance focused, designed to reach high clock speeds, and it yielded a part that didn’t have good performance per watt - something we noticed in our review. ATI has since refocused, moving somewhat away from the bleeding edge and opting for more power efficiency within a given transistor node. With leakage a growing problem as you move to smaller transistors, it’s not worth being super leaky just to gain a few picoseconds. If you’ve got a 100W GPU, do you want to waste 40W of that budget on leakage? Or would you rather do 80W of real work and only waste 20W? It’s the same realization Intel came to during the Pentium 4’s reign, and it’s the mentality that gave us the Core microarchitecture. It’s an approach that just makes sense.
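Here’s a small sketch of that math with assumed numbers; the 40W/20W leakage split comes from the paragraph above, while the dynamic-power-per-GHz figure is a made-up placeholder. Within a fixed 100W budget, the watts lost to leakage can’t drive switching, so the leaky-but-fast transistor option never actually gets to use its extra speed:

```python
# Illustrative only (assumed numbers, not ATI's data): within a fixed 100 W
# budget, leakage is burned no matter what, and only the remaining watts can
# drive switching. Dynamic power scales roughly linearly with clock speed at a
# fixed voltage, so the sustainable clock is proportional to the dynamic budget.

BUDGET_W = 100.0
DYNAMIC_W_PER_GHZ = 80.0   # assumed dynamic power cost per GHz for this chip

def sustainable_clock_ghz(leakage_w: float, transistor_limit_ghz: float) -> float:
    dynamic_budget = BUDGET_W - leakage_w
    power_limited = dynamic_budget / DYNAMIC_W_PER_GHZ
    return min(power_limited, transistor_limit_ghz)

fast_leaky = sustainable_clock_ghz(leakage_w=40.0, transistor_limit_ghz=1.10)
efficient = sustainable_clock_ghz(leakage_w=20.0, transistor_limit_ghz=1.00)

print(f"fast but leaky process option: {fast_leaky:.2f} GHz sustained")
print(f"power-efficient option:        {efficient:.2f} GHz sustained")
# -> 0.75 GHz vs 1.00 GHz: the leaky transistors' extra speed never gets used,
#    because the watts they leak can't be spent on switching.
```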

Comments

  • Chainlink - Saturday, December 6, 2008 - link

    I've followed Anandtech for many years but never felt the need to respond to posts or reviews. I've always used anandtech as THE source of information for tech reviews and I just wanted to show my appreciation for this article.

    Following the graphics industry is certainly a challenge, I think I've owned most of the major cards mentioned in this insightful article. But to learn some of the background of why AMD/ATI made some of the decisions they did is just AWESOME.

    I've always been AMD for CPU (won a XP1800+ at the Philly zoo!!!) and a mix of the red and green for GPUs. But I'm glad to see AMD back on track in both CPU and GPU especially (I actually have stock in them :/).

    Thanks Anand for the best article I've read anywhere, it actually made me sign up to post this!
  • pyrosity - Saturday, December 6, 2008 - link

    Anand & Co., AMD & Co.,

    Thank you. I'm not too much into following hardware these days but this article was interesting, informative, and insightful. You all have my appreciation for what amounts to a unique, humanizing story that feels like a diamond in the rough (not to say AT is "the rough," but perhaps the sea of reviews, charts, benchmarking--things that are so temporal).
  • Flyboy27 - Friday, December 5, 2008 - link

    Amazing that you got to sit down with these folks. Great article. This is why I visit anandtech.com!
  • BenSkywalker - Friday, December 5, 2008 - link

    Is the ~$550 price point seen on ATi's current high end part evidence of them making their GPUs for the masses? If this entire strategy is as exceptional as this article makes it out to be, and this was an effort to honestly give high end performance to the masses then why no lengthy conversation of how ATi currently offers, by a hefty margin, the most expensive graphics cards on the market? You even present the slide that demonstrates the key to obtaining the high end was scalability, yet you fail to discuss how their pricing structure is the same one nVidia was using, they simply chose to use two smaller GPUs in the place of one monolithic part. Not saying there is anything wrong with their approach at all- but your implication that it was a choice made around a populist mindset is quite out of place, and by a wide margin. They have the fastest part out, and they are charging a hefty premium for it. Wrong in any way? Absolutely not. An overall approach that has the same impact that nV or 3dfx before them had on consumers? Absolutely. Nothing remotely populist about it.

    From an engineering angle, it is very interesting how you gloss over the impact that 55nm had for ATi versus nVidia and in turn how this current direction will hold up when they are not dealing with a build process advantage. It also was interesting that quite a bit of time was given to the advantages that ATi's approach had over nV's in terms of costs, yet ATi's margins remain well behind that of nVidia's (not included in the article). All of these factors could have easily been left out of the article altogether and you could have left it as an article about the development of the RV770 from a human interest perspective.

    This article could have been a lot better as a straight human interest fluff piece, by half bringing in some elements that are favorable to the direction of the article while leaving out any analysis from an engineering or business perspective from an objective standpoint this reads a lot more like a press release than journalism.
  • Garson007 - Friday, December 5, 2008 - link

    Never in the article did it say anything about ATI turning socialistic. All it did mention was that they designed a performance card instead of an enthusiast one. How they approach to finally get to the enthusiast block, and how much it is priced, is completely irrelevant to the fact that they designed a performance card. This also allowed ATI to bring better graphics to lower priced segments because the relative scaling was much less than nVidia -still- has to undertake.

    The build process was mentioned. It is completely nVidia's prerogative to ignore a certain process until they create the architecture that works on one they already know; you are bringing up a coulda/woulda/shoulda situation around nVidia's strategy - when it means nothing to the current end-user. The future after all, is the future.

    I'd respectfully disagree about the journalism statement, as I believe this to be a much higher form of journalism than a lot of what happens on the internet these days.

    I'd also disagree with the people who say that AMD is any less secretive or anything. Looking in the article there is no real information in it which could disadvantage them in any way; all this article revealed about AMD is a more human side to the inner workings.

    Thank you AMD for making this article possible, hopefully others will follow suit.
  • travbrad - Friday, December 5, 2008 - link

    This was a really cool and interesting article, thanks for writing it. :)

    However there was one glaring flaw I noticed: "The Radeon 8500 wasn’t good at all; there was just no beating NVIDIA’s GeForce4, the Ti 4200 did well in the mainstream market and the Ti 4600 was king of the high end. "

    That is a very misleading and flat-out false statement. The Radeon 8500 was launched in October 2001, and the Geforce 4 was launched in April 2002 (that's a 7 month difference). I would certainly hope a card launched more than half a year later was faster.

    The Radeon 8500 was up against the Geforce3 when it was launched. It was generally as fast/faster than the similarly priced Ti200, and only a bit slower than the more expensive Ti500. Hardly what I would call "not good at all". Admittedly it wasn't nearly as popular as the Geforce3, but popularity != performance.
  • 7Enigma - Friday, December 5, 2008 - link

    That's all I have to say. As near to perfection as you can get in an article.
  • hanstollo - Friday, December 5, 2008 - link

    Hello, I've been visiting your site for about a year now and just wanted to let you know I'm really impressed with all of the work you guys do. Thank you so much for this article as i feel i really learned a whole lot from it. It was well written and kept me engaged. I had never heard of concepts like harvesting and repairability. I had no idea that three years went into designing this GPU. I love keeping up with hardware and really trust and admire your site. Thank you for taking the time to write this article.
  • dvinnen - Friday, December 5, 2008 - link

    Been reading this site for going on 8 years now and this article ranks up there with your best ever. As I've grown older and games have taken a back seat I find articles like this much more interesting. When a new product comes out I find myself reading the forewords and architectural bits of the articles and skipping over all the graphs to the conclusions.

    Anyways, just wish I was one of those brilliant programmers who was skilled enough to do massively parallelized programming.
  • quanta - Friday, December 5, 2008 - link

    While the RV770 engineers may not have GDDR5 SDRAM to play with during its development, ATI can already use the GDDR4 SDRAM, which already has the memory bandwidth doubling that of GDDR5 SDRAM, AND it was already used in Radeon X1900 (R580+) cores. If there was any bandwidth superiority over NVIDIA, it was because of NVIDIA's refusal to switch to GDDR4, not lack of technology.
