Xilinx Announces Project Everest: The 7nm FPGA SoC Hybrid
by Ian Cutress on March 19, 2018 7:40 AM ESTInterview with Victor Peng, CEO of Xilinx
Despite only being the job for a little over a month, Mr. Peng is well positioned at the helm of Xilinx’s future. He has been at the company for over ten years, having previously been COO (managing Global Sales, Product and Vertical Marketing, Product Development and Global Operations) and EVP/GM of Products. His background is in engineering, holding four US patents and a Masters’ from Cornell, completing the trifecta of research, sales, and marketing that typically a CEO requires in their toolset. Prior to Xilinx, Mr. Peng was Corporate VP of the Graphics Products Group for silicon engineering at AMD, as well as having spent time at MIPS and SGI.
Ian Cutress: Congratulations on the CEO position! What has it been like for the first month? Have you had to hit the ground running?
Victor Peng: Oh thank you, I appreciate it! It has been pretty busy, but it is as expected I suppose. It is very exciting! It is busy for a few reasons, as this is our 4th quarter, so we are not only getting ready to finish the fiscal year, but also we are planning all the budgeting (laughs). It is just a busy quarter in general. We have a new Sales VP who just started yesterday, so I've actually been running sales and I've been wearing two hats so it's been pretty busy. Starting to talk about the new vision and strategy is also part of the journey, but it is all good.
IC: Is the new strategy an opportunity to put your stamp on the role?
VP: It is, but I don't think of it that way. I have this steward model of the leadership – Moshi [Gavrielov, former CEO] took the company to a great place and we have this great foundation that we've built up and all these strengths that we have. You know obviously I was a part of that too in terms of when I was on the product side, but I just think that this is the right time, both from the kind of the things that we've built up internally in terms of products and technology, but also from the industry perspective like we discussed, the nature of the workloads that are going to be driving computing from the end to end. It is fundamentally different than what it was a few years ago, for example, somebody asked the question could you have done an ACAP prior to 7nm. Well we could, it would be in certain respects maybe a little less powerful, more costly perhaps, but the kind of the coming together of all those things together at 7nm makes this just the right time for the company to take this more quantum leap. It's not about legacy, but of course it also remains to be seen how long I will be CEO!
IC: The march into the data center is a story that a number of companies have been saying over the past few years. So the topic of the day is obviously the new hardware that is still a year away - what makes now the right time to discuss the ACAP and the approach into the data center?
VP: From a compute perspective we are already in the data center - Amazon today deploys our 16nm products, and some others even deploy 20nm, but most of it for now is 16nm and I think 16nm will certainly continue for the next couple of years as we bring in our 7nm product. The last year was the first year we really made very significant progress in the data center, not only because that was the year that FaaS [FPGA as a Service] got announced and deployed, but because of other engagements that we have such as storage as well as in networking. I think the number of businesses that we are engaged with all the hyperscalers as well as some OEMs. The models that people are doing are not only FPGA servers, but also just internal private acceleration.
I think you know we started before 7nm and I think that's a good thing because if we were starting just at 7nm and people didn't even have the familiarity with us then why they would use our platform to begin with? We would have an even bigger hurdle but I think that now that 16nm will get the mindset out there, it will hopefully give people a little bit of understanding as to what we offer and why it has value.
7nm ups the level of our platform significantly, and I think during the 7nm time frame is certainly I think we will have growth and a reflection of our product portfolio.
IC: Is there no desire to use half nodes like 10nm?
VP: We didn't do that for a variety of reasons. For one, if I just looked at our product cadence and when 10nm would come out, it was not a good lineup. But the other thing is that the delta between 10nm and 16nm, compared to 7nm and 16nm, is much less significant. Let’s face it, as you follow a lot of tech, it is really the handsets that need this almost annual thing. Because Moore's Law is no longer on track, everybody is coming out with variants to their process: plus, plus plus, and this-that –and-the-other-thing. This is largely because the handset guys, and every year they have to come out with something because every year there is a Christmas holiday and a consumer business.
So for not only us, but other kinds of more high-end computing applications, they don't refresh every year so we've just picked the right technology so.
IC: For the ACAP, it feels as if Xilinx is moving some way between a FPGA and an ASIC, having hardened controllers on silicon and implementing more DSPs and application processors. Is that a challenge to your customers that are currently purely FPGA based?
VP: Well I first want to challenge that a little bit. I think in a sense you are looking at an implementation perspective, and if you are saying that there are blocks that are implemented with ASIC flow, that is true. But even the things that we are implementing with ASIC flow, we are still maintaining some level of hardware programmability. At the end of the day, we are going to play to our strengths, right? Our strength is our understanding in how to enable people to use the power of hardware programmability. We crossed the chasm of being only hardware programmable to being software and hardware programmable.
It's a little bit hard to talk the new product without pre-announcing some of the features, but I talked about hardware software programmable engine, which won't be an engine that is customizable down to a single bit, but it will have notions of some granular data paths and memory and things like that, and it has some overlay of an instruction set, but while most people won't program it, it still has hardware programmability. We’re not just somebody else coming out with a VLI doping multi core architecture because then what is the difference between us and someone else right? There's always going to be that secret sauce.
Also, there's no value for us in creating a custom CPU, like an Arm SoC - not for the applications we want. If we were doing a server class processor you would need to take an Arm architecture license and create your own micro-architecture, but there's no value in us doing that, but we have embedded application cores in our ACAP. So, some of the things are implemented from an ASIC flow and is just like a hard ASIC as any other one. But also I'm not going to create a GPU architecture either - some of our products have a GPU in it, a very low GPU more for 2D kind of interfaces and stuff like that.
But when there is heavy lifting, like acceleration, there I will always try to find a way where we could add value in terms of the programmability. For the challenge of the programmability to the customer, we have already crossed that chasm once with Zynq, so it doesn’t need be a hardware expert anymore to use these. There is a software team working with that and now with FDSoC and some of these design environments you don't absolutely need the hardware design anymore because there's just many more systems and software developers than there are FPGA designers, right?
IC: One of the things that came out today with was your discussion on how implementing libraries and APIs to essentially have the equivalent of CUDA stack for the Xilinx product line. Training people on that sort of level of API is something that NVIDIA has done for the last 8 years. Is that the sort of training program that is appealing to help improve the exposure of these types of products?
VP: I'm not the one to get into the details, but I actually have people on my team that feel like we're even easier to use than CUDA! But I what I would say is in general we are going to try enable more innovators, and if the innovators are used to things like the TensorFlow Framework, we'll do that. You know we have even enabled python, right? Because of younger programmers it is probably more relevant to do python than have to do something like C or C++ or something that. But in other areas people still develop to those other languages, so you know it is really all about us trying to enable more innovators with the framework and development experience they are used to – we are going to try match those as much as possible
At some level there isn't going to be a compile step, which isn't going to be like software. Because it is physics, we are actually not going to have to do something but it doesn't mean that they always have to do it. As you can imagine when you are in a development cycle you could do very quick compiles and then work out when you're doing. Here's the production thing though - you could take longer with a new platform, but we're trying to minimize that. The general mantra is to try and make the experience like any other software target or platform, but getting a custom hardware engine underneath the hood and they don't have to really muck with it.
IC: I mean with something like the APAC it is clear that cloud providers would other this as a service, as they want a multi-configurable system for their users. But beyond the cloud providers, are people actually going to end up deploying it in their own custom solutions -with this amount of re-configurability, do you actually see them using it in a ‘reconfigurable way’ rather than just something that's fixed and deployed?
VP: That's a good point - I think that it wouldn't be necessarily that everybody dynamically configures it when it is deployed, but we do see that and I'll give you an example as far away from the cloud as you could imagine. There are a lot of people in testing instruments. Now some test instruments are kind of like a reconfigurable panel, with people moving the panels that have hard knobs, so if you can virtualize the interface when they select things, they can just reconfigure it to do a different thing with some of the guts of the electronics - for example, a situation where eight analogue testing components are being rotated. Like anything else, they try to move to digital as soon as possible, so the ability to completely reconfigure something is vital.
IC: Would you consider that a mass scale deployment, rather than prototyping?
VP: No it's not prototyping – it is a tester, but somebody using it on the bench. If they want the scope to do this or that, they can change the functionality on the fly. It is deployed, right, up and down in many engineering workshops.
That's a good example, especially if we look at telecommunications. 5G is not really a unified global standard – it is multiple standards that vary by geography, by region, and because it's going to be around connectivity there are new band radios, as well as different form factors and different regions and even down to different large customers that demand certain things. So our traditional customers in communications still need some degree of customization. One of them said to me that last year they had to do on average 2 + radios a week, because they always need to customize something about the frequency bands or the wavelengths or something about the radio. So even if it is stuff they have already built, they always still have to have some degree of customization now. Within that deployment, there will be a customer that has to change things? I could tell you actually in aerospace and defense, there are applications for security reasons, people actually want to reconfigure dynamically when it's deployed.
So it's a range - some will reconfigure when it's deployed, some will do it when they are trying to scale their product line, but in the cloud whether it's public or private I think, clearly there will be different workloads right.
IC: The ACAP schedule is for tape out this year, with sales and revenue next year. Aiming at 7nm means you're clearly going for the high end, and previously you said 16nm would have had deficits compared to using 7nm. In time do you see this sort of type of ACAP moving down into the low end?
VP: I'll put it this way - our selection of technology is primarily based on what is the right match for the end solutions. Last year we did some new 28nm tape outs, despite the fact that we had finished our original plans at 28nm a long time ago. But for the more cost and power sensitive segments, and we're talking about $10 a blow kind of segments, you don't need 20nm or 16m, but it might be kind of hard to reach that price point too in a lot of cases. So, we will still do some of those older technologies for other products, but there are some new IP blocks that are brand new only in 7nm. It is quite costly to implement a new IP block and the technology after all. But if there is a big enough opportunity and we have important enough customers or a growth segment, we would do it.
IC: Any desire to make products like these act as hosts, rather than just attached devices or attached accelerators?
VP: Well yeah, we have the Zynq product line in the data center today, and pretty much people are using pure FPGAs in the compute side and in some other areas people are finding use for a remote processor that doesn't have to do as much heavy lifting - in fact in most cases it's doing some sort of management function. So, I’d say yes, today because it's mainly a pure accelerator and it needs a local host, but I think in the future we will see that change.
I mean basically you don't want to have to go back to that host – it is costly on multiple levels and if you have a host that can do it locally, such as the embedded Arm core, you don't want to use that expensive CPU cycle that's remote, as overall you lose performance.
IC: Back at Supercomputing we saw a research development project about just network attached accelerators, so not even with a host. You just attach it in and...
VP: Exactly, and actually that's how Microsoft has eliminated this issue – it is connected to the network and so they can talk peer to peer to others. They don't necessarily even have to go to the CPU at all, but if they do, generally it's just like a CPU process - it's more like a batch, do a big job, come back, so you don't have to have lots of communication.
IC: Obviously the biggest competitor you have in this space (Intel/Altera) has different technologies for putting multiple chips together (EMIB). Is Xilinx happy with interposers? Is there research into embedded package interconnects?
VP: I think it's all about what's the problem trying to solve. If you need massive amounts of bandwidth, the interposer, is the way to go, based on TSMCs CoWoS technology, which we call SSIT. You know InFO (TSMC’s Integrated Fan Out) is much more limited, so if you don't need huge amounts of bandwidth and you don't need to connect many chips, InFO could be interesting.
It's all about technology. I mean we're not like religious about it, and if you could do what you want with InFO or a traditional MCM then OK, but it is really all about what you're trying to do. But yeah, we are quite comfortable because generally for the problems we are trying to solve, there is a need for a lot of bandwidth and fairly low latency. We've been doing this since 28nm, so you know 16nm is like our 3rd generation and 7nm will have both monolithic chips and we'll still use interposer technology. I mean that's why we could do this chip right [holds up Virtex UltraScale+ with HBM].
IC: There's a bit of a kerfuffle with your competitors about who was the first to have HBM2, comparing announcing versus deploying.
VP: I was just going say! Even before they were part of Intel, Altera was pretty good about announcing and showing technology demonstrations, but when you look when you can actually deploy something into production that is another story. You know more recently we've actually both announced earlier and deployed earlier, but in this particular case, okay they claim that their getting out there sooner, but we'll see who gets out there in production. Because remember, EMIB has never seen production in anything yet,, so you know, this is like I said - our 3rd generation of production and high volume is very reliable for our kind of customers.
IC: With Zynq already using Arm processors, will the ACAP application processors also be Arm, or if they can have RISC-V cores, or others?
VP: We're still going with the ARM architecture. I think it is the broadest set of architecture for embedded applications and, you know in the case of the heavy-duty kind of acceleration where the host is off board, it could be anything. For the off board host, we interface with OpenCAPI, and obviously we're driving CCIX, and we are working with PCIe and at SC17 we had a demo with AMDs EPYC server together with our processor.
IC: For Project Everest, can you comment on security and virtualization?
VP: We've been working on security quite a bit. We traditionally leverage a lot of things into the more pure FPGA but with Everest, as they all have SoC capabilities, everything will be leveraged together. I think everybody knows that with the world becoming very smart and connected it means the attack surface is pretty much the entire world. Since we are in applications like automotive, they care a lot about that. The hyper-scalers care a lot about security as well, except they don't have to worry about physical tampering as much, but in the automotive we actually have to care about physical tampering as well as software attacks. So you know we do TrustZone, we have all kinds of internal boots so that people won't see memory traffic over interfaces. We have stuff for defeating DPA, we have all kinds of things. In fact, if we detect anomalies, it auto shuts down. Because we do get a lot of very secure customers, such as the aerospace and defense industries, we work with the agencies that are very sensitive to this thing.
We also get involved in safety, and it is interesting because they are not exactly the same thing - safety is about ensuring the thing doesn't harm people, and security is about people not harming the thing. But we're finding more and more applications that care about both. An example of that might be robotics, drones, etc.
One other thing about security that I think is really good about our technology - the security people really like it if you could do redundancy, but implement it different ways. Because of the diversity, if you attack one thing, it doesn't mean you can attack the other. The fact we can run things in the applications processer, the real time R5's, and in the fabric, means the level of diversity you can get and then poll to see if you get the same results is quite a bit richer than you know what fixed functions chip could do.
IC: With Everest, with things like the on-chip interconnect, is that Xilinx's IP?
VP: The NOC that is connected to all the blocks, that is our design. Internally in our SoC, we actually licensed a 3rd party NOC. But the reason we chose to the main NOC proprietary is because the requirements we needed for this adaptability a little different. If it was look like everything was an ASIC block, we could in theory use a licensed IP, but there are some unique things to do when you have these very flexible things that are being implemented in lots of different ways.
IC: So what sort of different things can you do?
VP: This is probably where I should get one of my architects! It's probably best for me to get into comparing some of these architectures from other companies - we looked at them, and as I mentioned we actually use one of them within the SoC block. But we found that we can't use it for our own and they didn’t have the tools and things that you have to give people in order to interface to that.
IC: Most of our readers are familiar with the ARM interconnects because of mobile chips, so I could assume that you'll have one of those inside the SoC block.
VP: We do, and it in fact most of our IP, even in our soft IP blocks, most of the interfaces are very AXI like. We help drive some of the design - they had heavyweight AXI for some blocks, and you really don't need that heavyweight in a sense, so we work with them on the streaming, using a more slim AXI. What I should really do is a good point that when we do announce the details of the ACAP we can compare and contrast the differences that help people because they have a reference point.
Many thanks to Victor Peng and his team for their time.
Also thanks to Gavin Bonshor for transcription.
16 Comments
View All Comments
flgt - Monday, March 19, 2018 - link
As general purpose computing starts to run out of steam whoever can bring developer friendly programmable logic to traditional software programmers will take the lead, Intel or Xilinx. It’s a tough problem to solve. As a more traditional FPGA user, I hope we don’t get left behind by the data center market though.flgt - Monday, March 19, 2018 - link
I did want to say this is a good approach. Have expert HW engineers at Xilinx build the most common accelerators for software programmers to invoke. There is so much logic available now that no one will blink an eye at the cost of the abstraction layer.bryanlarsen - Monday, March 19, 2018 - link
As a more traditional FPGA user, do you not think that much of this initiative will benefit you as well? It seems to me that converting hard blocks into something "semi-solid" will be good for you too.flgt - Monday, March 19, 2018 - link
You're right, up to this point it has been a win-win. We've gotten huge logic increases at a fraction of the cost from 10 years ago, and Xilinx has supported us well. My worry was this statement below from the article. If you're a low volume customer that needs a device for 10 years you might be SOL in the new era of semiconductor vendor consolidation. They're all chasing the big markets to pay for their billion dollar developments: 5G, datacenter/networking, autonomous vehicle, etc. Also, sometimes you need to know what's inside these vendor HDL/SW black boxes so using them get's tricky. That being said, FPGA's naturally scale across a number of broad markets and the vendors don't have to make as many tough product line choices as ASIC's."Xilinx’s key headline that it is focusing on being a ‘Data Center First’ company. We were told that the company is now focused on the data center as its largest growth potential, as compared to its regular customer base, the time to revenue is rapid, the time to upgrade (for an FPGA company) is rapid, and it enables a fast evolution of customer requirements and product portfolio."
modport0 - Tuesday, March 20, 2018 - link
Xilinx's HLS capabilities give decent results especially if you're at the prototyping stage for processing algorithms. The HLS tools from Mentor and Cadence are highly praised by ASIC developers but if your company is able to justify 6 digit license costs per seat per year. Intel's HLS tool just came out of beta so it's probably not worth using right now. Intel's OpenCL framework has been out for a while but I haven't personally used it.For their data center oriented framework, Intel has their OPAE (https://01.org/OPAE) software/driver stack coupled with the rest of their acceleration stack (https://www.altera.com/solutions/acceleration-hub/... Their hope is that the SW dev just writes the user application and the RTL person (maybe the same person) just creates accelerator modules (via HDL, HLS or OpenCL).
Who knows if any of this ends up being used in data centers beyond research/small deployments. Google and I'm sure others are already looking into ASICs and skipped on FPGAs due to various reasons.
jjj - Monday, March 19, 2018 - link
EMIB gets complicated when you have to align more than 2 chips and that seems to be why OSATs don't quite like such solutions and are looking at other budget options.Krysto - Monday, March 19, 2018 - link
I don't know what's the future of FPGAs, but if Xilinx wanted to beat Altera, now it's the time, as Intel will surely bungle it. Intel already seems to be re-focusing on its own GPUs, so I bet it's already regretting buying Altera and thinking how to get rid of it in 2-3 years without looking like complete fools (again).amrs - Monday, March 19, 2018 - link
It would be interesting to know just how much data center business Xilinx actually does. As a point of comparison, Dini Group's president said last year they've had zero sales with their data center oriented products.BTW, it's Zynq, not Zync.
ProgrammableGatorade - Tuesday, March 20, 2018 - link
Dini makes great hardware but it's expensive, which could be skewing their perception of the market. If you're just buying one of their ASIC emulation boards it's easy to justify the cost, but if you're buying dozens or hundreds of boards you'll probably go to AlphaData, Bittware, or even Xilinx directly.Space Jam - Monday, March 19, 2018 - link
>The idea here is that for both compute and acceleration, particularly in the data center, the hardware has to be as agile as the software.This makes no sense given the properties of hardware development not catering to the adaptive, evolving nature of development in an agile philosophy but I suppose throwing everything and the kitchen sink at a problem is as close as you can get to an 'agile' solution in hardware development.
>In recent product cycles, Xilinx has bundled new features to its FPGA line, such as hardened memory controllers supporting HBM, and embedded Arm Cortex cores to for application-specific programmability.
First page, after "Arm Cortex cores" that should be a too not to.