How Amazon's Secret Weapon in Chip Design is Amazon (ieee.org)
- Reference: 0175002499
- News link: https://hardware.slashdot.org/story/24/09/15/1954224/how-amazons-secret-weapon-in-chip-design-is-amazon
- Source link: https://spectrum.ieee.org/amazon-ai
The article argues that while AMD, Nvidia, and other big-name processor companies may also want to control the full stack (purchasing server, software, and interconnect companies) — Amazon Web Services "got there ahead of most of the competition." (IEEE Spectrum interviews Ali Saidi, technical lead for the AWS Graviton series of CPUs, and Rami Sinno, director of engineering at Annapurna Labs, on "the advantage of vertically-integrated design — and Amazon-scale...")
> Sinno: I was working at Arm, and I was looking for the next adventure, looking at where the industry is heading and what I want my legacy to be. I looked at two things: One is vertically integrated companies, because this is where most of the innovation is — the interesting stuff is happening when you control the full hardware and software stack and deliver directly to customers.
>
> And the second thing is, I realized that machine learning, AI in general, is going to be very, very big. I didn't know exactly which direction it was going to take, but I knew that there is something that is going to be generational, and I wanted to be part of that. I already had that experience prior when I was part of the group that was building the chips that go into the Blackberries; that was a fundamental shift in the industry. That feeling was incredible, to be part of something so big, so fundamental. And I thought, "Okay, I have another chance to be part of something fundamental."
>
> [...] At the end of the day, our responsibility is to deliver complete servers in the data center directly for our customers. And if you think from that perspective, you'll be able to optimize and innovate across the full stack. It might not be at the transistor level or at the substrate level or at the board level. It could be something completely different. It could be purely software. And having that knowledge, having that visibility, will allow the engineers to be significantly more productive and deliver to the customer significantly faster. We're not going to bang our head against the wall to optimize the transistor where three lines of code downstream will solve these problems, right...?
>
> We've had very good luck with recent college grads. Recent college grads, especially the past couple of years, have been absolutely phenomenal. I'm very, very pleased with the way that the education system is graduating the engineers and the computer scientists that are interested in the type of jobs that we have for them.
It's an interesting glimpse into the unique world of designing chips at Amazon.
Graviton technical lead Saidi: I've been here about seven and a half years. When I joined AWS, I joined a secret project at the time. I was told: "We're going to build some Arm servers. Tell no one..."
"In chip design, there are many different competing optimization points. You have all of these conflicting requirements, you have cost, you have scheduling, you've got power consumption, you've got size, what DRAM technologies are available and when you're going to intersect them... It ends up being this fun, multifaceted optimization problem to figure out what's the best thing that you can build in a timeframe. And you need to get it right."
[1] https://spectrum.ieee.org/amazon-ai
If you want to cut to the chase (Score:3)
Here's what Amazon's in-house efforts have bought them:
[1]https://www.phoronix.com/revie... [phoronix.com]
If you like what's on offer then you'll like Graviton 4 better than 3 (or 2), but it still gets whipped by EPYC instances. However, if you read between the lines, Graviton 4 is winning on a performance per dollar basis, at least in db workloads.
[1] https://www.phoronix.com/review/aws-graviton4-benchmarks/7
Re: If you want to cut to the chase (Score:1)
At the scale of AWS, the savings of an in-house CPU might be large enough to instead do things like offer EC2 at-cost, which would quickly eat up a lot of marketshare of arch-agnostic code (most modern code that's not heavily optimized in assembler).
Diminishing returns (Score:2)
First, it's not AMD, Nvidia, and Intel that want to control the full stack; it's Amazon, Microsoft, Google, Facebook, etc., where that's trendy. Second, if every major datacenter company built their own hardware stack top to bottom, they would compete with every other major datacenter company for every hardware engineer in existence. Then they have to design every part of the stack top to bottom independently. Then they have to implement it independently. This entire argument hinges on the assumption of a zero
Re: Diminishing returns (Score:1)
Is it? Intel is in a pre-split AMD-GlobalFoundries situation, where they have been unable to turn around their fab business and now need to use external foundries to stop the bleeding of customers for cutting edge. Unless Taiwan gets invaded in the next three years, I don't see Intel getting its house in order. Nvidia seems to mostly be interested in the AI play and less on pure HPC. AMD may use their position to increase margins on their EPYC line. Makes perfect sense to cut that margin with chip designs t
Hey (Score:1)
Amazon? Since you're not going to do anything substantial with publishing, ebooks, audiobooks or comics, could you at least tell authors and readers in advance and stop wasting their time and money?
While you're at it, can we have the .book TLD back? Shouldn't have been allowed to buy it in the first place.
P.S. Eat shit.
Agile/ Continuous Integration of soft/hardware (Score:2)
> Saidi: This might sound weird, but I’ve seen other places where the software and the hardware people effectively don’t talk. The hardware and software people in Annapurna and AWS work together from day one. The software people are writing the software that will ultimately be the production software and firmware while the hardware is being developed in cooperation with the hardware engineers. By working together, we’re closing that iteration loop. When you are carrying the piece of hardware
Re: (Score:2)
The consequence is that if your engineering team misses something fundamental, then you now have it in both hardware and software.