AMD takes hybrid approach to engineering the cloud’s future

AMD CIO Hasmukh Ranjan sits at the cloud’s crossroads. As a chipmaker, AMD is a vital supplier for the public cloud’s compute engine, and among Ranjan’s key remits is to support the engineering of semiconductors that power the cloud. But as a consumer, Ranjan, like all CIOs, must decide where best to place his company’s workload bets. And for AMD’s most critical engineering applications, the answer remains its own data centers — not the cloud.

That’s because chipmakers like AMD require enormous numbers of compute cores, vast amounts of memory, and petabytes of storage to run their design applications. Still, one year into his post, Ranjan says nearly 95% of AMD’s business applications run on public clouds. It’s just that the mammoth engineering applications AMD uses to design its processors won’t run in the cloud, Ranjan says.

“For engineering, for our sweet spot, cloud providers don’t have those high-end machines we’re looking for,” he says, noting that AMD’s design applications require up to 64GB per core “and we stretch up to 2 to 4 terabyte systems as well.”

And those massive requirements continue to grow in three vectors — “variety, velocity, and volume,” Ranjan says, alluding to AMD’s broadening product portfolio, the high speed of AMD’s design work, and the vast amount of data generated in the chip design process. 

Because of this, Ranjan expects AMD’s digital infrastructure will remain hybrid for some time, with business processes in the cloud and engineering on-premises until massive HPC workloads are widely supported on the public cloud. Gartner analyst Sid Nag, however, points out that cloud providers such as Amazon Web Services offer instances that go up to 224 cores, with companies already running HPC workloads in the cloud.

The shifting nature of chip design

Not all of AMD’s chip engineering process is performed on-premises, Ranjan says, noting that between 10% and 15% of AMD’s computations occur on the cloud, typical for the industry.

Because of the engineering requirements, most chipmakers work with electronic design automation (EDA) vendors such as Cadence Design Systems, Synopsys, and Siemens on-premises from start to finish — serving the final blueprints of the designs directly from the data center to the manufacturing partners and fabs. This tightly integrated process also guarantees data integrity and security.

But that is changing. AMD’s Ranjan points to Marvell Semiconductor’s partnership with AWS, announced in February, as an indicator that semiconductor companies want to use the cloud more in all aspects of their production. According to the announcement, Marvell selected AWS as its cloud provider for EDA in order to take a cloud-first approach to chip design.

“But this industry has been a bit slow in adopting the public cloud for technical reasons, and commercial ones, too,” Ranjan says. “For high-end systems, the pricing difference between ground and cloud can be very, very steep.”

While chip design and manufacturing have not changed much, analysts say that all semiconductor companies have tight partnerships with cloud providers. Together, for instance, they have designed and built specialized HPC cloud services to accommodate some workloads for this very important vertical.

George Westerman, a senior lecturer at MIT Sloan School of Management and founder of the Global Opportunity Initiative, notes that the decision whether to run engineering designs on-premises or on HPC clouds hinges on the same factors for any enterprise: cost of access, cost of data-transmission delays, and cybersecurity concerns.

HPC clouds from mainstream providers and chip design services such as Cadence, Synopsys, and Marvell are in essence industry clouds for the semiconductor industry. The only distinction is that chipmakers work directly with their manufacturing partners or fabs to move on-premises engineering designs into production.

“The semiconductor side is larger than what the cloud side can handle today,” says Risto Puhakka, director of products at TechInsights, a technology manufacturing consulting firm in San Jose, Calif. “Those data flows are incredibly massive and they create a dedicated pipeline to move that data to TSMC to make the masks for their wafer processing.”

Transforming IT

Meanwhile, as Ranjan acquires and nurtures more engineering talent to produce the best products, he is also transforming the digital infrastructure of the company to meet business goals — using the cloud as much as possible. For example, Ranjan says, AMD recently moved its SAP applications to a public cloud.

The CIO is also tasked with ensuring AMD has a massive data repository and analytics to extend sufficient resources to his engineering team. Here, AMD has implemented a leading data lakehouse, automated applications, and AI algorithms on AWS, Microsoft Azure, Google Cloud Platform, and Oracle Cloud. All of this aligns with AMD’s C-suite aspirations to better enable HPC workloads for all cloud customers through chip advancements, something Ranjan is tackling by providing his engineers with state-of-the-art hybrid platforms on which to design the chips.

All seems to be flowing in a positive direction, Ranjan says.

“The bulk of computations happen from our large data centers in the US — one in Atlanta and the rest sprinkled around the world,” he says, adding that 54% of AMD’s server fleet is less than two years old. “We are very current. That enables not only very efficient computing but that’s a sweet spot for sustainability as well.”

The value of AI

As for business, the semiconductor industry has been on a roller coaster ride of supply and demand over the past decade. Most recently, the pandemic slowed the supply of materials, which in turn slowed the manufacturing process and led to a significant chip shortage. That shortage has abated as of late (except in the automotive industry) as a possible recession has slowed demand for consumer devices, PCs, and servers, Ranjan says.

But what has kept demand strong for companies such as AMD, Intel, and Nvidia is the ongoing growth of cloud hyperscalers and, more recently, the surging interest in machine learning models and platforms such as ChatGPT.

Ranjan’s designers are also big consumers of AI, and those tools are steadily becoming integrated into AMD’s design process. In addition to highly specialized EDA tools from Cadence, Synopsys, and Siemens, the semiconductor workflow requires source code management systems and, increasingly, AI.

“We are trying to supplement that environment with new AI technologies and tools that are available,” he says. “They are in different stages of deployment and some are developed internally and some partner with different AI vendors.”

Rising to the occasion

While Ranjan’s relationship with the cloud may be atypical, his core job is the same as CIOs at all enterprises, he says: aligning IT investment with the business needs and goals of the organization at large.

To do so, Ranjan believes CIOs need to be a half step ahead of the business side in order to scale and support the company’s evolving directives and to provide the infrastructure needed by the various constituencies of their companies, both business and technical.

It’s a balancing act, but the role of the CIO in the C-suite has evolved in step with the industry’s overall digital transformation. The IT department is not just a cost center anymore; quite the contrary, he says.

“The dream is that you create value for your company and you are aligned with your company’s business,” Ranjan says. “The first thing I look for is whether the solutions that I’m creating are 100% aligned with the changing business needs of the company. I aspire to be in that mode on a daily basis.”
