May 26, 2022

At its annual GTC conference for AI developers, Nvidia announced its next-generation Hopper GPU architecture and the Hopper H100 GPU, as well as a new data center chip that combines a GPU with a powerful CPU, which Nvidia calls the “Grace CPU Superchip.” (Not to be confused with the Grace Hopper Superchip.)

The H100 GPU

As usual, Nvidia is launching a number of new and updated technologies with Hopper, but for AI developers, perhaps the most important is the architecture’s focus on the Transformer model, the machine learning technique behind many use cases that powers models like GPT-3 and BERT. The new Transformer Engine in the H100 chip promises to speed up model training by up to six times, and because the new architecture also includes Nvidia’s new NVLink Switch system for connecting multiple nodes, large server clusters built on these chips can scale up to support massive networks at lower overhead.

“It can take months to train the largest AI models on today’s computing platforms,” Nvidia’s Dave Salvator wrote in today’s announcement. “That’s too slow for businesses. AI, HPC and data analytics are growing in complexity, with some models, like large language models, reaching trillions of parameters. The NVIDIA Hopper architecture is built from the ground up to accelerate these next-generation AI workloads, with massive compute and fast memory to handle growing networks and datasets.”

The new Transformer Engine uses custom Tensor Cores that can mix 8-bit precision and 16-bit half-precision as needed while maintaining accuracy.

Nvidia Hopper GPU (Image credit: NVIDIA)

“The challenge for models is to intelligently manage precision to maintain accuracy while gaining the performance of smaller, faster numerical formats,” Salvator explains. “The Transformer Engine makes this possible with custom, NVIDIA-tuned heuristics that dynamically choose between FP8 and FP16 calculations and automatically handle re-casting and scaling between those precisions in each layer.”
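Nvidia hasn’t published the heuristics themselves, but the general idea of choosing a numerical format per layer based on quantization error can be sketched in a few lines. The sketch below is purely illustrative (it simulates reduced-precision rounding in software; the `quantize`, `choose_precision` names and the error tolerance are my own, not NVIDIA APIs):

```python
import numpy as np

def quantize(x, mantissa_bits):
    # Simulate a reduced-precision float format by rounding each value's
    # mantissa to the given number of bits, then reconstructing the value.
    m, e = np.frexp(x)
    scale = 2.0 ** mantissa_bits
    return np.ldexp(np.round(m * scale) / scale, e)

def choose_precision(x, tol=1e-3):
    # Hypothetical heuristic: pick FP8 (E4M3-like, 3 mantissa bits) when
    # the worst-case relative quantization error for this tensor stays
    # below tol, otherwise fall back to FP16 (10 mantissa bits).
    err = np.max(np.abs(x - quantize(x, 3)) / (np.abs(x) + 1e-12))
    return "FP8" if err < tol else "FP16"

# Tensors of exactly representable values can safely drop to FP8;
# tensors with values that round poorly stay in FP16.
print(choose_precision(np.array([1.0, 2.0, 0.5])))   # FP8
print(choose_precision(np.array([3.14159, 0.1])))    # FP16
```

The real Transformer Engine does this in hardware per layer, and also rescales values into the narrow FP8 dynamic range; this sketch only captures the precision-selection half of that idea.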

The H100 GPU features 80 billion transistors and will be built using TSMC’s 4nm process. It promises 1.5x to 6x speedups over 2020’s Ampere A100 data center GPU, which used TSMC’s 7nm process.

In addition to the Transformer engine, the GPU will also include a new confidential computing component.

The Grace (Hopper) Superchips

Grace Superchip (Image credit: NVIDIA)

The Grace CPU Superchip is Nvidia’s first dedicated data center CPU. The Arm Neoverse-based chip will feature 144 cores with 1 terabyte per second of memory bandwidth. It actually combines two Grace processors connected over an NVLink interconnect, an architecture not unlike that of Apple’s M1 Ultra.

The new CPU, which will use fast LPDDR5X memory, will be available in the first half of 2023 and promises to deliver up to 2x the performance of traditional servers. Nvidia estimates the chip will score 740 on the SPECrate 2017_int_base benchmark, putting it in direct competition with AMD’s and Intel’s high-end data center processors (some of those score higher, but at the cost of lower performance per watt).

“A new type of data center has emerged: AI factories that process and refine mountains of data to produce intelligence,” said Jensen Huang, founder and CEO of NVIDIA. “The Grace CPU Superchip offers the highest performance, memory bandwidth and NVIDIA software platforms in one chip and will shine as the CPU of the world’s AI infrastructure.”

In many ways, this new chip is a natural evolution of the Grace Hopper Superchip and Grace CPU the company announced last year (yes, the naming is confusing, especially since Nvidia also referred to the Grace Hopper Superchip as Nvidia Grace). The Grace Hopper Superchip combines a CPU and GPU into a single system on a chip. That system, also due in the first half of 2023, will feature a GPU with 600GB of memory for larger models, and Nvidia promises 30x higher aggregate memory bandwidth compared to traditional server GPUs. These chips, Nvidia says, are designed for “giant-scale” AI and high-performance computing.

The Grace CPU Superchip is based on the Arm v9 architecture and can be configured as a standalone CPU system or as a server with up to eight Hopper-based GPUs.

The company says it is working with “leading customers in HPC, supercomputing, hyperscale and cloud computing,” so there’s a good chance these systems will arrive at a cloud provider near you sometime next year.
