Google puts Nvidia on notice with the launch of Trillium, its rival AI chip, while promising Nvidia H200 Tensor Core GPUs within days
- Trillium offers 4x training boost and 3x inference improvement over TPU v5e
- Doubled HBM capacity and ICI bandwidth to support LLMs
- Scalable up to 256 chips per pod, ideal for extensive AI tasks
Google Cloud has launched its latest TPU, Trillium, the sixth-generation model in its custom AI chip family, designed to power advanced AI workloads.
First announced in May 2024, Trillium is designed to perform large-scale training, tuning, and inference with improved performance and cost-efficiency.
The release is part of Google Cloud’s AI Hypercomputer infrastructure, which combines TPUs, GPUs and CPUs with open software to meet the increasing demands of generative AI.
A3 Ultra VMs are arriving soon
Trillium promises significant improvements over its predecessor, TPU v5e, with a fourfold improvement in training performance and an up to threefold increase in inference throughput. Trillium delivers twice the HBM capacity and doubled Interchip Interconnect (ICI) bandwidth, making it particularly suitable for large language models such as Gemma 2 and Llama, as well as for compute-intensive inference applications, including diffusion models such as Stable Diffusion XL.
Google is also keen to highlight Trillium’s energy efficiency, claiming a 67% improvement over the previous generation.
Google says its new TPU has performed significantly better in benchmark tests, training models such as Gemma 2 27B and Llama 2 70B up to four times faster. For inference tasks, Trillium achieved up to three times the throughput of TPU v5e, and it did particularly well on models that require extensive compute resources.
According to Google, scaling is another of Trillium’s strong points. Up to 256 chips can be connected in a single high-bandwidth pod, and pods can be linked into deployments of thousands of chips over Google’s Jupiter data center network, providing near-linear scalability for extensive AI training tasks. With Multislice software, Trillium maintains consistent performance across hundreds of pods.
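To put the pod figures in perspective, the arithmetic can be sketched in a few lines of Python. This is only an illustration based on the 256-chips-per-pod figure quoted above; the function names `total_chips` and `pods_needed` are made up for this example and do not correspond to any Google API, and real deployments may not fill every pod.

```python
import math

# Maximum chips per Trillium pod, as stated in the article.
CHIPS_PER_POD = 256


def total_chips(pods: int) -> int:
    """Return the chip count for a given number of fully populated pods."""
    if pods < 0:
        raise ValueError("pods must be non-negative")
    return pods * CHIPS_PER_POD


def pods_needed(chips: int) -> int:
    """Return the minimum number of pods required to host `chips` chips."""
    if chips < 0:
        raise ValueError("chips must be non-negative")
    return math.ceil(chips / CHIPS_PER_POD)


if __name__ == "__main__":
    # Hypothetical deployment sizes, chosen only for illustration.
    for pods in (1, 4, 100):
        print(f"{pods} pod(s): {total_chips(pods)} chips")
    print(f"Pods needed for 1,000 chips: {pods_needed(1000)}")
```

On these assumptions, a deployment spanning 100 pods would hold 25,600 chips, which is the kind of scale the "hundreds of pods" claim implies.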
Alongside Trillium, Google also announced A3 Ultra VMs powered by Nvidia H200 Tensor Core GPUs. Scheduled for preview this month, they will give Google Cloud customers a powerful GPU option within the tech giant’s AI infrastructure.