AMD seals another major cloud deal as Oracle adopts thousands of Instinct MI300X GPUs to power new AI supercluster
AMD’s Instinct MI300X is an incredibly powerful AI accelerator, and major cloud companies are starting to integrate it into their infrastructure to support intensive AI workloads.
Vultr recently announced that it has ordered “thousands” of MI300X units, and now Oracle Cloud Infrastructure (OCI) says it has acquired AMD’s hardware for its new OCI Compute Supercluster instance, BM.GPU.MI300X.8.
The new supercluster is designed for massive AI models containing billions of parameters and supports up to 16,384 GPUs in a single cluster. It relies on the same high-speed network fabric as OCI's other accelerator instances, enabling large-scale AI training and inference with the memory capacity and throughput needed for the most demanding tasks. That configuration makes it particularly well suited to LLMs and complex deep learning workloads.
Pre-production testing
“AMD Instinct MI300X and ROCm open software continue to gain momentum as trusted solutions for powering the most critical OCI AI workloads,” said Andrew Dieckmann, corporate vice president and general manager, Data Center GPU Business, AMD. “As these solutions continue to expand into growing AI-intensive markets, the combination will benefit OCI customers with high performance, efficiency and greater system design flexibility.”
Oracle says that testing the MI300X as part of its pre-production efforts validated the GPU’s performance in real-world scenarios. For the Llama 2 70B model, the MI300X achieved a “time to first token” latency of 65 milliseconds and efficiently scaled to generate 3,643 tokens across 256 concurrent user requests. In another test with 2,048 input and 128 output tokens, it delivered an end-to-end latency of 1.6 seconds, which closely matched AMD’s own benchmarks.
The OCI BM.GPU.MI300X.8 instance features eight AMD Instinct MI300X accelerators, delivering 1.5 TB of HBM3 GPU memory with a bandwidth of 5.3 TB/s, combined with 2 TB of system memory and 8 x 3.84 TB of NVMe storage. Oracle will offer the bare-metal solution for $6 per GPU/hour.
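As a rough back-of-the-envelope check, the published figures line up with the MI300X's standard 192 GB of HBM3 per GPU (an assumption here, not stated in Oracle's announcement); the sketch below simply multiplies out the per-instance memory, the hourly cost of a full eight-GPU instance, and how many instances a maximally sized supercluster would comprise.

```python
# Back-of-the-envelope check of the published OCI BM.GPU.MI300X.8 figures.
# HBM3_PER_GPU_GB is an assumed MI300X spec; the other numbers come from the article.

GPUS_PER_INSTANCE = 8
HBM3_PER_GPU_GB = 192          # assumed per-GPU HBM3 capacity for the MI300X
PRICE_PER_GPU_HOUR = 6.00      # USD, per Oracle's published pricing
MAX_CLUSTER_GPUS = 16_384      # maximum GPUs in a single supercluster

hbm3_per_instance_tb = GPUS_PER_INSTANCE * HBM3_PER_GPU_GB / 1000   # ~1.5 TB, as stated
price_per_instance_hour = GPUS_PER_INSTANCE * PRICE_PER_GPU_HOUR    # $48/hour per bare-metal instance
instances_at_full_scale = MAX_CLUSTER_GPUS // GPUS_PER_INSTANCE     # 2,048 instances

print(f"HBM3 per instance:  {hbm3_per_instance_tb:.2f} TB")
print(f"Instance price:     ${price_per_instance_hour:.2f}/hour")
print(f"Full supercluster:  {instances_at_full_scale} instances")
```

In other words, a single bare-metal instance works out to roughly $48 per hour, and a fully built-out supercluster would span on the order of 2,048 such instances.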
“The inference capabilities of AMD Instinct MI300X accelerators add to OCI’s extensive selection of high-performance bare metal instances to remove the overhead of virtualized compute often used for AI infrastructure,” said Donald Lu, senior vice president of software development at Oracle Cloud Infrastructure. “We are excited to offer more choice to customers looking to accelerate AI workloads at a competitive price.”