AMD is dropping the APU design in favor of the GPU-only MI355X for advanced AI infrastructure
- MI355X leads AMD’s new MI350 series with 288 GB of memory and liquid-cooled performance
- AMD is dropping APU integration in favor of GPU flexibility at rack scale
- FP6 and FP4 data types underline the MI355X’s inference-optimized design choices
AMD has unveiled its new MI350X and MI355X GPUs for AI workloads at its 2025 Advancing AI event, offering two options built on the latest CDNA 4 architecture.
Although both share a common platform, the MI355X stands apart as the higher-performance, liquid-cooled variant designed for demanding, large-scale deployments.
The MI355X supports up to 128 GPUs per rack and delivers high throughput for both training and inference workloads. It carries 288 GB of HBM3E memory and 8 TB/s of memory bandwidth.
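For a sense of what 288 GB and 8 TB/s mean in practice, the rough sketch below estimates how large a model fits on a single GPU and the bandwidth-bound ceiling on single-stream decode speed. The 70B-parameter example model and the assumption that decoding is purely memory-bound are illustrative choices, not figures from AMD.

```python
# Back-of-envelope numbers based on the article's figures (288 GB HBM3E,
# 8 TB/s bandwidth). Real throughput depends on batching, kernels, and
# parallelism; this treats single-stream decode as purely bandwidth-bound.
HBM_CAPACITY_BYTES = 288e9
HBM_BANDWIDTH_BYTES_S = 8e12

def max_params(bytes_per_param: float) -> float:
    """Largest parameter count that fits entirely in HBM at a given precision."""
    return HBM_CAPACITY_BYTES / bytes_per_param

def decode_tokens_per_s(n_params: float, bytes_per_param: float) -> float:
    """Upper bound on decode speed: every token streams all weights from HBM once."""
    return HBM_BANDWIDTH_BYTES_S / (n_params * bytes_per_param)

for name, bpp in [("FP16", 2.0), ("FP8", 1.0), ("FP6", 0.75), ("FP4", 0.5)]:
    print(f"{name}: fits ~{max_params(bpp) / 1e9:.0f}B params; "
          f"~{decode_tokens_per_s(70e9, bpp):.0f} tok/s ceiling for a 70B model")
```

Under these assumptions, dropping from FP16 to FP4 both quadruples the model size that fits in memory and quadruples the decode ceiling, which is why capacity and low-precision support matter together for inference.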
GPU-only design
AMD claims the MI355X delivers up to 4x the AI compute and 35x the inference performance of its previous generation, thanks to architectural improvements and a switch to TSMC’s N3P process.
Inside, the chip packs eight compute dies with 256 active compute units and a total of 185 billion transistors, a 21% increase over the previous model. Each die connects through redesigned I/O dies, reduced from four to two, which doubles internal bandwidth while lowering power consumption.
The MI355X is a GPU-only design, dropping the combined CPU-GPU APU approach used in the MI300A. AMD says this decision better supports modular deployment and flexibility at rack scale.
It connects to the host over a PCIe 5.0 x16 interface and communicates with peer GPUs through seven Infinity Fabric links, reaching more than 1 TB/s of GPU-to-GPU bandwidth.
Each HBM stack is paired with 32 MB of Infinity Cache, and the architecture supports newer lower-precision data types such as FP4 and FP6.
The MI355X executes FP6 operations at FP4 rates, a feature AMD highlights as favorable for inference-heavy workloads. It also offers 1.6x the HBM3E memory capacity of Nvidia‘s GB200 and B200, although memory bandwidth remains comparable. AMD claims a 1.2x to 1.3x inference performance advantage over Nvidia’s top products.
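The appeal of FP6 and FP4 is the trade between memory footprint and numerical error. The toy sketch below uses simple integer-grid rounding rather than AMD’s actual FP4/FP6 floating-point encodings, just to show how each drop in bit width shrinks storage while increasing quantization error:

```python
import numpy as np

def fake_quantize(x: np.ndarray, bits: int) -> np.ndarray:
    """Round x onto a symmetric signed grid with 2**(bits-1)-1 levels per sign,
    then map back to float so the rounding error is visible."""
    levels = 2 ** (bits - 1) - 1          # e.g. 7 positive levels for 4-bit
    scale = np.abs(x).max() / levels
    return np.round(x / scale) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal(4096).astype(np.float32)  # stand-in for a weight tensor
for bits in (8, 6, 4):
    err = np.abs(w - fake_quantize(w, bits)).mean()
    print(f"{bits}-bit grid: {bits / 16:.2%} of FP16 size, mean abs error {err:.4f}")
```

Running the MI355X’s FP6 at FP4 rates means users can take the accuracy benefit of the extra two bits without paying a throughput penalty relative to FP4, under this trade-off.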
The GPU draws up to 1,400 W in its liquid-cooled form and delivers higher performance density per rack. AMD says this improves TCO by allowing users to scale without expanding their physical footprint.
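Those density claims translate into serious facility-level power. A quick arithmetic check using only the article’s own figures (1,400 W per GPU, up to 128 GPUs per rack):

```python
# Rack-level power arithmetic from the article's figures; excludes CPUs,
# networking, and cooling overhead, so treat it as a lower bound.
GPU_TBP_W = 1400          # liquid-cooled MI355X board power
GPUS_PER_RACK = 128       # maximum cited per rack

rack_gpu_kw = GPU_TBP_W * GPUS_PER_RACK / 1000
print(f"GPU board power alone: {rack_gpu_kw:.1f} kW per rack")  # 179.2 kW
```

At roughly 180 kW of GPU power in a single rack, liquid cooling is effectively a requirement, which is why the 128-GPU configuration is offered only in the liquid-cooled form.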
The chip fits standard OAM modules and is compatible with UBB-platform servers, which speeds up deployment.
“The world of AI isn’t slowing down, and neither are we,” said Vamsi Boppana, SVP of AMD’s AI Group. “At AMD, we’re not just keeping pace, we’re setting the bar. Our customers demand real, deployable solutions that scale, and that’s exactly what we’re delivering with the AMD Instinct MI350 series. With advanced performance, massive memory bandwidth, and flexible, open infrastructure, we’re empowering the industry.”
AMD is planning to launch its Instinct MI400 series in 2026.