Following the recent release of the Ultra Ethernet Consortium (UEC) Technical Specification 1.0, AMD announced at its Advancing AI 2025 summit that its UEC-compliant Pensando Pollara 400G AI NIC, which supports up to 400 Gbps of bandwidth, is now entering the deployment phase.
AMD Advancing AI 2025 Highlights #
AMD Advancing AI 2025 is a significant event for AMD in the artificial intelligence field, showcasing its comprehensive strategy across hardware, software, and system solutions.
Hardware Product Releases:
- Instinct MI350 Series GPUs: This includes the MI350X and MI355X, built on the CDNA 4 architecture. They support new data formats like FP4 and FP6, offering a 35x improvement in inference performance compared to the previous generation. The MI355X boasts 288GB of HBM3E memory, capable of running models with up to 520B parameters on a single GPU.
- Fifth-Generation EPYC Processors: These processors, when combined with the Instinct MI350 series GPUs, deliver powerful AI inference performance.
- Pensando Pollara 400G NIC: This network interface card supports up to 400Gbps of bandwidth and features UEC-compliant capabilities such as intelligent packet spraying, out-of-order packet handling with in-order message delivery, and selective retransmission.
Pensando Pollara 400G NIC Explained #
The Pensando Pollara 400G NIC is a network card based on the Ultra Ethernet standard, designed by AMD’s Pensando division (acquired in April 2022 for $1.9 billion). It’s expected to significantly boost communication performance in AI data centers.
Ultra Ethernet (UE) is an open networking standard developed by the Ultra Ethernet Consortium (UEC). It aims to meet the increasingly stringent bandwidth, latency, and scalability demands of AI and HPC clusters. For years, InfiniBand has dominated supercomputer clusters due to its ultra-low latency. However, Ultra Ethernet is gradually emerging as a potential alternative, promising to narrow the performance gap with InfiniBand while retaining Ethernet’s traditional advantages of lower cost, higher flexibility, and a vast ecosystem.
UE employs advanced techniques such as packet spraying, intelligent congestion control, and flexible packet ordering to efficiently handle the massive and complex data flows of AI/HPC workloads. Future revisions of the standard target speeds of 800Gb/s and beyond.
The Pensando Pollara 400G NIC connects to servers via a PCIe Gen 5.0 x16 interface and supports RDMA (Remote Direct Memory Access), which allows data to be transferred directly between the memory of servers or GPUs without CPU intervention.
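To make the RDMA semantics concrete, here is a minimal Python sketch of a one-sided RDMA write. The class and method names (`RdmaNic`, `register`, `rdma_write`) are hypothetical stand-ins modeled loosely on the verbs-style register/remote-key flow, not the Pollara API: the target registers a buffer once, and the initiator's NIC then places bytes into it with no receive-side CPU copy in the data path.

```python
class MemoryRegion:
    """A registered buffer identified by a remote key, as in RDMA verbs."""
    def __init__(self, size):
        self.buf = bytearray(size)

class RdmaNic:
    def __init__(self):
        self.regions = {}     # rkey -> MemoryRegion
        self.next_rkey = 1

    def register(self, size):
        """Target pins a buffer and hands out a remote key (rkey)."""
        rkey = self.next_rkey
        self.next_rkey += 1
        self.regions[rkey] = MemoryRegion(size)
        return rkey

    def rdma_write(self, rkey, offset, data):
        """One-sided write: the NIC places bytes directly into the remote
        region; the target CPU is never involved in the transfer."""
        region = self.regions[rkey]
        region.buf[offset:offset + len(data)] = data

target_nic = RdmaNic()
rkey = target_nic.register(64)                 # target registers 64 bytes
target_nic.rdma_write(rkey, 0, b"gradients")   # initiator writes directly
print(bytes(target_nic.regions[rkey].buf[:9])) # -> b'gradients'
```

The key point the sketch illustrates is that after registration, the target side does nothing per transfer; this is what removes the CPU from the hot path for GPU-to-GPU traffic.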
Key Technical Specifications of Pensando Pollara 400G NIC:
Feature | Specification |
---|---|
Bandwidth | Up to 400Gbps |
Interface | PCIe Gen5.0 x16 |
AI Acceleration | Intelligent Packet Spraying, In-Order Delivery (Message to GPU), Selective Retransmission, Path-Aware Congestion Avoidance |
Programmability | P4 programmable engine (fully programmable NIC) |
Data Transfer | RDMA (direct memory/GPU transfer) |
Standards Compliance | Ultra Ethernet Consortium (UEC) 1.0 |
Dedicated AI Acceleration Features #
The Pensando Pollara 400G NIC includes dedicated AI acceleration features such as:
- Intelligent Packet Spraying: Intelligently distributes packets across multiple available network paths to optimize bandwidth utilization and avoid local bottlenecks.
- In-Order Delivery (Message to GPU): Ensures that messages and data arrive at the GPU in the order they were sent, which is crucial for certain AI algorithms.
- Selective Retransmission: Only retransmits packets that were actually lost or corrupted, rather than retransmitting the entire data, which helps improve network efficiency.
- Path-Aware Congestion Avoidance: An advanced congestion avoidance mechanism that can identify the status of different paths in the network to make optimal routing decisions.
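The interplay of the first three features can be sketched in a toy simulation (this is an illustrative model, not AMD's implementation): packets are sprayed round-robin across several paths, only the sequence numbers that went missing are retransmitted, and the receiver reassembles the message in order before handing it to the GPU.

```python
import random

def send_message(payload_chunks, num_paths=4, loss_rate=0.2, seed=42):
    rng = random.Random(seed)
    delivered = {}                        # seq -> chunk, as received
    # Packet spraying: distribute packets round-robin across available paths.
    for seq, chunk in enumerate(payload_chunks):
        path = seq % num_paths            # path choice (cosmetic in this toy)
        if rng.random() >= loss_rate:     # packet survives this path
            delivered[seq] = chunk
    # Selective retransmission: resend only the sequence numbers still missing,
    # rather than replaying the whole message.
    missing = [s for s in range(len(payload_chunks)) if s not in delivered]
    for seq in missing:
        delivered[seq] = payload_chunks[seq]   # assume the retransmit lands
    # In-order delivery: reassemble by sequence number before the GPU sees it.
    return b"".join(delivered[s] for s in sorted(delivered))

chunks = [b"all", b"-", b"reduce", b"-", b"data"]
print(send_message(chunks))   # -> b'all-reduce-data'
```

The efficiency argument from the bullet list shows up directly: with a 20% loss rate, only about one packet in five is resent, instead of the full payload.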
Furthermore, the Pensando Pollara 400G NIC features a P4 programmable engine, making it a fully programmable network card. This allows large customers, especially cloud service providers, to customize their advanced congestion control and data flow management algorithms according to their specific infrastructure needs.
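As a rough illustration of why that programmability matters, the sketch below plugs a customer-supplied congestion policy (classic AIMD, additive-increase/multiplicative-decrease) into a generic send pipeline. The hook names (`on_ack`, `on_congestion`, `NicPipeline`) are hypothetical and do not correspond to the Pollara P4 interface; the point is only that the policy is swappable, the way a cloud provider might substitute its own algorithm.

```python
class AimdPolicy:
    """Additive-increase / multiplicative-decrease window control."""
    def __init__(self, window=1.0, increase=1.0, decrease=0.5):
        self.window = window
        self.increase = increase
        self.decrease = decrease

    def on_ack(self):
        self.window += self.increase     # probe for more bandwidth

    def on_congestion(self):
        self.window = max(1.0, self.window * self.decrease)  # back off

class NicPipeline:
    """Generic send pipeline that defers congestion decisions to a policy."""
    def __init__(self, policy):
        self.policy = policy

    def feedback(self, congested):
        if congested:
            self.policy.on_congestion()
        else:
            self.policy.on_ack()
        return self.policy.window

nic = NicPipeline(AimdPolicy())
for congested in [False, False, False, True]:
    window = nic.feedback(congested)
print(window)   # 4.0 halved to 2.0 after the congestion signal
```

Swapping `AimdPolicy` for a provider-specific class changes the NIC's behavior without touching the pipeline, which is the essence of what a fully programmable data plane offers.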
AMD announced that Oracle Cloud Infrastructure (OCI) is among the first hyperscale cloud service providers to deploy this NIC, and will simultaneously adopt the AMD Instinct MI350X series GPUs. Besides Oracle, other enterprises planning large-scale Instinct GPU deployments are expected to follow suit rapidly, promoting the widespread adoption of the Ultra Ethernet hardware ecosystem. Deliveries of this NIC to interested customers have already begun.
According to the plan, this hardware will be extensively deployed at OCI starting in the second half of this year. Oracle intends to use it to build a zettascale AI cluster comprising 131,072 Instinct MI355X GPUs, supporting customers in large-scale AI training and inference.