AMD is reportedly advancing multi-chip module (MCM) design, commonly referred to as chiplet technology, for consumer graphics processors. The approach may debut in gaming products with the upcoming UDNA architecture. Chiplets, which combine multiple smaller dies in a single package, have previously been used in high-performance computing; AMD now plans to extend the approach to gaming GPUs to overcome the manufacturing and performance bottlenecks of monolithic (single-chip) designs.
Chiplet Technology in Graphics Processing #
The concept of MCM is not new to the graphics industry, but as chips grow larger and process nodes shrink, the yield and cost challenges of monolithic GPUs have intensified. AMD has extensive experience in this area, having used multi-chip designs in its Instinct series accelerators: the Instinct MI200, for example, combined multiple graphics compute dies with high-bandwidth memory (HBM) on a single package for efficient data transfer.
The subsequent Instinct MI350 series refined this structure further, featuring 288GB of HBM3E memory and 8TB/s of memory bandwidth. Built on a 3nm process node, the MI350 packs 185 billion transistors and combines ten chiplets using hybrid bonding to boost AI processing, providing a technological foundation for consumer products.
Addressing Challenges in Gaming GPUs #
Implementing multi-chip designs in gaming GPUs presents one significant challenge: increased latency. Frame rendering is highly sensitive to data-transfer times, and long hops between chips can degrade performance. AMD has addressed this with a patented approach to optimizing inter-chip communication.
The patent describes a data-fabric circuit equipped with an intelligent switch that connects the compute dies to the memory controllers. The switch resembles AMD’s Infinity Fabric interconnect but is tailored for consumer GPUs, working with GDDR memory rather than HBM.
The switch minimizes latency by evaluating requests during graphics processing to decide whether a task should migrate or data should be replicated, making those decisions within nanoseconds.
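As a rough illustration of that decision path, here is a minimal C++ sketch of how such a switch might weigh serving a request across the package against migrating the task or replicating the data. Every name, field, and threshold below is hypothetical; the patent describes hardware logic, not driver code.

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical model of the patent's intelligent switch. All names and
// thresholds are illustrative, not taken from AMD hardware or drivers.
struct MemoryRequest {
    uint32_t requesting_die;  // compute die that issued the request
    uint32_t home_die;        // die whose memory controller owns the data
    uint32_t reuse_count;     // how often this address range has been touched
    bool     write_heavy;     // frequent writes make replication costly (coherence)
};

enum class Decision { ServeRemotely, MigrateTask, ReplicateData };

// Per request: is the extra cross-die hop cheaper than moving the work or the data?
Decision route(const MemoryRequest& req) {
    if (req.requesting_die == req.home_die) return Decision::ServeRemotely;  // already local
    if (req.write_heavy)                    return Decision::MigrateTask;    // move the work to the data
    if (req.reuse_count > 8)                return Decision::ReplicateData;  // copy hot data into shared cache
    return Decision::ServeRemotely;                                          // one-off access: accept the hop
}

int main() {
    MemoryRequest req{0, 2, 12, false};  // hot, read-mostly data owned by a remote die
    std::printf("decision = %d\n", static_cast<int>(route(req)));  // 2 = ReplicateData
}
```

In real hardware the equivalent evaluation would happen per request inside the switch itself, which is what makes nanosecond-scale decisions plausible.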
Architectural Improvements and Integration #
Specifically, the design gives each graphics compute die its own first- and second-level caches, much like AMD’s AI accelerators. The switch connects all of the compute dies and grants them access to a shared third-level cache or stacked static random-access memory (SRAM). This reduces reliance on global memory and provides a shared scratchpad between chips, similar in spirit to 3D V-Cache but aimed at graphics processing.
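To make the hierarchy concrete, the sketch below (purely illustrative, with hypothetical structures) walks the lookup order this arrangement implies: a die’s private L1 and L2 first, then the shared third level reached through the switch, and only then off-package GDDR.

```cpp
#include <cstdint>
#include <iostream>
#include <unordered_map>

// Illustrative-only model of the hierarchy described above: per-die L1/L2 plus
// a shared L3 / stacked-SRAM level reachable through the switch. These
// structures are hypothetical simplifications, not AMD interfaces.
using Cache = std::unordered_map<uint64_t, uint64_t>;  // address -> value

struct ComputeDie { Cache l1, l2; };                    // private to one die

uint64_t load(ComputeDie& die, Cache& shared_l3, uint64_t addr) {
    if (auto it = die.l1.find(addr); it != die.l1.end()) return it->second;   // fastest path
    if (auto it = die.l2.find(addr); it != die.l2.end()) return it->second;
    if (auto it = shared_l3.find(addr); it != shared_l3.end()) {
        die.l1[addr] = it->second;        // promote hot data toward the requesting die
        return it->second;                // cross-die hit, served via the switch
    }
    uint64_t value = addr * 2;            // stand-in for a slow GDDR fetch
    shared_l3[addr] = value;              // fill the shared level on the way back
    return value;
}

int main() {
    ComputeDie die0;
    Cache shared_l3;
    std::cout << load(die0, shared_l3, 0x40) << '\n';   // misses everywhere, fills shared L3
    std::cout << load(die0, shared_l3, 0x40) << '\n';   // now hits locally
}
```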
The patent also covers stacking dynamic random-access memory (DRAM), deepening the MCM integration further. For packaging, AMD can draw on TSMC’s InFO-RDL bridges and a dedicated version of Infinity Fabric, yielding a more compact overall structure.
UDNA Architecture and Market Impact #
This patent indicates AMD’s readiness for a multi-chip GPU ecosystem. The company is integrating software, including drivers and compilers, to support unified processing of gaming and AI tasks. The UDNA architecture is central to this integration, merging the RDNA gaming architecture with the CDNA compute architecture to provide a unified graphics processing platform.
According to current reports, UDNA is expected to deliver roughly a 20% improvement in rasterization performance over RDNA 4, double the ray-tracing capability, and stronger AI features such as image upscaling and frame generation. These improvements are expected to reach upcoming game consoles, such as the next Xbox and the PlayStation 6, as well as PC graphics cards.
Lessons from RDNA 3 #
AMD’s RDNA 3 architecture, particularly the Navi 31 GPU, already incorporates multi-chip elements. Navi 31 pairs a single graphics compute die with six memory controller dies, offering 96MB of Infinity Cache, a 384-bit memory bus, and support for up to 24GB of GDDR6 memory. Its Infinity Fabric links between the dies reach a peak bandwidth of 5.2TB/s.
Implemented in the RX 7900 series, which delivered roughly a 50% improvement in performance per watt over its predecessor, this design also exposed inter-chip latency issues. The intelligent switch described in the new patent, combined with additional shared cache, is aimed squarely at smoothing that data flow.
Scalability and Future Outlook #
A key advantage of multi-chip design is scalability. Traditional monolithic GPUs are limited by die size; top-tier products on the scale of Navi 31’s roughly 57.7 billion transistors are difficult to manufacture as a single die. MCM spreads the compute units across smaller dies, improving yield and reducing cost. A large compute die could, for example, be split into three independent modules, each focused on a specific task such as rendering or ray tracing, which is easier to optimize. The patent indicates that this configuration can allocate workloads dynamically so that games still perceive the setup as a single GPU, as sketched below.
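That “looks like one GPU” behavior can be pictured as a thin dispatch layer in front of specialised dies. The sketch below is a hypothetical illustration of the routing idea only; AMD’s actual scheduling lives in hardware and the driver.

```cpp
#include <cstdio>
#include <vector>

// Hypothetical illustration: the application submits to one logical GPU while
// work fans out to task-specialised chiplets. Names and structures are invented.
enum class TaskKind { Raster, RayTracing, Compute };

struct Chiplet { const char* name; TaskKind speciality; };
struct Task    { TaskKind kind; int id; };

// The application only ever talks to this "single GPU" front end.
void submit(const std::vector<Chiplet>& dies, const Task& t) {
    for (const auto& die : dies) {
        if (die.speciality == t.kind) {                     // route to the matching die
            std::printf("task %d -> %s\n", t.id, die.name);
            return;
        }
    }
    std::printf("task %d -> %s (fallback)\n", t.id, dies.front().name);
}

int main() {
    std::vector<Chiplet> dies = {
        {"die0-raster",  TaskKind::Raster},
        {"die1-rt",      TaskKind::RayTracing},
        {"die2-compute", TaskKind::Compute},
    };
    submit(dies, {TaskKind::RayTracing, 1});
    submit(dies, {TaskKind::Raster, 2});
}
```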
In practice, multi-chip GPUs demand careful power management and heat distribution. The top MI350-series accelerators draw around 1400W and rely on liquid cooling for stability; gaming products would more likely target 400–600W to balance performance and efficiency. Software support is just as critical: AMD’s ROCm platform has been expanding toward consumer hardware, letting developers write cross-platform applications against a unified architecture.
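As a small, concrete example of that cross-platform angle, the HIP kernel below (HIP is the C++ GPU dialect used by ROCm) builds with hipcc for AMD GPUs, and the same source can also target CUDA hardware. It is a minimal sketch with error handling omitted, and it does not reflect any UDNA-specific API.

```cpp
#include <hip/hip_runtime.h>
#include <cstdio>
#include <vector>

// Minimal HIP kernel: multiply every element of a buffer by a factor.
__global__ void scale(float* data, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    const int n = 1024;
    std::vector<float> host(n, 1.0f);

    float* dev = nullptr;
    hipMalloc(&dev, n * sizeof(float));                                  // allocate device memory
    hipMemcpy(dev, host.data(), n * sizeof(float), hipMemcpyHostToDevice);

    scale<<<dim3(n / 256), dim3(256)>>>(dev, 2.0f, n);                   // 4 blocks of 256 threads

    hipMemcpy(host.data(), dev, n * sizeof(float), hipMemcpyDeviceToHost);
    hipFree(dev);
    std::printf("host[0] = %f\n", host[0]);                              // prints 2.000000
}
```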
Looking ahead, the launch of the UDNA architecture marks a strategic shift for AMD in graphics technology. The company plans to build its top-tier products on a 3nm-class node, such as flagship GPUs on the N3E process, which offers higher transistor density. Its multi-chip approach may also provide an edge in AI-enhanced graphics, for instance optimizing texture compression or anti-aliasing through machine learning.
The shift toward multi-chip GPU design is moving from experimental to practical. AMD’s patented innovations tackle the key obstacles and pave the way for consumer products, applicable not only to desktop graphics cards but also to laptops and embedded systems, pushing the whole industry toward a modular transformation. As UDNA is gradually revealed, the performance ceiling of gaming GPUs should keep rising, delivering a more efficient computing experience.