
Panmnesia Pushes CXL + NVLink/UALink for AI Superclusters


South Korea–based Panmnesia, a company focused on CXL (Compute Express Link) memory technologies, is advocating a unified memory and interconnect design for the next generation of AI superclusters. The company argues that future AI workloads need both memory sharing across GPU nodes and fast inter-GPU networking, which it proposes to achieve by combining CXL with accelerator interconnects such as UALink and NVLink.


A Technical Report on the Future of AI Infrastructure

Panmnesia’s CEO, Dr. Myoungsoo Jung, has released a 56-page technical report titled “Compute Can’t Handle the Truth: Why Communication Tax Prioritizes Memory and Interconnects in Modern AI Infrastructure.”

The report highlights:

  • The growth of AI models and why current compute-centric infrastructures struggle to scale.
  • The limitations of rigid GPU-centric architectures, such as communication overhead and low utilization.
  • The role of emerging interconnect and memory technologies—CXL, NVLink, UALink, and HBM—in solving these challenges.
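
As a rough illustration of the communication overhead mentioned above, the sketch below estimates how much of a training or inference step is spent computing when communication is not overlapped with compute. The function name, the no-overlap assumption, and the timings are illustrative choices made here, not figures or formulas from the report.

```python
def utilization(compute_s: float, comm_s: float) -> float:
    """Fraction of a step spent on compute, assuming communication
    is fully serialized with compute (no overlap). Illustrative only."""
    total = compute_s + comm_s
    if total <= 0:
        raise ValueError("step time must be positive")
    return compute_s / total


# Hypothetical timings: 3 s of compute and 1 s of data movement per step.
print(utilization(3.0, 1.0))  # 0.75
```

Under this simple model, any time spent moving data directly lowers accelerator utilization, which is why the report treats interconnects and memory as first-order design concerns.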

Jung notes:

“No single fixed architecture can fully satisfy all the compute, memory, and networking demands for large-scale AI. The best solution is to integrate CXL with accelerator-focused interconnects like NVLink and UALink.”


Breaking Down the Report: Three Key Parts

  1. Trends in AI and Data Center Architectures
    Explains how workloads like chatbots, image generation, and video processing rely on sequence models, which have evolved from RNNs to LLMs, and why current infrastructures bottleneck their performance.

  2. CXL Composable Architectures
    Shows how CXL 3.0, with multi-level switching, advanced routing, and memory coherence, can reshape data center memory architectures. Panmnesia has already developed a CXL 3.0–compliant prototype and tested it on retrieval-augmented generation (RAG) and deep learning recommendation models (DLRMs).

  3. Beyond CXL: Hybrid Link Architectures
    Introduces “CXL over XLink,” in which CXL handles memory pooling and coherence, while XLink (an umbrella term for UALink and NVLink) delivers low-latency accelerator-to-accelerator communication.


Why Combine CXL and XLink?

  • CXL → Expands memory capacity, provides system-wide coherence, and enables disaggregated, composable memory pools.
  • XLink (UALink + NVLink) → Optimized for direct GPU-to-GPU transfers with ultra-low latency.
  • Unified Approach (CXL over XLink) → Bridges the gap, creating:
    • Accelerator-centric clusters for rapid intra-cluster GPU communication.
    • Tiered memory architectures with local high-performance memory and scalable pooled memory.
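
The division of labor described above can be sketched as a toy dispatch rule. This is a minimal illustration under stated assumptions, not Panmnesia's design: the `Fabric` enum, the `Transfer` type, the `pick_fabric` function, and the endpoint naming scheme are all hypothetical names invented here, and the rule that cross-cluster traffic uses CXL is an assumption made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Fabric(Enum):
    XLINK = "xlink"  # stands in for UALink/NVLink accelerator links
    CXL = "cxl"      # stands in for CXL memory pooling and coherence


@dataclass(frozen=True)
class Transfer:
    src: str             # hypothetical endpoint name, e.g. "gpu0"
    dst: str             # hypothetical endpoint name, e.g. "gpu1" or "pool0"
    same_cluster: bool   # True if both endpoints sit in one accelerator cluster


def pick_fabric(t: Transfer) -> Fabric:
    """Toy rule: GPU-to-GPU traffic inside a cluster goes over XLink;
    pooled-memory access and cross-cluster traffic go over CXL."""
    gpu_to_gpu = t.src.startswith("gpu") and t.dst.startswith("gpu")
    if gpu_to_gpu and t.same_cluster:
        return Fabric.XLINK
    return Fabric.CXL


print(pick_fabric(Transfer("gpu0", "gpu1", same_cluster=True)))   # Fabric.XLINK
print(pick_fabric(Transfer("gpu0", "pool0", same_cluster=True)))  # Fabric.CXL
```

A real system would make this decision in hardware and switch firmware rather than in application code; the sketch only shows which kind of traffic each link type is meant to carry.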

Panmnesia Technical Report diagram


Toward Scalable AI Superclusters

Dr. Jung envisions a scalable tiered memory hierarchy for AI superclusters:

  • Tier 1: High-performance local memory managed by XLink + coherence-centric CXL.
  • Tier 2: Composable memory pools enabled by CXL for large-scale data handling.

This hybrid model is designed to meet the distinct performance needs of LLMs, inference, RAG, and recommendation systems, and Panmnesia presents it as a foundation for future AI infrastructure.
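
The two-tier hierarchy can be sketched as a toy placement policy: hot data goes to local memory when it fits, and everything else spills to the pooled tier. The `Tier` class, the `place` function, the hot/cold flag, and the capacities below are hypothetical and chosen only for illustration; the report does not prescribe this policy.

```python
from dataclasses import dataclass


@dataclass
class Tier:
    name: str
    capacity_gb: float
    used_gb: float = 0.0

    def free_gb(self) -> float:
        return self.capacity_gb - self.used_gb


def place(size_gb: float, hot: bool, local: Tier, pool: Tier) -> str:
    """Place hot data in the local tier if it fits; otherwise use the pool.
    Returns the name of the tier that received the allocation."""
    if hot and local.free_gb() >= size_gb:
        local.used_gb += size_gb
        return local.name
    if pool.free_gb() >= size_gb:
        pool.used_gb += size_gb
        return pool.name
    raise MemoryError(f"no tier can hold {size_gb} GB")


# Made-up capacities: 80 GB of local memory, 1024 GB of pooled memory.
local = Tier("tier1-local", capacity_gb=80)
pool = Tier("tier2-pool", capacity_gb=1024)

print(place(60, hot=True, local=local, pool=pool))   # tier1-local
print(place(40, hot=True, local=local, pool=pool))   # tier2-pool (only 20 GB left locally)
print(place(10, hot=False, local=local, pool=pool))  # tier2-pool
```

The point of the sketch is the fallback order: capacity pressure on the fast tier is absorbed by the composable pool instead of failing the allocation.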


Final Thoughts

Panmnesia’s work underscores a growing industry realization: AI progress depends as much on memory and interconnects as it does on compute.

By unifying CXL’s composability with NVLink/UALink’s accelerator speed, Panmnesia is proposing a new blueprint for AI data centers that could define the next generation of superclusters powering large-scale AI applications.

