Panmnesia Pushes CXL + NVLink/UALink for AI Superclusters

South Korea–based Panmnesia, a company focused on CXL (Compute Express Link) memory technologies, is pushing for a unified memory and interconnect design to power the next generation of AI superclusters. The company argues that future AI workloads need both memory sharing across GPU nodes and fast inter-GPU networking, and that combining CXL with UALink/NVLink architectures can deliver both.

A Technical Report on the Future of AI Infrastructure

Panmnesia’s CEO, Dr. Myoungsoo Jung, has released a 56-page technical report titled “Compute Can’t Handle the Truth: Why Communication Tax Prioritizes Memory and Interconnects in Modern AI Infrastructure.”

The report highlights:

- The growth of AI models and why current compute-centric infrastructures struggle to scale.
- The limitations of rigid GPU-centric architectures, such as communication overhead and low utilization.
- The role of emerging interconnect and memory technologies (CXL, NVLink, UALink, and HBM) in solving these challenges.

Jung notes:

“No single fixed architecture can fully satisfy all the compute, memory, and networking demands for large-scale AI. The best solution is to integrate CXL with accelerator-focused interconnects like NVLink and UALink.”

Breaking Down the Report: Three Key Parts

- Trends in AI and Data Center Architectures: Explains how workloads like chatbots, image generation, and video processing rely on sequence models (RNNs → LLMs) and why current infrastructures bottleneck performance.
- CXL Composable Architectures: Shows how CXL 3.0, with multi-level switching, advanced routing, and memory coherence, can reshape data center memory architectures. Panmnesia has already developed a CXL 3.0–compliant prototype, tested on RAG and deep learning recommendation models (DLRMs).
- Beyond CXL: Hybrid Link Architectures: Introduces “CXL over XLink”, where CXL handles memory pooling and coherence, while XLink (UALink + NVLink) delivers low-latency accelerator-to-accelerator communication.

Why Combine CXL and XLink?

- CXL → Expands memory capacity, provides system-wide coherence, and enables disaggregated, composable memory pools.
- XLink (UALink + NVLink) → Optimized for direct GPU-to-GPU transfers with ultra-low latency.
- Unified Approach (CXL over XLink) → Bridges the gap, creating:
  - Accelerator-centric clusters for rapid intra-cluster GPU communication.
  - Tiered memory architectures with local high-performance memory and scalable pooled memory.
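
The division of labor described above can be pictured as a simple dispatch rule. The following is only a minimal sketch of that idea, not anything taken from Panmnesia's report: the `Endpoint`, `Link`, `Transfer`, and `choose_link` names are hypothetical, and sending cross-cluster accelerator traffic over CXL is an assumption made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Endpoint(Enum):
    ACCELERATOR = "accelerator"
    MEMORY_POOL = "memory_pool"


class Link(Enum):
    XLINK = "xlink"  # stands in for UALink/NVLink (hypothetical label)
    CXL = "cxl"


@dataclass(frozen=True)
class Transfer:
    src: Endpoint
    dst: Endpoint
    same_cluster: bool  # True if both endpoints sit in one accelerator cluster


def choose_link(t: Transfer) -> Link:
    """Pick a link class for a transfer (illustrative policy only)."""
    accel_to_accel = (
        t.src is Endpoint.ACCELERATOR and t.dst is Endpoint.ACCELERATOR
    )
    # Direct GPU-to-GPU traffic inside a cluster uses the accelerator link.
    if accel_to_accel and t.same_cluster:
        return Link.XLINK
    # Assumption: pooled-memory access and cross-cluster traffic go over CXL.
    return Link.CXL
```

For example, `choose_link(Transfer(Endpoint.ACCELERATOR, Endpoint.MEMORY_POOL, True))` returns `Link.CXL`, while an accelerator-to-accelerator transfer within one cluster returns `Link.XLINK`.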

Toward Scalable AI Superclusters

Dr. Jung envisions a scalable tiered memory hierarchy for AI superclusters:

- Tier 1: High-performance local memory managed by XLink + coherence-centric CXL.
- Tier 2: Composable memory pools enabled by CXL for large-scale data handling.
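
To make the two tiers concrete, here is a minimal sketch of how data might be split between them. It assumes a greedy policy that keeps the most frequently accessed data per gigabyte in Tier 1 and spills the rest to the Tier 2 pool. The policy, the `Buffer` type, the `place` function, and the `accesses_per_s` metric are all hypothetical and are not described in the report.

```python
from dataclasses import dataclass


@dataclass
class Buffer:
    name: str
    size_gb: float          # must be greater than zero
    accesses_per_s: float   # hypothetical hotness metric


def place(buffers, tier1_capacity_gb):
    """Greedy placement: hottest-per-GB buffers fill Tier 1, the rest go to Tier 2."""
    placement = {}
    free_gb = tier1_capacity_gb
    # Rank by access density so small, hot buffers win local memory first.
    ranked = sorted(
        buffers, key=lambda b: b.accesses_per_s / b.size_gb, reverse=True
    )
    for b in ranked:
        if b.size_gb <= free_gb:
            placement[b.name] = "tier1_local"
            free_gb -= b.size_gb
        else:
            # Anything that does not fit locally lands in the pooled tier.
            placement[b.name] = "tier2_pooled"
    return placement
```

With 128 GB of Tier 1 capacity, a 60 GB weights buffer and a 40 GB KV cache with high access rates would stay local, while a 500 GB embedding table with a low access density would be placed in the pool. A real system would also need to handle migration as access patterns change, which this sketch ignores.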

This hybrid model is designed to meet the differing performance needs of LLMs, inference, RAG, and recommendation systems.

Final Thoughts

Panmnesia’s work underscores a growing industry realization: AI progress depends as much on memory and interconnects as it does on compute.

By unifying CXL’s composability with NVLink/UALink’s accelerator speed, Panmnesia is proposing a new blueprint for AI data centers that could define the next generation of superclusters powering large-scale AI applications.