CXL Technology Set to Transform AI Storage Infrastructure with Major Players Driving Adoption

Stock News
Mar 18

According to industry analysis, CXL memory pooling solutions can significantly enhance storage system efficiency and may reshape the hardware composition of memory in future AI computing infrastructure. The hardware and software surrounding CXL memory pooling are gradually maturing, with leading manufacturers accelerating deployment and continuing to promote memory pooling solutions that optimize AI inference performance. As demand for AI inference grows, CXL memory pooling solutions are expected to expand their market presence, offering substantial benefits across the industry chain.

CXL solutions can optimize storage system efficiency, potentially leading to a restructuring of future AI storage architectures. Some investors may still underestimate the need for memory capacity expansion and storage architecture optimization driven by AI inference. The CXL (Compute Express Link) memory pooling solution supports unified addressing, scheduling, and transparent access to memory resources across CPUs, GPUs, and other computing accelerators. This enables the consolidation and unified management of memory resources, supporting larger-scale, higher-concurrency training and inference tasks for large models.
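The idea of consolidating memory into a shared, centrally managed pool can be sketched with a toy allocator. This is purely illustrative of the pooling concept described above; it is not a real CXL API, and the host names and capacities are invented for the example.

```python
# Toy model of CXL-style memory pooling (illustrative only; not a real CXL API).
# A shared pool grants capacity to hosts on demand and reclaims it on release,
# in contrast to static per-server DIMM capacity that cannot be shared.

class MemoryPool:
    def __init__(self, capacity_gib: int):
        self.capacity = capacity_gib
        self.allocated = {}  # host -> GiB currently granted

    def free_gib(self) -> int:
        return self.capacity - sum(self.allocated.values())

    def grant(self, host: str, gib: int) -> bool:
        """Grant `gib` GiB from the shared pool to `host` if available."""
        if gib > self.free_gib():
            return False  # pool exhausted; caller must wait or spill elsewhere
        self.allocated[host] = self.allocated.get(host, 0) + gib
        return True

    def release(self, host: str, gib: int) -> None:
        """Return capacity to the pool so other hosts can use it."""
        self.allocated[host] = max(0, self.allocated.get(host, 0) - gib)

pool = MemoryPool(capacity_gib=1024)
assert pool.grant("gpu-node-A", 512)      # training job borrows half the pool
assert pool.grant("gpu-node-B", 400)      # inference job takes most of the rest
assert not pool.grant("gpu-node-C", 200)  # only 112 GiB free: request denied
pool.release("gpu-node-A", 512)           # job ends; capacity returns to pool
assert pool.grant("gpu-node-C", 200)      # the same request now succeeds
```

The contrast with static allocation is the point: once node A releases its share, node C can immediately reuse it, which fixed per-server DIMMs cannot offer.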

From a capacity perspective, AI inference demands for data storage—such as context caching and model weights—continue to increase. However, current server memory upgrades are constrained by the number of available slots and the capacity limits of individual memory modules. Additionally, existing storage architectures face inefficiencies in scheduling. Frequent migration of model parameters and activation values between HBM, DRAM, and SSD—coupled with significant bandwidth disparities and a lack of underlying direct-connect protocols with unified memory semantics—can lead to increased latency, wasted link bandwidth, and reduced throughput. Different tasks also place varying demands on computing and memory resources, and static resource allocation methods may result in either computational waste or memory bottlenecks.
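A back-of-envelope calculation shows why context caching alone can outgrow per-server DIMM capacity. The model parameters below (80 layers, 8 grouped-query KV heads, head dimension 128, FP16) are illustrative assumptions for a hypothetical 70B-class model, not a specific product's specifications.

```python
# Back-of-envelope KV-cache sizing for a hypothetical 70B-class model
# (80 layers, 8 KV heads under grouped-query attention, head_dim 128, FP16).
# All parameters are illustrative assumptions.

def kv_cache_gib(layers, kv_heads, head_dim, seq_len, dtype_bytes=2):
    # 2x accounts for the separate key and value tensors per layer
    per_token_bytes = 2 * layers * kv_heads * head_dim * dtype_bytes
    return per_token_bytes * seq_len / 2**30

per_session = kv_cache_gib(layers=80, kv_heads=8, head_dim=128, seq_len=128 * 1024)
print(per_session)       # 40.0 GiB for a single 128k-token context

# 64 concurrent long-context sessions would need ~2.5 TiB of cache, far beyond
# what a typical server's DIMM slots hold, motivating pooled expansion.
print(per_session * 64)  # 2560.0 GiB
```

Under these assumptions, a few dozen concurrent long-context sessions already exceed the DRAM ceiling of a conventional two-socket server.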

To address these challenges, CXL memory pooling solutions are expected to expand memory capacity for AI computing facilities and provide more flexible resource allocation, thereby enhancing AI model training and inference capabilities. At the same time, CXL technology is anticipated to significantly reduce the total cost of ownership (TCO) for data center systems by optimizing memory configurations.

Some investors may view memory pooling solutions as still early in their maturity. However, the hardware and software surrounding CXL memory pooling are steadily improving, and major manufacturers are accelerating their deployment plans. On the interconnect-protocol front, the CXL 4.0 specification is set for release in November 2025, offering a data rate of 128 GT/s, double that of CXL 3.0. NVIDIA is expected to continue advancing the CXL ecosystem, having acquired the core team and technology licenses of Enfabrica in September 2025; according to NVIDIA's official website, its Vera CPU supports the CXL protocol.

Chinese server manufacturers have already introduced CXL memory pooling solutions. At its 2025 Yunqi Conference, Alibaba Cloud announced the launch of the world's first PolarDB database-dedicated server based on CXL 2.0 switch technology. In December 2025, Inspur Information introduced a CXL memory expansion solution for its Yuan Nao servers: built on the Yuan Nao NF5280G7, it pairs a CXL memory expansion card with 24 local DRAM modules. As the related hardware and software mature and leading manufacturers push adoption, the penetration rate of CXL technology is expected to rise rapidly. TechInsights projects that CXL's share of the server DRAM market will grow from nearly zero in 2024 to approximately 15% by 2030. CXL support is likely to become a standard server feature, accelerating the maturation of the industrial ecosystem.

Major manufacturers continue to innovate in memory pooling solutions to better meet AI inference demands. In March 2026, Inspur's Yuan Nao server operating system KOS introduced MantaKV, a "storage-transmission integrated" KVCache management system built on CXL memory pooling. The system centrally stores the massive KVCache data generated by prefill (P) nodes in CXL pooled shared memory, where decode (D) nodes can consume it directly without retransmission. The pool also serves as a globally accessible persistent cache, eliminating the need to offload data to local SSDs on P-nodes. By merging two separate data transfers into a single write, the solution reduces transmission redundancy and improves model inference efficiency.
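The savings from merging two transfers into one write can be sketched with simple accounting. The step breakdown below is an illustrative reading of the article's description of the two hand-off paths, not a measurement of MantaKV itself, and the 40 GiB payload is an assumed figure.

```python
# Toy accounting of data movement when handing a KVCache from a prefill (P)
# node to a decode (D) node. Illustrative only; not based on MantaKV internals.

def data_moved_gib(cache_gib: float, use_pool: bool) -> float:
    if use_pool:
        # One write into CXL pooled shared memory; the D-node then reads it
        # in place with memory semantics, so the payload moves once.
        return cache_gib
    # Offload path: write to the P-node's local SSD, then retransmit the same
    # payload over the network to the D-node -- two movements of the data.
    return cache_gib * 2

cache = 40.0  # GiB of KVCache for one long-context request (assumed)
print(data_moved_gib(cache, use_pool=False))  # 80.0 GiB moved in total
print(data_moved_gib(cache, use_pool=True))   # 40.0 GiB moved in total
```

Halving the bytes moved per hand-off is where the claimed reduction in transmission redundancy comes from under this reading.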

Also in March 2026, Peking University, in collaboration with Alibaba Cloud, proposed using CXL memory pools to store Engram. Integrating a CXL-based Engram memory pool into the SGLang framework achieved end-to-end performance close to that of local DRAM, offering a scalable and cost-effective storage solution for large language models that incorporate Engram. As major manufacturers keep optimizing AI inference performance through memory pooling, growing inference demand is expected to create further opportunities for CXL memory pooling, bringing broad benefits to the industry chain.

Risks include potential delays in AI implementation, slower-than-expected technological iteration, and setbacks in domestic adoption progress.

Disclaimer: Investing carries risk. This is not financial advice. The above content should not be regarded as an offer, recommendation, or solicitation to acquire or dispose of any financial products, and any associated discussions, comments, or posts by the author or other users should not be considered as such either. It is provided for general information purposes only and does not take into account your investment objectives, financial situation, or needs. TTM assumes no responsibility or warranty for the accuracy and completeness of the information; investors should do their own research and may seek professional advice before investing.
