Against the backdrop of rapid development in large AI models, GPU memory capacity has become a critical bottleneck constraining training and inference efficiency. Whether it is a single card running out of VRAM, or the one-model-per-card deployment pattern in light-load scenarios, both point to the market's urgent demand for efficient, low-cost expansion of VRAM resources.
Recently, FOURTH PARADIGM (06682) officially launched "Virtual VRAM", a plug-in virtual graphics memory expansion card. The product turns physical system memory into a dynamically schedulable VRAM buffer pool, enabling elastic expansion of GPU memory resources. FOURTH PARADIGM founder Dai Wenyuan and co-founder and Chief Scientist Chen Yuqiang attended the launch event.
Traditional GPU VRAM is fixed in capacity and expensive to expand, limiting the scale of AI models and the ability to run tasks concurrently. Users can typically relieve the pressure only by purchasing higher-end graphics cards or running multiple cards in parallel, which sharply drives up investment costs.
FOURTH PARADIGM's "Virtual VRAM" innovatively builds a high-speed data channel between VRAM and system memory, virtualizing system memory for use as VRAM. It is akin to adding a flexibly schedulable "storeroom" behind a fixed-size "prep counter", breaking through VRAM capacity limits without significant changes to the hardware architecture.
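FOURTH PARADIGM has not disclosed implementation details, but the general principle resembles GPU memory oversubscription as exposed by CUDA unified memory, where an allocation may exceed physical VRAM and the driver migrates pages between the card and system RAM on demand. The minimal CUDA sketch below is an illustrative assumption, not the product's code, and requires a Pascal-or-newer GPU on Linux; it allocates 1.5 times the card's physical VRAM and touches every element from a kernel:

```cuda
// oversub_demo.cu — minimal sketch of the general principle (assumed,
// not FOURTH PARADIGM's implementation): CUDA unified memory lets one
// allocation exceed physical VRAM by paging to system RAM on demand.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void touch(float *buf, size_t n) {
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n) buf[i] = 1.0f;  // the faulting access triggers page migration
}

int main() {
    size_t free_b, total_b;
    cudaMemGetInfo(&free_b, &total_b);
    // Ask for 1.5x the card's physical VRAM; the driver backs the
    // excess with host memory and migrates pages as the kernel faults.
    size_t n = (size_t)(total_b * 1.5) / sizeof(float);
    float *buf;
    if (cudaMallocManaged(&buf, n * sizeof(float)) != cudaSuccess) return 1;
    touch<<<(unsigned)((n + 255) / 256), 256>>>(buf, n);
    cudaDeviceSynchronize();
    printf("touched %zu MB on a card with %zu MB of VRAM\n",
           (n * sizeof(float)) >> 20, total_b >> 20);
    cudaFree(buf);
    return 0;
}
```

Performance in such schemes depends on access locality: pages resident in VRAM are served at full speed, while faulting pages pay migration cost over the interconnect, which is presumably why the vendor emphasizes its "high-speed data channel".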
According to Chen Yuqiang, with the expansion card installed, the virtual VRAM capacity of a single graphics card can be extended to as much as 256GB. Take the NVIDIA H20 as an example: its native VRAM is 96GB, and after expansion it roughly matches the combined physical VRAM of ten NVIDIA RTX 4090s (24GB each) or six 40GB NVIDIA A100s. Users can run larger-scale AI training and inference tasks at performance close to that of a native large-VRAM card, without replacing hardware.
The product targets two main application scenarios. First, when a large model running on a single card exceeds available VRAM, users can borrow system memory to complete the task, avoiding the purchase of additional graphics cards and significantly cutting costs. Second, in light-load scenarios, multiple models can be deployed on the same GPU with dynamic resource scheduling, addressing the resource idling caused by "one model per card" (see the sketch below).
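For the second scenario, the idea can again be sketched with unified-memory hints. The snippet below is a hypothetical illustration, not the product's actual scheduler: the model currently serving traffic is prefetched into VRAM, while the idle model's pages are advised to remain in system RAM:

```cuda
// colocate_demo.cu — hypothetical sketch of co-locating two models on one
// GPU with unified memory (assumed illustration, not the product's API).
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int dev = 0;
    cudaSetDevice(dev);
    size_t sz = 4UL << 30;  // pretend each "model" holds 4 GB of weights
    float *model_a, *model_b;
    cudaMallocManaged(&model_a, sz);
    cudaMallocManaged(&model_b, sz);

    // Serve requests for model A: pull its weights into VRAM,
    // park model B's weights in system RAM until it gets traffic.
    cudaMemAdvise(model_b, sz, cudaMemAdviseSetPreferredLocation,
                  cudaCpuDeviceId);
    cudaMemPrefetchAsync(model_a, sz, dev, 0);
    cudaDeviceSynchronize();
    // ... launch model A's kernels here; swap roles when B becomes active.
    printf("model A resident in VRAM, model B parked in host RAM\n");
    cudaFree(model_a);
    cudaFree(model_b);
    return 0;
}
```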
In addition, "Virtual VRAM" offers high compatibility and easy deployment. It supports physical machines, Docker containers, and Kubernetes cloud-native environments, and works plug-and-play: users need not modify existing code or recompile, significantly reducing deployment complexity and secondary development costs.
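The article does not explain how this transparency is achieved. One well-known way to retrofit behavior on Linux without recompilation is an LD_PRELOAD shim that interposes a library call; the toy shim below is purely an assumed illustration, not FOURTH PARADIGM's mechanism. It redirects cudaMalloc to cudaMallocManaged so a dynamically linked CUDA application gains spill-to-RAM allocations unmodified:

```c
/* vram_shim.c — hypothetical LD_PRELOAD shim illustrating transparent
 * interception (NOT FOURTH PARADIGM's actual implementation). Redirects
 * cudaMalloc to cudaMallocManaged so allocations can spill from VRAM
 * into system RAM via CUDA unified memory. */
#include <stdio.h>
#include <cuda_runtime_api.h>

cudaError_t cudaMalloc(void **devPtr, size_t size) {
    /* Managed memory can be oversubscribed: the driver pages between
     * device VRAM and host RAM on demand (Pascal or newer, Linux). */
    cudaError_t err = cudaMallocManaged(devPtr, size, cudaMemAttachGlobal);
    fprintf(stderr, "[shim] cudaMalloc(%zu) -> managed, err=%d\n", size, err);
    return err;
}
```

A hypothetical usage would be to build it with gcc -shared -fPIC vram_shim.c -o libvramshim.so -I/usr/local/cuda/include -L/usr/local/cuda/lib64 -lcudart, then run LD_PRELOAD=./libvramshim.so ./app. Note that this only intercepts a dynamically linked CUDA runtime; frameworks such as PyTorch allocate through their own caching allocators and the lower-level driver API, so a production implementation would need to hook at that layer.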
Analysis suggests that as the number and parameter scale of AI models continue to grow rapidly, VRAM capacity has become a key constraint on enterprise AI capability building and cost control. FOURTH PARADIGM's newly released product is expected to give enterprises a more cost-effective way to expand compute, helping users further cut costs and improve efficiency while maintaining high performance.
Going forward, FOURTH PARADIGM plans to collaborate with more memory manufacturers to continue optimizing and popularizing AI infrastructure.