Ming-Chi Kuo: NVIDIA Ecosystem Integration to Drive 10-Fold LPU Production Surge, Major Impact on PCB Supply Chain

Deep News
9 hours ago

NVIDIA's incorporation of Groq LPU technology into its Rubin platform is triggering a profound transformation across the supply chain. At the NVIDIA GTC conference, CEO Jensen Huang announced the NVIDIA Groq 3 LPU chip, formally integrating it into the Vera Rubin platform architecture as a core inference-acceleration component for next-generation AI data centers. Supply chain analyst Ming-Chi Kuo, best known for his Apple coverage, subsequently released a supply chain investigation report indicating that LPU shipment forecasts have been revised sharply upward following NVIDIA's investment in Groq: combined shipments for 2026 to 2027 are projected at 4 to 5 million units, roughly ten times historical annual production volumes. Kuo identifies two core drivers behind this explosive growth: first, deep integration of LPUs with the NVIDIA CUDA ecosystem significantly lowers development barriers; second, demand is expanding rapidly in ultra-low-latency inference scenarios such as AI agents, real-time consumer applications, and physical AI. He also notes that large-scale mass production of LPU/LPX racks will have a significant impact on the PCB supply chain, with WUS Printed Circuit expected to be a key beneficiary.

Huang's GTC Announcement: LPU Officially Becomes the Seventh Pillar of the Rubin Platform

During the GTC keynote, Jensen Huang detailed how NVIDIA is integrating the IP acquired in last year's purchase of Groq into the Rubin platform. The NVIDIA Groq 3 LPU, an inference-acceleration chip, becomes the seventh core building block of the Rubin platform, joining the Rubin GPU, Vera CPU, NVLink 6 scale-up switch, ConnectX-9 smart NIC, BlueField-4 data processing unit, and Spectrum-X scale-out switch. Architecturally, the Groq 3 LPU takes a distinct path from mainstream AI accelerators. While most accelerators rely on HBM for working memory, each Groq 3 LPU carries 500 MB of SRAM, the same memory type used in CPU and GPU caches. Although that capacity is far below the 288 GB of HBM4 on a Rubin GPU, its bandwidth reaches an impressive 150 TB/s, vastly exceeding the GPU's 22 TB/s of HBM bandwidth. Because AI decoding is highly sensitive to memory bandwidth, the Groq 3 offers significant advantages in inference applications, particularly for deploying cutting-edge AI models that require high batch sizes, low latency, and highly interactive outputs.
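To make the bandwidth comparison concrete, here is a back-of-the-envelope bound on autoregressive decode speed, which is dominated by how fast weights can be streamed from memory. The bandwidth figures are the article's; the model size, precision, and the simplification of ignoring KV-cache traffic are illustrative assumptions, not Groq 3 or Rubin specifications.

```python
# Rough decode-throughput ceiling for a memory-bandwidth-limited accelerator:
# generating one token requires streaming (approximately) all model weights once.

def decode_tokens_per_second(bandwidth_bytes_per_s: float,
                             bytes_per_token: float) -> float:
    """Upper bound on the autoregressive decode rate."""
    return bandwidth_bytes_per_s / bytes_per_token

TB, GB = 1e12, 1e9

# Assumed workload: a 70B-parameter model at 8-bit precision (~70 GB of weights).
bytes_per_token = 70 * GB

sram_bound = decode_tokens_per_second(150 * TB, bytes_per_token)  # SRAM figure from article
hbm_bound  = decode_tokens_per_second(22 * TB, bytes_per_token)   # HBM4 figure from article

print(f"150 TB/s SRAM bound: ~{sram_bound:,.0f} tokens/s")  # ~2,143 tokens/s
print(f"22 TB/s HBM bound:   ~{hbm_bound:,.0f} tokens/s")   # ~314 tokens/s
```

Note that a 70 GB model cannot fit in a single LPU's 500 MB of SRAM; in practice the weights would be sharded across many chips in a rack, consistent with the 256-LPU rack density discussed below. The sketch only illustrates why decode-heavy inference rewards bandwidth over capacity.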

Supply Chain Investigation: 2026-2027 Shipments Projected at 4-5 Million Units

According to Ming-Chi Kuo's latest supply chain investigation, LPU shipment forecasts have been substantially upgraded following NVIDIA's investment in Groq. He projects total LPU shipments of 4 to 5 million units across 2026-2027, with 30% to 40% arriving in 2026 and 60% to 70% in 2027, roughly a tenfold increase over historical annual production. At the rack level, NVIDIA plans to raise LPU density from 64 to 256 units per rack, both to maintain ultra-low latency during the inference decoding phase and to absorb the growing KV-cache demands of long-context inference. Kuo expects the new rack architecture to enter large-scale mass production between Q4 2026 and Q1 2027, with rack shipments jumping from 300-500 units in 2026 to 15,000-20,000 units in 2027.
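As a quick sanity check on these ranges, the arithmetic below multiplies out the report's figures. The percentage split, rack counts, and 256-per-rack density are from the article; mapping units to racks this way is our illustration, since the report does not break shipments down in those terms.

```python
# Arithmetic on the ranges quoted in Kuo's report (the mapping is illustrative).
total_units   = (4_000_000, 5_000_000)  # combined 2026-2027 LPU shipments
share_2026    = (0.30, 0.40)
share_2027    = (0.60, 0.70)
racks_2027    = (15_000, 20_000)
lpus_per_rack = 256                     # planned density of the new rack architecture

units_2026 = (total_units[0] * share_2026[0], total_units[1] * share_2026[1])
units_2027 = (total_units[0] * share_2027[0], total_units[1] * share_2027[1])
capacity_2027 = (racks_2027[0] * lpus_per_rack, racks_2027[1] * lpus_per_rack)

print(f"2026 units: {units_2026[0]:,.0f}-{units_2026[1]:,.0f}")          # 1,200,000-2,000,000
print(f"2027 units: {units_2027[0]:,.0f}-{units_2027[1]:,.0f}")          # 2,400,000-3,500,000
print(f"2027 rack capacity: {capacity_2027[0]:,}-{capacity_2027[1]:,}")  # 3,840,000-5,120,000
```

The 2027 rack capacity bracket (3.84-5.12 million LPU slots) lands in the same range as the full two-year unit forecast, so the rack and unit figures are at least of consistent magnitude.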

Ecosystem Integration Is Key: Three Technical Nodes Determine Deployment Pace

Kuo argues that the rapid growth in LPU demand fundamentally stems from deep integration with the NVIDIA ecosystem. CUDA integration significantly lowers application development and deployment barriers, letting developers tap LPU compute without rebuilding existing workflows. At the same time, the rapid expansion of ultra-low-latency inference scenarios, such as AI agents (e.g., coding agents), real-time consumer applications, and physical AI, is pulling the LPU demand curve upward. He lists three critical technical integration nodes to monitor closely: first, at the network-architecture level, whether rack-level interconnect can be made seamless via NVLink Fusion and RealScale; second, at the developer-interface level, whether NVIDIA NIM will let developers deploy workloads without distinguishing between GPU and LPU; third, at the compiler level, whether TensorRT-LLM can support the LPU's compile-first architecture. Kuo believes the pace of these three integrations will directly determine the speed and depth of large-scale LPU deployment.
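For a sense of what the second node would mean in practice, here is a minimal sketch of a client call against a NIM-style service. NIM microservices expose an OpenAI-compatible HTTP API today; whether an LPU-backed deployment would keep that interface unchanged is precisely the question Kuo flags. The endpoint URL and model name below are hypothetical placeholders.

```python
# Minimal client call against a NIM-style, OpenAI-compatible endpoint.
# If the GPU/LPU abstraction holds, this request would run unchanged
# whether the backend routes it to Rubin GPUs or to LPU inference racks.
import requests

resp = requests.post(
    "http://localhost:8000/v1/chat/completions",  # assumed local endpoint
    json={
        "model": "example-llm",  # hypothetical model id
        "messages": [{"role": "user",
                      "content": "Summarize the GTC keynote in one line."}],
        "max_tokens": 64,
    },
    timeout=30,
)
print(resp.json()["choices"][0]["message"]["content"])
```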

PCB Supply Chain Enters a New Cycle: WUS Printed Circuit Could Be a Core Beneficiary

Kuo specifically emphasizes that large-scale mass production of LPU/LPX racks carries significant implications for the PCB supply chain. He notes that LPU/LPX racks represent the first large-scale commercial deployment of M9-grade CCL (copper-clad laminate) materials, with WUS Printed Circuit playing a key role in this supply chain. M9-grade CCL imposes extremely demanding process requirements, including technical breakthroughs in quartz glass fabric treatment for high-layer-count boards. Kuo believes that if LPU/LPX racks ramp successfully, the program will not only contribute substantively to WUS's 2027 results but also validate the company's capabilities in this high-end manufacturing sector, potentially catalyzing a new growth cycle for the broader PCB industry.

Disclaimer: Investing carries risk. This is not financial advice. The above content should not be regarded as an offer, recommendation, or solicitation to acquire or dispose of any financial product, nor should any associated discussions, comments, or posts by the author or other users be considered as such. It is provided for general informational purposes only and does not take into account your investment objectives, financial situation, or needs. TTM assumes no responsibility for, and makes no warranty as to, the accuracy or completeness of the information; investors should do their own research and may seek professional advice before investing.
