CES 2026 showcases the arrival of the NVIDIA Rubin platform, together with Azure's proven readiness for deployment. Microsoft's long-range datacenter strategy was engineered for moments precisely like this, where NVIDIA's next-generation systems slot directly into infrastructure that anticipated their power, thermal, memory, and networking requirements years ahead of the industry. Our long-term collaboration with NVIDIA ensures Rubin fits directly into Azure's forward platform design.
Building with purpose for the future
Azure's AI datacenters are engineered for the future of accelerated computing. That enables seamless integration of NVIDIA Vera Rubin NVL72 racks across Azure's largest next-gen AI superfactories, from current Fairwater sites in Wisconsin and Atlanta to future regions.
The latest NVIDIA AI infrastructure requires significant upgrades in power, cooling, and performance optimization; however, Azure's experience with our Fairwater sites and multiple upgrade cycles over time demonstrates an ability to flexibly upgrade and expand AI infrastructure in step with advances in technology.
Azure's proven experience delivering scale and performance
Microsoft has years of market-proven experience in designing and deploying scalable AI infrastructure that evolves with each major advance in AI technology. In lockstep with each successive generation of NVIDIA's accelerated compute infrastructure, Microsoft rapidly integrates NVIDIA's innovations and delivers them at scale. Our early, large-scale deployments of NVIDIA Ampere and Hopper GPUs, connected via NVIDIA Quantum-2 InfiniBand networking, were instrumental in bringing models like GPT-3.5 to life, while other clusters set supercomputing performance records, demonstrating we can bring next-generation systems online faster and with greater real-world performance than the rest of the industry.
We unveiled the first and largest implementations of both the NVIDIA GB200 NVL72 and NVIDIA GB300 NVL72 platforms, architecting racks into single supercomputers that train AI models dramatically faster, helping Azure remain a top choice for customers seeking advanced AI capabilities.
Azure's systems approach
Azure is engineered for compute, networking, storage, software, and infrastructure all working together as one integrated platform. This is how Microsoft builds a durable advantage into Azure and delivers cost and performance breakthroughs that compound over time.
Maximizing GPU utilization requires optimization across every layer. Beyond Azure's ability to adopt NVIDIA's new accelerated compute platforms early, Azure's advantages come from the surrounding platform as well: high-throughput Blob storage, proximity placement and region-scale design shaped by real production patterns, and orchestration layers like CycleCloud and AKS tuned for low-overhead scheduling at massive cluster scale.
Azure Boost and other offload engines clear IO, network, and storage bottlenecks so models scale smoothly. Faster storage feeds larger clusters, stronger networking sustains them, and optimized orchestration keeps end-to-end performance steady. First-party innovations reinforce the loop: liquid cooling Heat Exchanger Units maintain tight thermals, Azure hardware security module (HSM) silicon offloads security work, and Azure Cobalt delivers exceptional performance and efficiency for general-purpose compute and AI-adjacent tasks. Together, these integrations ensure the entire system scales efficiently, so GPU investments deliver maximum value.
This systems approach is what makes Azure ready for the Rubin platform. We're delivering new systems on an end-to-end platform already shaped by the requirements Rubin brings.
Running the NVIDIA Rubin platform
NVIDIA Vera Rubin Superchips will deliver 50 PF of NVFP4 inference performance per chip and 3.6 EF of NVFP4 per rack, a fivefold jump over NVIDIA GB200 NVL72 rack systems.
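As a quick sanity check, the per-chip and per-rack figures quoted above are mutually consistent. This sketch assumes the "NVL72" designation means 72 Superchips per rack, and works backwards from the fivefold claim to the implied GB200 NVL72 baseline; it is illustrative arithmetic, not an NVIDIA specification.

```python
PF_PER_CHIP = 50        # quoted NVFP4 inference petaflops per Superchip
CHIPS_PER_RACK = 72     # assumption: "NVL72" means 72 Superchips per rack

rack_pf = PF_PER_CHIP * CHIPS_PER_RACK
print(rack_pf)               # 3600 -> 3,600 PF, i.e. the quoted 3.6 EF per rack

# The quoted fivefold jump implies a GB200 NVL72 baseline of rack_pf / 5:
implied_gb200_pf = rack_pf // 5
print(implied_gb200_pf)      # 720 -> roughly 0.72 EF of NVFP4 per GB200 rack
```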
Azure has already incorporated the core architectural assumptions Rubin requires:
- NVIDIA NVLink evolution: The sixth-generation NVIDIA NVLink fabric anticipated in Vera Rubin NVL72 systems reaches ~260 TB/s of scale-up bandwidth, and Azure's rack architecture has already been redesigned to exploit these bandwidth and topology advantages.
- High-performance scale-out networking: The Rubin AI infrastructure relies on ultra-fast NVIDIA ConnectX-9 1,600 Gb/s networking, delivered by Azure's network infrastructure, which has been purpose-built to support large-scale AI workloads.
- HBM4/HBM4e thermal and density planning: The Rubin memory stack demands tighter thermal windows and higher rack densities; Azure's cooling, power envelopes, and rack geometries have already been upgraded to handle the same constraints.
- SOCAMM2-driven memory expansion: Rubin Superchips use a new memory expansion architecture; Azure's platform has already integrated and validated similar memory extension behaviors to keep models fed at scale.
- Reticle-sized GPU scaling and multi-die packaging: Rubin moves to massively larger GPU footprints and multi-die layouts. Azure's supply chain, mechanical design, and orchestration layers have been pre-tuned for these physical and logical scaling characteristics.
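To put the bandwidth figures above in perspective, a rough back-of-the-envelope comparison follows. It assumes the ~260 TB/s NVLink number is a rack-wide aggregate shared evenly across 72 GPUs, and that ConnectX-9's 1,600 Gb/s is a per-endpoint line rate; both are simplifying assumptions for illustration.

```python
NVLINK_RACK_TBPS = 260   # ~TB/s of scale-up NVLink bandwidth per NVL72 rack
GPUS_PER_RACK = 72       # assumption: aggregate split evenly across 72 GPUs

per_gpu_nvlink_tbps = NVLINK_RACK_TBPS / GPUS_PER_RACK
print(round(per_gpu_nvlink_tbps, 1))   # 3.6 -> ~3.6 TB/s of NVLink per GPU

CX9_GBITS_PER_SEC = 1600               # quoted ConnectX-9 speed in Gb/s
cx9_gbytes_per_sec = CX9_GBITS_PER_SEC / 8
print(cx9_gbytes_per_sec)              # 200.0 -> 200 GB/s per scale-out link
```

Under these assumptions, scale-up NVLink bandwidth per GPU (~3.6 TB/s) exceeds the scale-out link rate (~0.2 TB/s) by roughly an order of magnitude, which is why rack-level topology and congestion management matter so much for cluster utilization.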
Azure's approach to designing for next-generation accelerated compute platforms like Rubin has been proven over multiple years, including significant milestones:
- Operated the world's largest commercial InfiniBand deployments across multiple GPU generations.
- Built reliability layers and congestion management techniques that unlock higher cluster utilization and larger job sizes than competitors, reflected in our ability to publish industry-leading large-scale benchmarks (e.g., multi-rack MLPerf runs competitors have never replicated).
- Co-designed AI datacenters with Grace Blackwell and Vera Rubin from the ground up to maximize performance and performance per dollar at the cluster level.
Design principles that differentiate Azure
- Pod exchange architecture: To enable fast servicing, Azure's GPU server trays are designed to be quickly swappable without extensive rewiring, improving uptime.
- Cooling abstraction layer: Rubin's multi-die, high-bandwidth components require substantial thermal headroom that Fairwater already accommodates, avoiding expensive retrofit cycles.
- Next-gen power design: Vera Rubin NVL72 racks demand rising watt density; Azure's multi-year power redesign (liquid cooling loop revisions, CDU scaling, and high-amperage busways) ensures rapid deployability.
- AI superfactory modularity: Microsoft, unlike other hyperscalers, builds regional supercomputers rather than singular megasites, enabling a more predictable global rollout of new SKUs.
How co-design leads to customer benefits
The NVIDIA Rubin platform marks a major step forward in accelerated computing, and Azure's AI datacenters and superfactories are already engineered to take full advantage. Years of co-design with NVIDIA across interconnects, memory systems, thermals, packaging, and rack-scale architecture mean Rubin integrates directly into Azure's platform without rework. Rubin's core assumptions are already reflected in our networking, power, cooling, orchestration, and pod exchange design principles. This alignment gives customers immediate benefits: faster deployment, faster scaling, and faster impact as they build the next era of large-scale AI.