It's pretty wild how fast the AI world moves these days. One minute everyone's talking about how Blackwell is pushing boundaries, and the next, Nvidia rolls out something even bigger. At CES 2026 in Las Vegas on January 5, CEO Jensen Huang took the stage and officially unveiled the Vera Rubin AI computing platform.
This isn't just a new GPU—it's a full-blown supercomputing architecture built from the ground up for the next wave of AI, especially those super complex agentic systems and massive mixture-of-experts models that are starting to dominate everything.
Huang didn't hold back, calling it "our first extreme co-designed six-chip platform." It's named after Vera Rubin, the astronomer whose observations helped confirm the existence of dark matter, which feels fitting for something meant to handle the unseen complexities of future AI.
A Game-Changing Launch at CES 2026
The big news? It's already in full production, and partners will start rolling out systems based on it in the second half of 2026. Cloud giants like AWS, Google Cloud, Microsoft Azure, and Oracle are lined up to deploy it first, along with specialists like CoreWeave.

What makes this thing so special? Nvidia says the Rubin GPU alone packs about five times the inference performance of Blackwell in certain formats, and around 3.5 times for training. But the real magic happens when you look at the whole platform. For training huge MoE models—the ones that route queries to specialized "experts"—it supposedly needs just a quarter of the GPUs compared to Blackwell, and the cost per token for inference can drop by up to 10 times.
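Quick detour for anyone fuzzy on what an MoE model actually does: a small router network scores a set of expert sub-networks for each token and sends the token to only the top few, so most of the model's weights sit idle on any given token. Here's a minimal sketch of that routing trick in PyTorch. All the sizes are toy values, and this is the generic technique, not Nvidia's or any production implementation:

```python
# Minimal top-k mixture-of-experts routing sketch. Toy dimensions,
# illustrative only; not how any production MoE stack is built.
import torch
import torch.nn as nn

class TopKMoE(nn.Module):
    def __init__(self, d_model=64, n_experts=8, k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)   # scores every expert per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )
        self.k = k

    def forward(self, x):                              # x: (tokens, d_model)
        weights, idx = self.router(x).topk(self.k, dim=-1)
        weights = weights.softmax(dim=-1)              # renormalize over the chosen k
        out = torch.zeros_like(x)
        for slot in range(self.k):                     # each token visits only k experts
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e               # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

print(TopKMoE()(torch.randn(16, 64)).shape)  # torch.Size([16, 64])
```

The appeal of the trick is that total parameter count can be huge while per-token compute stays small. The catch is that it hammers interconnect bandwidth, because tokens get shuffled between experts living on different GPUs, and that's exactly the traffic the platform's networking is sized for.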
That's huge if you're a company burning billions on data centers. The platform is made up of six key chips working together like a well-oiled machine, broken down below.
Breaking Down the Six-Chip Architecture
The Rubin GPU: This is the powerhouse, with two dies on TSMC's 3nm process, packing hundreds of billions of transistors. It comes with next-gen HBM4 memory, up to 288GB per package, and insane bandwidth of around 22 TB/s (some quick math on what that buys follows this list). Nvidia's throwing in a third-generation Transformer Engine for better efficiency on those long-context AI tasks.
The Vera CPU: Nvidia's own Arm-based beast with 88 custom Olympus cores and support for up to 176 threads. It's optimized for orchestrating massive AI workloads, not general computing, and pairs perfectly with the GPUs via super-fast NVLink-C2C links.
Then you've got the networking side: NVLink 6 switches for blistering GPU-to-GPU communication (3.6 TB/s per GPU, and a full rack hits 260 TB/s—more than the entire internet's bandwidth, Huang joked), ConnectX-9 SuperNICs, BlueField-4 DPUs for smart data processing, and Spectrum-6 Ethernet switches with co-packaged optics for better efficiency and reliability.
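Those numbers are easier to appreciate with a bit of back-of-the-envelope math. The hardware figures below come straight from the keynote; the model size is a made-up example, used purely for illustration:

```python
# Sanity-checking the quoted figures. Only the hardware numbers come from
# Nvidia's keynote; the 400B-parameter model is a hypothetical example.

# 72 Rubin GPUs at 3.6 TB/s of NVLink 6 bandwidth each:
print(72 * 3.6)             # 259.2, i.e. the roughly 260 TB/s per-rack figure

# Crude ceiling on memory-bound decode speed: generating one token streams
# the model's weights through HBM4 once (22 TB/s per package).
model_bytes = 400e9         # hypothetical 400B parameters at 1 byte each (fp8)
print(22e12 / model_bytes)  # ~55 tokens/s per sequence, best case
```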
All this ties into rack-scale systems like the Vera Rubin NVL72, which crams 72 Rubin GPUs, 36 Vera CPUs, and tons of HBM4 and LPDDR5x memory into one liquid-cooled rack. It's designed to act as a single giant AI engine, with no bottlenecks from partitioning models across machines.
Why This Platform Changes Everything for AI
Why does this matter so much? AI isn't just chatbots anymore. We're talking agentic AI that reasons step-by-step over long contexts, generates videos, solves multi-step problems—stuff that chews through tokens like crazy. Current setups struggle with memory and bandwidth as models hit trillions of parameters.
Vera Rubin tackles that head-on with better confidential computing (full encryption across the NVLink domain), improved reliability features, and that new Inference Context Memory Storage to handle massive KV caches without slowing down.
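To see why those KV caches get scary at long context lengths, here's a rough size estimate. Every model dimension in it is a made-up example, not a spec for Rubin or any particular model:

```python
# Illustrative KV-cache sizing for a long-context transformer.
# All dimensions are hypothetical examples, not Rubin or model specs.
def kv_cache_gb(layers, kv_heads, head_dim, context_len, batch, bytes_per_val=2):
    # 2x for keys and values; 2 bytes per value assumes fp16/bf16
    return 2 * layers * kv_heads * head_dim * context_len * batch * bytes_per_val / 1e9

# An 80-layer model with 8 KV heads of dim 128 at a 1M-token context:
print(f"{kv_cache_gb(80, 8, 128, 1_000_000, 1):.0f} GB")  # ~328 GB for one sequence
```

At that size, a single sequence's cache already outgrows the 288GB of HBM4 on one Rubin package, which is exactly the kind of overflow a dedicated context-memory tier is built to absorb.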
Huang pointed out how AI demand is exploding: models growing 10x in size yearly, inference needing more scaling, but costs per token dropping fast too. Companies are shifting budgets from old-school computing to AI, and Nvidia's betting big that integrated platforms like this will win out.
It's not just raw power; it's about efficiency and lower total cost of ownership for these AI factories.
A True Leap Forward in AI Hardware
If you're into tech, this is one of those moments where you feel the ground shifting. Blackwell was impressive, but Rubin looks like a genuine leap.
Can't wait to see what people build with it once systems start shipping later this year. The AI race keeps getting faster, and Nvidia just raised the bar again. This platform could power the next big breakthroughs in agentic AI and beyond.

Michael Johnson
Tech Entrepreneur & Consultant
Michael Johnson is a tech entrepreneur and consultant, specializing in AI, blockchain, and digital transformation strategies. He helps tech companies build scalable solutions and often writes about the future of tech and innovation.

