Jensen Huang GTC Keynote Full Text: The Inference Era Has Arrived, At Least $1 Trillion Revenue by 2027, OpenClaw Is the New Operating System
On March 16, 2026, NVIDIA’s GTC 2026 conference commenced, and founder and CEO Jensen Huang delivered the keynote address.
Framed as the industry’s annual focal event, Huang’s remarks described NVIDIA’s evolution from a traditional chipmaker into a provider of AI infrastructure and factory‑scale systems. Confronting investor concerns about revenue sustainability and growth potential, he articulated the foundational commercial framework he expects to drive future expansion, which he termed “Token Factory Economics.”
Huang presented an optimistic revenue outlook, asserting that demand will reach at least $1 trillion by 2027. He recalled that a year earlier he had identified $500 billion of high‑confidence demand covering Blackwell and Rubin through 2026, and declared that current visibility extends to a minimum of $1 trillion for 2027. This projection briefly lifted NVIDIA’s share price by more than 4.3 percent. Huang further suggested that actual compute demand could exceed this estimate, arguing that NVIDIA systems represent the world’s lowest‑cost infrastructure and that their broad model compatibility will sustain long product lifecycles. He noted that approximately 60 percent of NVIDIA’s revenue is derived from the top five hyperscale cloud providers, with the remaining 40 percent distributed across sovereign cloud, enterprise, industrial, robotics and edge computing applications.
To justify the trillion‑dollar demand thesis, Huang reframed data centers as factories that produce tokens, the fundamental units of AI output. He emphasized that power capacity constrains each facility and that, within a fixed power envelope, the operator that achieves the highest tokens per watt will realize the lowest production cost. Huang described a tiered commercial structure for AI services, with pricing that varies by throughput and latency, and stressed that as models grow in size and context length, token generation rates decline, making throughput and generation speed direct determinants of near‑term revenue. He asserted that NVIDIA’s architecture enables very high throughput at the free tier while delivering up to a 35‑fold performance improvement at the highest inference tiers.
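Huang's tokens-per-watt argument reduces to simple arithmetic: within a fixed power envelope, electricity cost per token is inversely proportional to tokens per watt, independent of the envelope's size. A minimal sketch of that relationship, with all efficiency figures and prices invented purely for illustration:

```python
# Hypothetical illustration of "token factory" economics: under a fixed
# power budget, tokens-per-watt determines the cost to produce tokens.
# All numbers below are made up for the sake of the example.

def cost_per_million_tokens(power_budget_w, tokens_per_s_per_w, price_per_kwh):
    """Electricity cost (USD) to generate one million tokens at full utilization."""
    tokens_per_s = power_budget_w * tokens_per_s_per_w
    seconds_per_million = 1_000_000 / tokens_per_s
    kwh_used = power_budget_w / 1000 * seconds_per_million / 3600
    return kwh_used * price_per_kwh

# Two facilities with the same 1 MW envelope but different efficiency:
baseline = cost_per_million_tokens(1_000_000, 10.0, 0.08)  # 10 tokens/s per watt
improved = cost_per_million_tokens(1_000_000, 50.0, 0.08)  # 5x more efficient
print(f"baseline: ${baseline:.6f}/M tokens, improved: ${improved:.6f}/M tokens")
```

Note that a 5x gain in tokens per watt yields exactly 5x lower cost per token, and that halving the power budget leaves cost per token unchanged, which is why Huang frames efficiency, not facility size, as the competitive variable.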
Under the physical constraints of power and bandwidth, Huang introduced Vera Rubin, NVIDIA's most sophisticated AI compute system to date. He highlighted the system's full liquid cooling and cable-free design, noting that rack installation time has been reduced from two days to two hours. Through integrated hardware-software co-design, Vera Rubin increased token generation from 22 million to 700 million within two years, a roughly 32-fold improvement that far outpaces contemporaneous gains from Moore's Law. To address bandwidth limitations for ultra-low-latency inference, NVIDIA presented a heterogeneous inference approach incorporating Groq technology: Vera Rubin handles the compute- and memory-intensive prefill stage, while Groq processors, with large on-chip SRAM, manage latency-sensitive decoding. Huang recommended that organizations focused on high throughput allocate capacity primarily to Vera Rubin, while dedicating a portion of infrastructure to Groq for high-value, programming-level token generation. He disclosed that Groq LP30 chips, manufactured by Samsung, are in mass production with shipments expected in the third quarter, and that the first Vera Rubin rack is operational on Microsoft Azure. Huang also unveiled Spectrum X, the first mass-produced co-packaged optics switch, and emphasized the need to expand production capacity for copper cabling, optical chips and co-packaged optics.
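The heterogeneous split Huang describes, compute-heavy prefill on one system and latency-sensitive decoding on another, can be sketched as a simple request dispatcher. The backend names and the routing threshold below are hypothetical assumptions for illustration, not an NVIDIA or Groq API:

```python
from dataclasses import dataclass

# Illustrative sketch of the heterogeneous prefill/decode split described
# above. Backend names and the threshold are invented for this example.

@dataclass
class InferenceRequest:
    prompt_tokens: int
    max_new_tokens: int

def route(req: InferenceRequest, threshold: int = 1024):
    """Plan which backend pool runs each stage of one request."""
    # Short prompts stay entirely on the low-latency pool, avoiding the
    # cost of transferring intermediate state between systems.
    if req.prompt_tokens < threshold:
        return [("prefill", "decode-pool"), ("decode", "decode-pool")]
    # Long prompts: compute- and memory-intensive prefill runs on the
    # throughput-optimized pool, then sequential decoding runs on the
    # SRAM-backed low-latency pool.
    return [("prefill", "prefill-pool"), ("decode", "decode-pool")]

print(route(InferenceRequest(prompt_tokens=8000, max_new_tokens=256)))
```

The design choice mirrors the keynote's framing: prefill is parallel and throughput-bound, while decoding is sequential and latency-bound, so each stage is matched to the hardware that suits its bottleneck.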
Beyond hardware, Huang devoted substantial attention to software and ecosystem shifts, particularly the rapid emergence of Agents. He characterized the open‑source project OpenClaw as an unprecedentedly popular initiative and described it as the operating system for Agent computing. Huang predicted that traditional SaaS companies will transition to Agent‑as‑a‑Service models and introduced NVIDIA’s enterprise NeMo Claw reference design, which incorporates policy engines and privacy routing to enable secure deployment of agents that access sensitive data and execute code.
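The keynote did not detail how the policy engine and privacy routing work, but the idea can be sketched as a classifier that pins requests touching sensitive data to private capacity. Everything here, the patterns, endpoint names and policy fields, is a hypothetical illustration, not an actual NeMo interface:

```python
import re

# Minimal sketch of a policy engine with privacy routing, in the spirit of
# the agent reference design described above. All rules and names are
# invented for illustration.

SENSITIVE_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),        # US-SSN-like identifier
    re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),  # email address
]

def route_agent_request(text: str) -> dict:
    """Classify a request and decide where and how an agent may run it."""
    sensitive = any(p.search(text) for p in SENSITIVE_PATTERNS)
    return {
        # Requests touching sensitive data are pinned to private capacity.
        "endpoint": "on-prem-enclave" if sensitive else "shared-cloud",
        # Agent-executed code over sensitive inputs runs sandboxed.
        "sandbox_code": sensitive,
        "redact_logs": sensitive,
    }

print(route_agent_request("summarize alice@example.com's open tickets"))
```

A real deployment would use far richer classification than regexes, but the shape, detect sensitivity, then constrain endpoint, execution and logging accordingly, is the pattern the reference design appears to formalize.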
Huang also outlined implications for the workforce, proposing that engineers will receive annual token budgets in addition to base salaries. He suggested that compensation packages in Silicon Valley will increasingly include token allocations—potentially amounting to roughly half of base pay—to drive productivity gains.
Concluding his address, Huang previewed the next‑generation Feynman architecture, which will enable horizontal scaling across copper and co‑packaged optics, and mentioned ongoing work on Vera Rubin Space‑1, a space‑deployed data center computer intended to extend AI compute beyond Earth.