Welcome to the Agentic Gemini Era: Inside Google’s Massive AI Momentum at I/O 2026

The tech world shifted its gaze to Mountain View this week for Google I/O 2026, and the message from Alphabet CEO Sundar Pichai was crystal clear: Google is no longer just adding AI features to its products; it is entering a fully realized, multi-platform “Agentic AI Era.” Reflecting on the staggering scale of adoption over the past year, Pichai showcased strong product momentum, massive infrastructure investments, and a rapid transition toward autonomous AI agents that can think, reason, and act on behalf of users.

Here is a breakdown of the monumental growth metrics and infrastructure shifts defining Google’s AI ecosystem today.

The Token Explosion: Enterprise Demand Reaches New Heights

The sheer volume of data being processed underscores the industry demand for Google’s AI. According to Pichai, Google’s model APIs are currently processing 19 billion tokens per minute.

This isn’t just an experimental phase for enterprises; it is core operational infrastructure. Over the past 12 months, more than 375 Google Cloud customers have each processed over one trillion tokens. The internal developer momentum is equally explosive: Google went from processing half a trillion tokens a day across its internal AI developer tools in March to more than three trillion tokens a day by mid-May, a sixfold increase that implies usage doubling every few weeks.
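As a rough check on the “every few weeks” doubling claim, the figures above can be turned into an implied doubling time. The sketch below assumes steady exponential growth. Because the article gives only “March” and “mid-May,” it tries two hypothetical start dates (March 1 and March 31) against an assumed end date of May 15. The function name and all dates are illustrative assumptions, not details from the keynote.

```python
import math
from datetime import date


def doubling_time_days(start_volume: float, end_volume: float, elapsed_days: float) -> float:
    """Days needed for volume to double, assuming steady exponential growth."""
    if start_volume <= 0 or end_volume <= start_volume or elapsed_days <= 0:
        raise ValueError("need positive growth over a positive time span")
    return elapsed_days * math.log(2) / math.log(end_volume / start_volume)


# Figures from the article: 0.5 trillion tokens/day in March, 3 trillion by mid-May.
START_TOKENS_PER_DAY = 0.5e12
END_TOKENS_PER_DAY = 3.0e12

# Assumed dates: the article does not give exact days.
ASSUMED_END = date(2026, 5, 15)
ASSUMED_STARTS = {"early March": date(2026, 3, 1), "late March": date(2026, 3, 31)}

for label, start in ASSUMED_STARTS.items():
    elapsed = (ASSUMED_END - start).days
    days = doubling_time_days(START_TOKENS_PER_DAY, END_TOKENS_PER_DAY, elapsed)
    print(f"{label}: {elapsed} days elapsed, doubling about every {days / 7:.1f} weeks")
```

Under either assumed start date, the implied doubling time falls between roughly two and a half and four weeks, which is consistent with the article’s wording.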

Search and Gemini App: Breaking Billion-User Thresholds

Google’s consumer-facing AI products are growing rapidly, cementing their place in everyday digital life:

AI Overviews & AI Mode: Google Search underwent its biggest upgrade in over 25 years. AI Overviews has officially reached over 2.5 billion monthly active users. Meanwhile, the newly deployed AI Mode has crossed the 1 billion monthly active user milestone in just its first year. Pichai noted that when users interact with these advanced AI-powered Search features, their overall engagement with Google Search actually increases.
The Gemini App: At last year’s I/O, the Gemini app had 400 million monthly active users. Today, that number has more than doubled, surpassing 900 million monthly active users. More impressively, the volume of daily requests has grown more than sevenfold (7x) over the same period, fueled by personalized features like Personal Intelligence.
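The Gemini app figures imply that usage per person is rising, not just the user count. The short sketch below compares the two growth multiples. It is only a rough estimate: both numbers are reported as lower bounds (“more than”), and it mixes daily requests with monthly active users. The helper name is a hypothetical one chosen for illustration.

```python
def growth_multiple(before: float, after: float) -> float:
    """How many times larger `after` is than `before`."""
    if before <= 0:
        raise ValueError("starting value must be positive")
    return after / before


# Figures from the article.
users_multiple = growth_multiple(400e6, 900e6)  # monthly active users
requests_multiple = 7.0                          # daily requests, reported as "over 7x"

# Rough implied growth in requests per user (an estimate, not a reported figure).
requests_per_user_multiple = requests_multiple / users_multiple

print(f"Users grew about {users_multiple:.2f}x")
print(f"Requests per user grew about {requests_per_user_multiple:.1f}x")
```

If the reported figures hold, requests grew about three times faster than the user base.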

The Custom Silicon Bet: $190 Billion in Capex

Scaling to meet this level of demand requires an enormous amount of computing power. Pichai revealed that while Google’s capital expenditures (capex) stood at $31 billion in 2022, the company expects to spend between $180 billion and $190 billion this year alone, roughly a sixfold increase in just four years.
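The “sixfold in four years” figure can be checked directly, and it also implies a compound annual growth rate. The sketch below assumes “this year” means 2026, giving a four-year span from 2022. The function names are illustrative, and the annualized rate is a derived estimate, not a number Google reported.

```python
def growth_multiple(start: float, end: float) -> float:
    """How many times larger `end` is than `start`."""
    if start <= 0:
        raise ValueError("starting value must be positive")
    return end / start


def compound_annual_growth_rate(start: float, end: float, years: float) -> float:
    """Constant yearly growth rate that takes `start` to `end` over `years`."""
    if years <= 0:
        raise ValueError("years must be positive")
    return growth_multiple(start, end) ** (1 / years) - 1


CAPEX_2022 = 31e9                 # from the article
CAPEX_2026_RANGE = (180e9, 190e9)  # from the article; the year 2026 is assumed
YEARS = 4

for capex in CAPEX_2026_RANGE:
    multiple = growth_multiple(CAPEX_2022, capex)
    cagr = compound_annual_growth_rate(CAPEX_2022, capex, YEARS)
    print(f"${capex / 1e9:.0f}B: {multiple:.1f}x the 2022 level, about {cagr:.0%} per year")
```

Both ends of the guidance range sit close to six times the 2022 level, which corresponds to spending growing by more than half each year on average.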

A cornerstone of this investment lies in Google’s custom silicon. A decade after Google introduced its first Tensor Processing Unit (TPU) on the I/O stage, the company is now scaling out its newly announced 8th-generation TPUs. This hardware foundation allows Google to train and run its next-generation models at a scale few competitors can easily match.

Enter Gemini 3.5 Flash: Frontier Intelligence Built for Action

The highlight of Google’s architectural breakthroughs this year is the introduction of Gemini 3.5 Flash, the first model in a new series designed to combine frontier-level intelligence with high-speed execution.

Pichai highlighted two crucial breakthroughs for Gemini 3.5 Flash:

Benchmark Superiority: Compared to Gemini 3.1 Pro, the 3.5 Flash model shows improvements across nearly all benchmarks, with especially large gains in coding and on GDPval (a benchmark measuring performance on real-world, economically valuable tasks).
Speed-to-Intelligence Ratio: When intelligence is plotted against output speed, Gemini 3.5 Flash sits alone in the top-right quadrant, delivering frontier-grade intelligence at four times (4x) the output token speed of competing models. For large-scale enterprise deployments, shifting workloads to Gemini Flash is already projected to save some corporate clients over a billion dollars annually.

The Big Picture: From Assistants to Agents

What do all these numbers mean for the end user? The momentum highlighted at I/O 2026 suggests that Google is moving from the “chatbot” phase into proactive computing. With the integration of Gemini 3.5 Flash, an upgraded conversational Search, and a cloud infrastructure operating at a multi-trillion-token scale, Google is building a unified web ecosystem where AI agents don’t just answer your questions but get things done.

Details here: https://blog.google/innovation-and-ai/sundar-pichai-io-2026/#momentum