Artificial intelligence is advancing faster than the infrastructure designed to carry it. From silicon to space-based data centres, Google’s Amin Vahdat sets out why the next breakthroughs will be determined by power, velocity, and physical reality rather than model capability alone.
For most of the last two years, the public narrative around AI has been dominated by models. Bigger models. Smarter models. Faster models. The industry has been transfixed by leaderboard movements and release cycles, often treating infrastructure as a secondary concern, something that could be scaled up once the real breakthroughs had been achieved.
Amin Vahdat, Chief Technologist for AI Infrastructure at Google, sees the moment very differently. Sitting at the intersection of silicon design, distributed systems, data centre architecture, power delivery, and model deployment, Vahdat occupies a vantage point where abstraction collapses quickly into physical constraint. From that position, one thing is already clear. The future of AI will not be decided by models alone, but by how fast infrastructure can evolve to support them.
“Life is exciting,” he says, without exaggeration. “It’s never been more exciting. I think you can feel it everywhere. I’m just thrilled to be working on this right now.” That excitement is not rooted in novelty.
Google’s Gemini programme has now been running for more than three years, long enough for optimism to harden into operational reality. “Gemini 3 has been awesome,” Vahdat says. “We’re proud of where it is across essentially all the benchmarks. But what matters more to me is the journey. Gemini 1 came out two years ago, and for a long time we were the underdog. Internally, people believed, but externally there was scepticism. To come through that and be where we are now matters.”
What that journey revealed, he argues, is that AI progress is no longer about isolated breakthroughs. It is about compounding advantages across the stack.
Why full-stack integration matters more than models
Google is often described as one of the few truly full-stack AI companies, spanning custom silicon, distributed systems, data centres, networks, and global applications touching billions of users. According to Vahdat, that integration is not a branding exercise. It is the core mechanism through which progress compounds.
“The real secret weapon we have is not any single component,” he explains. “It’s the fact that we get to work together across the entire stack to solve the end problem. TPUs are not designed in isolation. They’re co-designed with DeepMind, with input from Search, Ads, YouTube, and other workloads. The infrastructure and the models are built hand in hand.”
That co-design process is not optional. Hardware and software decisions made today will not be realised at scale for years. “The software we’re building has two to three-year lead times. The hardware has two to three-year lead times,” Vahdat says. “So you have to predict the future. No one gets it exactly right, but if you understand the probability distribution of outcomes, you can evaluate designs relative to one another.”
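That style of reasoning can be made concrete. The sketch below shows the shape of the calculation: each candidate design is scored by its expected performance across a distribution of possible future workloads, rather than against a single predicted one. The scenario names, probabilities, and performance figures are invented for illustration, not a description of Google’s actual process.

```python
# Sketch: ranking hardware designs against a probability distribution of
# future workloads, in the spirit Vahdat describes. All numbers, names,
# and the scoring rule are illustrative assumptions.

# Probability assigned to each workload mix dominating in ~3 years.
scenarios = {
    "dense_training":   0.30,
    "moe_inference":    0.45,
    "agentic_serving":  0.25,
}

# Relative performance-per-watt of each candidate design in each scenario
# (1.0 = today's general-purpose baseline). Purely made-up figures.
designs = {
    "general_gpu_like":   {"dense_training": 1.2, "moe_inference": 1.2, "agentic_serving": 1.2},
    "dense_specialist":   {"dense_training": 4.0, "moe_inference": 0.8, "agentic_serving": 0.6},
    "serving_specialist": {"dense_training": 0.7, "moe_inference": 2.5, "agentic_serving": 3.0},
}

def expected_perf(perf_by_scenario):
    """Expected performance-per-watt under the scenario distribution."""
    return sum(p * perf_by_scenario[s] for s, p in scenarios.items())

for name, perf in sorted(designs.items(), key=lambda kv: -expected_perf(kv[1])):
    print(f"{name:18s} expected perf/W = {expected_perf(perf):.2f}")
```

No single scenario has to be called correctly; the design simply has to dominate in expectation, which is what makes the bet tractable despite the multi-year horizon.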
This is where infrastructure becomes the limiting factor. “We wind up being the bottleneck,” he admits. “If we had more infrastructure capacity, it would be a big positive. The pace of model innovation is incredible, but it instantly consumes whatever efficiency we deliver.”
That dynamic undercuts a comforting industry assumption: that efficiency gains will naturally reduce cost and energy pressure over time. “Every efficiency improvement gets consumed almost immediately,” Vahdat says. “Capabilities expand, people do more with them, and the gains disappear.”
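The arithmetic of that rebound effect is simple to state. In the toy calculation below, the figures are invented: a tenfold efficiency gain is outpaced by a fifteenfold growth in demand, so aggregate energy use still climbs.

```python
# Toy rebound calculation: if per-task efficiency improves 10x but new
# capabilities (agents, longer contexts, more users) grow demand 15x,
# aggregate energy use still rises. Figures are illustrative only.
efficiency_gain = 10.0   # tasks per joule, relative to last year
demand_growth = 15.0     # tasks requested, relative to last year
energy = demand_growth / efficiency_gain
print(f"Total energy vs last year: {energy:.1f}x")  # 1.5x despite 10x efficiency
```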
Specialisation is the real phase change
The most profound shift now underway, in Vahdat’s view, is the move away from general-purpose computing toward aggressive specialisation. The emergence of GPUs and TPUs is not a temporary response to scarcity. It is the beginning of a fundamentally different architectural era. “The real phase change is that we no longer need one-size-fits-all architectures,” he says. “We can specialise for individual workloads and even invent entirely new architectures, both in hardware and software.”
Google’s Tensor Processing Units (TPUs) are offered exclusively through Google Cloud Platform (GCP), but Vahdat is careful to frame them as part of a broader ecosystem rather than a closed alternative. “Nvidia GPUs are a huge part of GCP,” he notes. “We work closely with Nvidia. The goal is not to push one technology, but to solve customer problems. Sometimes that’s a GPU. Sometimes it’s a TPU. Sometimes it’s something else.”
Specialisation delivers dramatic gains. “Across power, cost, and scale, you can get at least a factor of ten improvement,” Vahdat argues. “But you give up generality. You wouldn’t run a database on an accelerator optimised for AI inference.”
The trade-off becomes more acute when hardware development cycles stretch across multiple years. “The more specialised you want to be, the more accurately you need to predict where workloads are going,” he says. “Right now, from design to deployment, you’re looking at three years.”
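A back-of-envelope break-even calculation shows how the two forces interact. The numbers below are assumptions: a correct workload bet captures the roughly tenfold gain Vahdat cites, while a wrong one leaves hardware worth only a fraction of a general-purpose part.

```python
# Back-of-envelope: when does betting on a specialised design beat staying
# general? Assumed numbers only: a correct bet yields the ~10x gain Vahdat
# cites; a wrong bet strands the part at a fraction of baseline value.

GAIN_IF_RIGHT = 10.0   # perf/W multiple when the workload prediction holds
VALUE_IF_WRONG = 0.3   # residual value when the workload moves elsewhere
GENERAL = 1.0          # general-purpose baseline

# Expected value of specialising with prediction accuracy p:
#   EV(p) = p * GAIN_IF_RIGHT + (1 - p) * VALUE_IF_WRONG
# Break-even where EV(p) = GENERAL:
p_break_even = (GENERAL - VALUE_IF_WRONG) / (GAIN_IF_RIGHT - VALUE_IF_WRONG)
print(f"Specialising wins if P(prediction holds) > {p_break_even:.0%}")  # ~7%
```

The threshold is highly sensitive to the penalty for a wrong bet: shrink the upside or the residual value and the required confidence climbs steeply. Since confidence in a three-year workload forecast decays with the horizon, shortening the design cycle is what makes aggressive specialisation safe, which is exactly the race described next.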
The race to compress time
If there is one constraint Vahdat would remove, it is time. “If we could cut the lead time for hardware design and deployment by a factor of ten, it would change the world,” he says. The ambition sounds implausible even to him. “Three months would be radical. I don’t know how to do it,” he admits. “Two years seems achievable. Eighteen months starts to make people nervous. Twelve months feels impossible.”
Yet even incremental compression unlocks compounding benefits. “The more we pull in the cycle, the more we can specialise,” Vahdat explains. “The more we specialise, the more efficiency we deliver. Power efficiency alone could improve by an order of magnitude.”
That would not just reshape AI economics. It would force a rethink of depreciation, capacity planning, and infrastructure amortisation. “The idea that hardware must depreciate over five or six years is not a law of nature,” he says. “Those assumptions were set under very different conditions.”
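A toy amortisation model makes the point. In the sketch below, every parameter is an assumption: hardware is held for some number of years while each successor generation improves performance per watt, so an ageing part delivers progressively less competitive compute. The optimal holding period then falls out of the improvement rate, not out of a fixed five-or-six-year convention.

```python
# Toy amortisation model: cost per unit of delivered compute when hardware
# is kept for N years while each yearly successor generation is GEN_GAIN
# times more efficient. All parameters are illustrative assumptions.

CAPEX = 1.0          # normalised purchase cost of one accelerator
POWER_COST = 0.08    # annual energy + operations cost, as fraction of capex
GEN_GAIN = 1.8       # perf/W multiple of each yearly successor generation

def cost_per_compute(lifetime_years: int) -> float:
    """Total cost divided by total compute delivered over the lifetime.
    Compute is measured relative to the fleet's newest generation, so an
    ageing part contributes less competitive compute each year."""
    total_cost = CAPEX + POWER_COST * lifetime_years
    total_compute = sum(GEN_GAIN ** -year for year in range(lifetime_years))
    return total_cost / total_compute

for years in (3, 4, 5, 6):
    print(f"{years}-year depreciation: relative cost {cost_per_compute(years):.3f}")
```

Under these invented numbers the minimum sits near four years and the curve is shallow; push the generational gain higher, as faster design cycles would, and the optimum shifts toward shorter lives.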
Space as an infrastructure experiment
When conversation turns to radical thinking, space inevitably enters the discussion. Google, along with several other hyperscalers, is actively exploring orbital infrastructure concepts, not as science fiction, but as first-principles engineering exercises.
“From a first-principles perspective, space holds a lot of appeal,” Vahdat says. “Sun-synchronous orbit gives you continuous solar power. No cloud cover. No night cycle. You remove some of the hardest constraints we face on Earth.”
The theoretical gains are striking. “You’re potentially thirty per cent more efficient on power, thirty per cent more efficient on networking, and fifty per cent lower latency by avoiding fibre paths,” he notes. The obstacles, however, are formidable. “Cooling is a problem. Maintenance is a problem. Reliability is a problem,” he says. “The way we deploy and maintain infrastructure today does not scale to the levels we’re talking about.”
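The latency figure, at least, follows from simple physics. Light in glass fibre travels at roughly c divided by the fibre’s refractive index, and real fibre routes detour well beyond the great-circle path, while a free-space optical link in vacuum suffers neither penalty. In the sketch below, the route-stretch factor is an assumption, and the up-and-down legs to orbit, which would claw back a few milliseconds, are ignored.

```python
# Rough physics behind the latency claim: light in fibre travels at about
# c / 1.47 (the glass's refractive index), and terrestrial fibre rarely
# follows the great-circle path. The route-stretch figure is an assumption,
# and the extra distance of the up/down hops to orbit is ignored here.

C_KM_S = 299_792            # speed of light in vacuum, km/s
FIBRE_INDEX = 1.47          # typical refractive index of optical fibre
ROUTE_STRETCH = 1.3         # assumed detour factor of real fibre routes

def one_way_ms(distance_km: float, via_fibre: bool) -> float:
    """One-way propagation delay in milliseconds."""
    if via_fibre:
        return distance_km * ROUTE_STRETCH / (C_KM_S / FIBRE_INDEX) * 1000
    return distance_km / C_KM_S * 1000  # free-space optical link in vacuum

d = 8_000  # km, roughly a transatlantic-scale path
fibre, vacuum = one_way_ms(d, True), one_way_ms(d, False)
print(f"fibre: {fibre:.1f} ms, free-space: {vacuum:.1f} ms, "
      f"saving: {1 - vacuum / fibre:.0%}")
```

Under these assumptions the saving comes out near 48 per cent, close to the figure Vahdat cites.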
Robotics, he suggests, may be unavoidable. “At the scale we’re growing, whether terrestrial or orbital, human-centric maintenance models won’t work,” he continues. “We will have to rethink deployment and repair entirely. A gigawatt-scale space data centre is probably more than five years away. But it’s worth investing in now because we’ll learn a tremendous amount, even if the final form looks very different.”
Efficiency will not save us yet
One of the most deeply held beliefs in AI today is that efficiency improvements will eventually stabilise energy demand. Vahdat is sceptical. “The rate of efficiency improvement is extraordinary,” he acknowledges. “I’ve never seen anything like it. But every gain is immediately consumed by new capabilities. Agents, orchestration, coding, planning, all of it expands to fill the available headroom.”
He compares the moment to the early days of Moore’s Law. “It feels like CPUs in their heyday,” he says. “Every release made everything twice as good. But now that pace feels even faster. Every three to six months, models are meaningfully better.”
The danger lies in assuming that progress naturally converges. “Efficiency will win eventually,” Vahdat says. “But that point may be further out than people expect.”
As models become more capable, a familiar question resurfaces: will AI generate genuinely original insights, or simply recombine existing knowledge more efficiently? For Vahdat, the distinction misses the real impact. “As an academic, originality mattered enormously,” he reflects. “But most progress comes from combining ideas in the right way, not inventing something from nothing.”
What AI delivers today is not genius, but leverage. “I can ask advanced questions across fields I was never trained in and get meaningful answers in seconds,” he says. “That’s not original insight in a philosophical sense, but it’s a game changer.” The implication is profound. “Even for people who already have access to experts, AI collapses time and cost,” Vahdat explains. “For everyone else, it opens doors that were previously closed.”
This is why he believes research and learning remain the most underappreciated AI use cases. “We talk endlessly about coding and automation,” he says. “But learning is the most universal activity in humanity, and AI is transforming it.”
The long-term vision Vahdat returns to is deeply human. “We have the opportunity to deliver a teacher for every learner and a doctor for every patient,” he says. “Not generically, but personalised to individual needs.” Personalisation, he argues, is the final missing layer. “The models already know how to answer questions,” he says. “What’s coming is understanding how different people absorb information and adapting to that in real time.”
That future is closer than it appears. “We’re already seeing hints of it,” Vahdat says. “If we can do it for business intelligence and healthcare, we can do it for education.”
Writing the infrastructure playbook
Asked what advice he would give to industry leaders, Vahdat resists hype. “This is not a winner-takes-all moment,” he says. “It’s bigger than any single company.” Instead, he frames the opportunity as generational. “This is the biggest technological shift since the internet, and likely much bigger,” he says. “There has never been a better time to be working in technology.”
For infrastructure teams in particular, the moment is defining. “We are writing the playbook for how intelligence is delivered at scale,” Vahdat concludes. “The services, the agents, the systems that come next will depend entirely on the foundations we build now.” If models define what AI can do, infrastructure will decide how far it can go.