Why heat is becoming the hard limit of AI

AI infrastructure is scaling at a pace few anticipated, yet the industry conversation remains dominated by chips and power. Cooling, long treated as an engineering afterthought, is now emerging as a strategic constraint that will determine how far artificial intelligence can realistically advance.

For decades, cooling sat quietly in the background of data centre design, rarely discussed outside engineering teams. Executives focused on compute density, software capability, and energy procurement, assuming thermal management would evolve incrementally in step with hardware. That separation no longer exists. As AI workloads push silicon to unprecedented power densities, heat has shifted from a secondary efficiency consideration to a primary factor shaping performance, reliability, and long-term scalability. Cooling is no longer a background system supporting AI; it is now one of the forces defining what AI infrastructure can realistically deliver.

Outside specialist circles, the pace of this shift has been widely underestimated. Chips that dissipated a few hundred watts only a handful of years ago are now approaching power levels that fundamentally change the physics of infrastructure design. Silicon roadmaps have advanced aggressively, but assumptions about how heat can be removed have not always kept pace with that acceleration. As a result, cooling is no longer something that can be addressed after strategic decisions are made. It now sits directly on the critical path of AI deployment, influencing everything from server architecture to facility economics.

That disconnect between expectation and reality is visible across the industry, particularly at executive level. According to Shahar Belkin, Chief Evangelist at ZutaCore, many organisations still treat cooling as a solvable operational detail rather than a limiting factor that can stall entire technology roadmaps if ignored.

“It is an underestimated bottleneck unless you are already seeing it from the silicon side,” Belkin says. “When you cannot cool the next generation of chips, cooling suddenly becomes very real. At the moment it is still solvable, but it needs serious attention. You cannot just add more air conditioning and expect it to work.”

Why heat has moved into the boardroom

What has changed most fundamentally is not the existence of heat, but its strategic relevance. Cooling decisions are no longer independent of decisions about hardware selection, rack density, or facility layout. Once AI workloads enter the picture, these choices become tightly coupled, and mistakes made early are difficult or impossible to reverse later. Treating cooling as an engineering problem to be solved after the fact now introduces material business risk.

“The chips themselves have changed dramatically,” Belkin says. “Not long ago we were talking about one hundred or one hundred and fifty watts per chip. Now we are at fifteen hundred watts per chip, with extremely dense heat flux. The ability to remove that heat using air simply does not exist anymore.” That shift removes the tolerance that traditional data centre design relied upon for decades. Air cooling worked because it allowed inefficiency, redundancy, and over-engineering. AI-class silicon removes that margin almost entirely, forcing a rethinking of how heat is handled at the architectural level.
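
To put that jump in perspective, a simple back-of-envelope calculation, using assumed rather than ZutaCore-supplied figures for air properties and allowable temperature rise, shows how quickly air cooling runs out of road as per-chip power climbs from 150 W to 1,500 W.

```python
# Back-of-envelope estimate of the airflow needed to remove chip heat.
# Assumed values (not from the article): air properties near room
# temperature and a 15 K allowable air temperature rise across the heat sink.
AIR_CP = 1005.0      # specific heat of air, J/(kg*K)
AIR_RHO = 1.2        # density of air, kg/m^3
DELTA_T = 15.0       # allowable air temperature rise, K

def airflow_for_chip(power_w: float) -> float:
    """Volumetric airflow (m^3/s) required to carry away `power_w` watts."""
    mass_flow = power_w / (AIR_CP * DELTA_T)   # kg/s, from Q = m_dot * cp * dT
    return mass_flow / AIR_RHO                 # m^3/s

for power in (150.0, 1500.0):
    flow = airflow_for_chip(power)
    print(f"{power:6.0f} W chip -> {flow * 1000:6.1f} L/s "
          f"({flow * 2118.9:6.0f} CFM) of air per chip")
```

Airflow scales in direct proportion to power, so a tenfold increase in chip wattage demands roughly ten times the air per chip, and at rack scale that multiplies into volumes that fans, heat sinks, and raised floors struggle to deliver.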

As a result, cooling strategy now directly influences whether future AI hardware can even be deployed, not just how efficiently it operates once installed. “When air cannot cope, you must move to direct on-chip liquid cooling,” Belkin says. “That has to be designed into the server and into the data centre from the start. If you do not do that, you will not be able to work your way out later.” For senior leaders, this means cooling can no longer be delegated as a downstream concern. It has become a board-level decision tied to capital planning, performance risk, and long-term competitiveness.

Why liquid cooling alone will hit a ceiling

The industry’s rapid pivot from air to liquid cooling has addressed immediate constraints, but it risks repeating the same mistake at a higher power level. Liquid flow cooling, which absorbs heat into a circulating liquid, remains effective for today’s chips but is already approaching physical limits as power densities continue to rise. What looks sufficient now may quickly become inadequate as silicon roadmaps advance.

“Liquid flow cooling absorbs heat into the liquid, just like water cooling,” Belkin explains. “That works up to a certain chip power. But as power goes up, you need more flow. When you reach two thousand or even twenty-five hundred watts per chip, the amount of liquid you need becomes extremely high.” At that point, complexity returns through another door. Systems require higher pressures, larger pumps, and more intricate plumbing, introducing inefficiency and fragility rather than resilience.
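
The arithmetic behind that warning is straightforward: in single-phase cooling, every additional watt demands proportionally more coolant. The sketch below (all figures assumed for illustration, including a water-like coolant, a 10 K temperature rise, and a 72-accelerator rack) shows how per-rack flow grows as chip power rises.

```python
# Single-phase (flow) liquid cooling: required coolant flow scales
# linearly with chip power, Q = m_dot * cp * dT  =>  m_dot = Q / (cp * dT).
# Assumed values (illustrative only): water-like coolant, 10 K temperature
# rise across the cold plate, 72 accelerators per rack.
COOLANT_CP = 4186.0    # J/(kg*K), roughly water
COOLANT_RHO = 1000.0   # kg/m^3
DELTA_T = 10.0         # coolant temperature rise, K
CHIPS_PER_RACK = 72    # assumed rack configuration

def flow_lpm(power_w: float) -> float:
    """Coolant flow in litres per minute needed to absorb `power_w` watts."""
    mass_flow = power_w / (COOLANT_CP * DELTA_T)        # kg/s
    return mass_flow / COOLANT_RHO * 1000.0 * 60.0      # L/min

for chip_power in (1500.0, 2000.0, 2500.0):
    per_chip = flow_lpm(chip_power)
    print(f"{chip_power:6.0f} W/chip -> {per_chip:4.1f} L/min per chip, "
          f"{per_chip * CHIPS_PER_RACK:6.0f} L/min per rack")
```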

This is where two-phase cooling fundamentally changes the equation. “With two-phase cooling, you use the latent heat of boiling,” Belkin says. “You are not absorbing heat into the liquid. You are changing liquid into vapour. It is like boiling water in a pot. No matter how strong the heat source is, the temperature of the boiling liquid stays the same.” That property becomes decisive as power densities continue to rise, because it decouples heat removal from flow volume. “With pool boiling, every bit of power turns liquid into vapour and carries the heat away,” he adds. “It is almost an unlimited physical mechanism, and that is exactly what is needed for the next generation of chips.”
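
A minimal comparison makes the point. Using assumed, order-of-magnitude properties for an engineered dielectric coolant (not figures from ZutaCore), it contrasts how much fluid must circulate per chip when heat is absorbed as sensible heat versus as latent heat of vaporisation.

```python
# Why the latent heat of boiling changes the flow equation.
# Illustrative dielectric-fluid properties (assumed, order-of-magnitude only;
# real two-phase coolants vary): sensible heat capacity vs. latent heat.
CP_LIQUID = 1100.0     # J/(kg*K), sensible heat of the liquid coolant
H_FG = 100_000.0       # J/kg, latent heat of vaporisation
DELTA_T = 10.0         # usable temperature rise in single-phase mode, K

CHIP_POWER = 2500.0    # W per chip, the level cited in the article

# Single-phase: every kilogram of coolant absorbs only cp * dT joules.
single_phase_flow = CHIP_POWER / (CP_LIQUID * DELTA_T)   # kg/s

# Two-phase: every kilogram that boils absorbs h_fg joules, and the
# boiling liquid stays at its saturation temperature regardless of power.
two_phase_flow = CHIP_POWER / H_FG                       # kg/s

print(f"single-phase flow: {single_phase_flow * 1000:5.1f} g/s")
print(f"two-phase flow:    {two_phase_flow * 1000:5.1f} g/s "
      f"({single_phase_flow / two_phase_flow:.0f}x less coolant per watt)")
```

Under these illustrative assumptions the two-phase loop moves roughly an order of magnitude less fluid for the same heat, and its temperature is pinned by the boiling point rather than rising with load.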

Efficiency, stability, and the economics of control

As cooling becomes a strategic constraint, efficiency takes on a different meaning from traditional data centre thinking. The objective is no longer to make chips as cold as possible, but to operate them as close as possible to their optimal thermal envelope without sacrificing stability. Overcooling may appear safe, but it introduces unnecessary cost, complexity, and energy waste that quickly becomes material at scale.

“The control is not about how cold you can make the chip,” Belkin says. “You can always make something very cold, but that is extremely inefficient. What matters is running the chip at the highest temperature where it still performs reliably.” That distinction reshapes the economics of AI infrastructure, particularly when facilities move beyond pilot deployments into sustained, high-density production environments.

Precise thermal control enables a different relationship with the surrounding environment. Instead of relying on power-hungry chillers or evaporative systems, data centres can increasingly operate using ambient conditions. “If you can control temperature accurately, you can use the ambient temperature outside the facility,” Belkin explains. “You do not need high-power chillers or evaporative cooling towers that consume large amounts of water. You can rely on dry coolers, which are essentially large radiators.”

Two-phase systems make that approach viable by regulating vapour pressure, which in turn defines the boiling temperature and therefore the operating temperature of the chip itself. That stability allows infrastructure to run closer to thermal limits without compromising reliability or performance. “The difference in cost is dramatic,” Belkin says. “Instead of spending one dollar on cooling for every dollar of computing, you can reduce that to six or seven cents. In a one hundred megawatt data centre, that completely changes the economics.”
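
Taking Belkin's figures at face value, and reading the dollar-per-dollar ratio as a cooling-power overhead, the gap at facility scale looks like this (the electricity price is an assumption for illustration only).

```python
# What the 1:1 versus roughly six-to-seven-cent cooling ratio means at scale.
# The ratios and the 100 MW figure come from the article; the electricity
# price is an assumption for illustration.
IT_LOAD_MW = 100.0          # IT (compute) load
PRICE_PER_KWH = 0.08        # assumed electricity price, USD/kWh
HOURS_PER_YEAR = 8760

def annual_cooling_cost(cooling_ratio: float) -> float:
    """Yearly cooling energy cost for a given cooling-to-compute power ratio."""
    cooling_mw = IT_LOAD_MW * cooling_ratio
    return cooling_mw * 1000 * HOURS_PER_YEAR * PRICE_PER_KWH

legacy = annual_cooling_cost(1.00)      # one dollar of cooling per dollar of compute
two_phase = annual_cooling_cost(0.065)  # roughly six to seven cents

print(f"legacy cooling:    ${legacy / 1e6:6.1f}M per year")
print(f"two-phase cooling: ${two_phase / 1e6:6.1f}M per year")
print(f"difference:        ${(legacy - two_phase) / 1e6:6.1f}M per year")
```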

From waste heat to usable value

Two-phase cooling also forces a reappraisal of heat itself. Traditionally, heat has been treated as an unavoidable by-product to be expelled as quickly as possible, often at significant cost. In a two-phase system, heat is no longer just waste. It becomes a recoverable, transportable resource that can be reused elsewhere.

“When you convert heat into vapour, you suddenly have something valuable,” Belkin says. “This is not warm water. It is high-grade vapour that keeps its temperature over distance.” That distinction matters because vapour can be moved without rapid thermal loss, opening up opportunities that are impractical with conventional liquid cooling.

“You can generate electricity from it, heat other parts of the building, or supply neighbouring facilities,” he says. “You can also use it for absorption cooling, where the heat itself helps cool other parts of the data centre.” In each case, energy that would otherwise be discarded is redeployed to reduce overall consumption and operating costs.

The financial implications are immediate rather than theoretical. “When you reclaim heat, you are saving real money at the bottom line,” Belkin says. “You are reusing energy you already paid for instead of throwing it away.” Sustainability benefits follow directly from that efficiency rather than being layered on as an afterthought. “Because the system is stable, you can rely on dry coolers in most locations,” he adds. “You eliminate chillers, evaporative systems, and water use entirely. That fundamentally changes the environmental footprint of data centres.”
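
The water claim can be sanity-checked from physics alone. Evaporative cooling rejects heat by vaporising water, so the minimum consumption follows directly from water's latent heat; the sketch below (illustrative only, ignoring blowdown and drift losses) applies that to a 100 MW heat load.

```python
# Rough lower bound on the water an evaporative system would consume to
# reject a 100 MW heat load, derived only from water's latent heat of
# vaporisation (~2.45 MJ/kg). Ignores blowdown, drift and part-load effects,
# so real consumption would be higher. Illustrative, not from the article.
HEAT_LOAD_W = 100e6            # W of heat to reject (100 MW facility)
H_FG_WATER = 2.45e6            # J/kg, latent heat of vaporisation of water
SECONDS_PER_YEAR = 8760 * 3600

evaporation_kg_per_s = HEAT_LOAD_W / H_FG_WATER            # kg/s evaporated
litres_per_year = evaporation_kg_per_s * SECONDS_PER_YEAR  # 1 kg of water ~ 1 litre

print(f"evaporation rate: {evaporation_kg_per_s:.1f} L/s")
print(f"annual minimum:   {litres_per_year / 1e6:.0f} million litres")
```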

Deployment without disruption

One of the persistent concerns surrounding advanced cooling technologies is the fear of disruption. Operators worry that adopting new systems will require wholesale reconstruction of existing facilities, forcing downtime or stranded assets. In practice, two-phase cooling can be introduced incrementally, reducing both technical and financial risk.

“You do not need to change your entire environment,” Belkin says. “You can deploy rack by rack. You can start by condensing vapour using air inside the data centre, and only later connect to facility water if needed.” This staged approach allows operators to introduce high-density AI workloads without committing to immediate, large-scale infrastructure changes.

That flexibility often comes as a surprise. “People assume it will be very complex because it involves phase change,” Belkin says. “When they see it installed, they realise it is actually very simple. Two pipes, a manifold, and the system just works.” The simplicity is not accidental. It reflects the fact that the complexity has been absorbed into the physics of the system rather than into layers of mechanical and electrical infrastructure.

Operational benefits extend beyond energy efficiency. “One thing people notice immediately is the reduction in noise,” Belkin says. “Another is that chips stop throttling. You can run at boost mode continuously without overheating.” For applications such as AI training or high-frequency workloads, that stability translates directly into predictable performance and better utilisation of expensive hardware.

From hyperscale to the edge and beyond

As AI workloads spread beyond hyperscale campuses into modular and edge environments, cooling constraints become even more pronounced. Smaller sites often lack the space, power overhead, or environmental control required for traditional cooling approaches. Two-phase systems, by contrast, scale down as effectively as they scale up.

“Two-phase cooling works extremely well at the edge,” Belkin says. “You can cool both the chips and the surrounding space without separate air conditioning systems. It fits constrained locations far better than traditional approaches.” That capability becomes increasingly important as AI inference and real-time processing move closer to where data is generated.

Looking further ahead, Belkin expects cooling to evolve from a hardware procurement decision into a service model aligned with outcomes rather than equipment. The pace of change in silicon is already forcing infrastructure planners to rethink long-term ownership models. “Customers do not care about cooling equipment,” he says. “They care about their chips working reliably and efficiently.”

That shift mirrors changes seen elsewhere in enterprise infrastructure. “If someone can guarantee that your chips will keep working, reduce your power consumption, and even help reuse heat, that becomes a service,” Belkin says. “Cooling will move in the same direction as software.” In a world where chip power densities continue to rise, that evolution may be less about convenience than necessity.

The stakes are existential for the next phase of AI. “If cooling does not move at the speed of silicon, the industry will stall,” Belkin says. “You cannot keep increasing power density if you cannot remove the heat.” In that sense, AI’s future will not be limited by algorithms or ambition alone, but by whether the industry finally treats heat as the strategic constraint it has become.
