When the network becomes the bottleneck of AI ambition


AI investment is accelerating at extraordinary speed, yet the one layer assumed to “just scale” may prove the most fragile. As enterprises rush to deploy AI factories, hybrid estates and sovereign clouds, the network is quietly becoming the defining constraint on AI economics.

Artificial intelligence has triggered one of the most aggressive capital cycles in modern technology history. Enterprises are investing in accelerators, upgrading data centre capacity, redesigning power distribution and deploying liquid cooling at unprecedented scale. Yet in many executive conversations, the network is still framed as connective tissue rather than as critical infrastructure. That framing is becoming increasingly untenable as AI workloads expose structural limitations in architectures that were never designed for distributed, latency-sensitive intelligence at scale.

Roland Mestric, Strategic Marketing, IP and Data Center Networks at Nokia, believes the industry is approaching a decisive inflection point. “There is a tendency to assume that if you buy enough GPUs and build enough data centre capacity, performance will follow automatically,” he says. “But AI is fundamentally a distributed workload. It depends on constant, high-volume, low-latency communication between systems. If the network cannot sustain that, the entire economic model starts to break down.” In other words, AI ambition is no longer constrained primarily by silicon availability. It is constrained by the ability of networks to synchronise intelligence across increasingly complex estates.

The hidden economics of idle GPUs

For many enterprises, the first sign of trouble is not catastrophic failure but inefficiency. Training clusters appear provisioned correctly, capacity looks sufficient on paper, and yet performance does not meet expectations. Mestric argues that this is where the true cost of network underperformance becomes visible. “A GPU that is waiting for data is not just idle silicon,” he explains. “It is stranded capital. When you look at the investment per accelerator and multiply that across thousands of units, network inefficiency becomes one of the most expensive failure modes in AI infrastructure.”
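The scale of that failure mode is easy to illustrate with back-of-the-envelope arithmetic. The figures below are illustrative assumptions, not Nokia's numbers: a plausible per-accelerator price, cluster size and network-stall fraction.

```python
# Hypothetical figures for illustration only: per-GPU cost, cluster size,
# and the fraction of runtime GPUs spend stalled waiting on the network.
GPU_COST_USD = 30_000          # assumed purchase price per accelerator
CLUSTER_SIZE = 4_096           # assumed number of GPUs in the cluster
NETWORK_STALL_FRACTION = 0.15  # assumed share of runtime spent idle on I/O

capital_deployed = GPU_COST_USD * CLUSTER_SIZE
stranded_capital = capital_deployed * NETWORK_STALL_FRACTION

print(f"Capital deployed:  ${capital_deployed:,.0f}")
print(f"Effectively idle:  ${stranded_capital:,.0f}")
```

Even at these conservative assumptions, a 15 per cent network stall leaves tens of millions of dollars of accelerator capital producing nothing.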

The conversation around AI infrastructure often focuses on raw compute density, but east-west traffic patterns within training environments are now dominant. Data shuffles constantly between nodes, synchronisation traffic grows steeply with model and cluster size, and microseconds of delay accumulate into meaningful increases in job completion time. “We are moving from thinking about bandwidth in isolation to thinking about workload efficiency,” Mestric says. “If the network introduces even small delays across a distributed training job, that compounds across the cluster. It directly impacts time to insight and cost per model.”
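The compounding effect Mestric describes can be sketched with a toy model: a fixed per-step network overhead, repeated across every optimiser step of a run. All figures are illustrative assumptions.

```python
# A toy model of how small per-step network delays compound into
# job completion time. All figures are illustrative assumptions.
STEPS = 500_000            # assumed optimiser steps in a training run
COMPUTE_MS_PER_STEP = 80   # assumed pure-compute time per step
SYNC_MS_PER_STEP = 8       # assumed network synchronisation overhead per step

ideal_hours = STEPS * COMPUTE_MS_PER_STEP / 3_600_000
actual_hours = STEPS * (COMPUTE_MS_PER_STEP + SYNC_MS_PER_STEP) / 3_600_000

print(f"Ideal job completion time:  {ideal_hours:,.1f} h")
print(f"With network overhead:      {actual_hours:,.1f} h "
      f"(+{(actual_hours / ideal_hours - 1) * 100:.0f}%)")
```

An eight-millisecond overhead on an eighty-millisecond step sounds negligible, yet it stretches the whole job by ten per cent, and with it the amortisation of every GPU in the cluster.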

That shift reframes traditional performance metrics. Throughput and uptime remain important, but they no longer capture the business reality of AI deployments. Tokens per second, job completion time and deterministic latency are becoming economic indicators, not just technical ones. When model training overruns because of congestion or packet loss, the cost is borne in delayed innovation cycles and extended infrastructure amortisation. The network is therefore no longer peripheral to AI economics. It is embedded within them.

East-west congestion and the hybrid dilemma

The structural problem intensifies as enterprises adopt hybrid and sovereign AI architectures. AI estates are rarely confined to a single data centre. They span public cloud, private infrastructure and edge environments, often across national boundaries to satisfy regulatory or sovereignty requirements. “Many enterprise networks were built for north-south traffic and predictable application flows,” Mestric notes. “AI flips that dynamic. East-west traffic dominates, and workloads move between domains. Without automation and dynamic optimisation, the network becomes a bottleneck almost immediately.”

Inferencing at the edge makes these limitations more visible. Industrial facilities, logistics hubs and retail estates are deploying AI models closer to data sources to reduce latency and preserve privacy. Yet those edge nodes must synchronise with centralised systems for orchestration, retraining and analytics. The result is a topology that behaves less like a traditional enterprise network and more like a distributed control system. “Distributed inferencing turns the network into a control plane,” Mestric explains. “You are not just transporting data. You are coordinating intelligence across thousands of locations. That requires deterministic performance and deep visibility.”

Legacy WAN architectures struggle under this pattern. Static provisioning and manual configuration cannot adapt to traffic flows that fluctuate according to training cycles, inference demand and cross-domain synchronisation. Mestric is blunt about the operational implications. “Manual operations collapse under AI scale,” he says. “You need telemetry, predictive analytics and closed-loop control to adapt in real time. Otherwise, the network lags behind the workload.” Automation therefore shifts from being an operational efficiency tool to being a prerequisite for AI viability.
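The closed-loop control Mestric refers to can be reduced to a simple pattern: observe telemetry, compare against a target, adjust, repeat. The sketch below is a deliberately minimal proportional controller; real deployments use streaming telemetry and a network controller, and every name and figure here is an invented illustration.

```python
# A minimal sketch of closed-loop network control: poll a telemetry sample,
# compare it against a utilisation target, and nudge a path weight.
# Names and figures are invented for illustration; production systems use
# streaming telemetry feeding a centralised or distributed controller.

def closed_loop_step(observed_utilisation: float, target: float,
                     path_weight: float, gain: float = 0.5) -> float:
    """Proportional control: shift traffic away from a link running hot."""
    error = observed_utilisation - target
    return max(0.0, path_weight - gain * error)

# Simulated telemetry samples: link utilisation drifting above a 0.7 target.
weight = 1.0
for utilisation in [0.65, 0.72, 0.85, 0.90]:
    weight = closed_loop_step(utilisation, target=0.7, path_weight=weight)
    print(f"utilisation={utilisation:.2f} -> new path weight {weight:.3f}")
```

The point of the sketch is the loop itself: no human is in the path, so the network can react on the timescale of the workload rather than the timescale of a change ticket.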

Optical layers and inter-data centre scale

As models grow larger and compliance requirements drive geographic distribution, inter-data centre connectivity becomes another constraint. Training increasingly spans multiple facilities for resilience, energy optimisation or regulatory separation. Massive data replication across those sites places unprecedented demands on optical transport networks. “AI is reshaping optical economics,” Mestric observes. “High-capacity, low-loss fabrics are becoming foundational. If the optical layer is not designed for AI scale, the IP layer cannot compensate.”

The physics of transmission cannot be abstracted away indefinitely. Speed-of-light latency, fibre dispersion and capacity ceilings introduce hard limits that must be engineered around rather than assumed away. In this context, automation extends beyond IP routing to the optical layer itself.

Dynamic wavelength provisioning, real-time path optimisation and deep cross-layer visibility become essential capabilities rather than advanced features. The ability to align IP and optical domains is no longer about operational elegance. It is about sustaining AI throughput at scale.
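The hard physical floor is simple to quantify. Light in silica fibre travels at roughly c divided by the refractive index, about 204,000 km/s, so distance alone sets a latency budget before any equipment is considered. The 800 km example below is illustrative.

```python
# Propagation delay over fibre: light travels at roughly c / 1.47 in glass,
# about 204,000 km/s, i.e. ~4.9 microseconds of one-way latency per kilometre.
SPEED_OF_LIGHT_KM_S = 299_792
FIBRE_REFRACTIVE_INDEX = 1.47  # typical value for silica fibre

def fibre_rtt_ms(distance_km: float) -> float:
    """Round-trip propagation delay in milliseconds, ignoring equipment hops."""
    v = SPEED_OF_LIGHT_KM_S / FIBRE_REFRACTIVE_INDEX
    return 2 * distance_km / v * 1_000

# Two data centres 800 km apart (illustrative distance):
print(f"RTT over 800 km of fibre: {fibre_rtt_ms(800):.1f} ms")
```

Nearly eight milliseconds of round-trip delay exists before a single router or transponder is added, which is why inter-site training topologies must be engineered around propagation rather than in spite of it.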

Mestric also challenges the assumption that efficiency gains will outpace demand. Hardware and software optimisation have delivered remarkable improvements, yet those gains are consumed almost instantly by more ambitious models and broader deployment. “Every time we deliver more efficiency, it is absorbed by increased usage,” he reflects. “The belief that efficiency alone will solve scaling challenges may prove optimistic. Demand is compounding faster than incremental improvements.” That dynamic reinforces the need for architectural redesign rather than incremental patching.

From infrastructure spend to strategic leverage

The most significant shift, however, may be conceptual. Networks have historically been treated as cost centres, necessary but rarely strategic. AI challenges that classification by tying network performance directly to revenue generation and competitive differentiation. “Network performance now dictates service viability,” Mestric says. “It is no longer back-office infrastructure. It is front-line business capability.” In sectors such as finance, manufacturing and telecommunications, inferencing latency can directly influence customer experience and operational outcomes.

This reframing elevates network governance to the executive level. Infrastructure decisions become intertwined with strategic AI investment. “When AI becomes core to your competitive advantage, the network becomes a board-level topic,” Mestric argues. “It determines how fast you can innovate and how reliably you can scale.” Enterprises that fail to internalise this reality risk undermining their own AI investments through architectural complacency.

The European dimension adds further complexity. Sovereignty requirements and regional AI initiatives demand distributed architectures underpinned by robust interconnectivity. “Sovereign AI initiatives depend on high-performance national and cross-border networks,” Mestric explains. “If the infrastructure cannot support that, strategic autonomy is compromised.” The network therefore intersects not only with enterprise strategy but with industrial policy and digital sovereignty.

AI has often been portrayed as a story of models, accelerators and algorithmic breakthrough. Yet as deployment matures, the invisible layers of infrastructure are emerging as decisive determinants of success. The next phase of AI will not be defined solely by larger models or denser racks. It will be defined by whether networks evolve fast enough to keep pace with ambition. As Mestric concludes, “We are at a moment where the network must evolve as fast as the models. If it does not, it becomes the bottleneck of AI ambition.”
