The future of AI depends as much on what connects data centres as on what runs inside them. Cloud evolution cannot outpace network design, and ignoring this truth risks bottlenecks, latency and failure across mission-critical workloads. This article is based on insights from Nokia’s white paper Network the Cloud: The Critical Role of the Network in Cloud Evolution, which explores why networking must be treated as a first-class component of AI infrastructure.
The conversation around AI infrastructure has become deeply one-sided. Compute dominates headlines. Power and cooling attract concern. Sovereignty commands attention. But the network, the critical substrate upon which every AI workload ultimately depends, remains largely invisible. It is rarely prioritised, poorly understood, and dangerously under-engineered relative to the demands now being placed upon it.
That might have been sustainable when the cloud was centralised. But it is no longer a world of hyperscale hubs processing static workloads. Cloud has become distributed, dynamic, and application-specific. Enterprise workloads have been repatriated or pushed to the edge. AI has introduced models that require real-time inferencing in highly latency-sensitive environments. National governments have begun mandating local data processing for reasons of security and competitiveness. The entire architecture is fracturing into a high-performance mesh of central, regional, metro, edge and on-premises data centres, operating under diverse ownership models and variable demand.
That evolution cannot happen without rethinking the network. It is no longer the silent enabler of cloud; it is its most critical constraint. The cloud continuum demands end-to-end consistency, regardless of where processing takes place. Whether training large models in an AI factory or serving personalised responses at the edge, latency, bandwidth, security, and orchestration must be guaranteed. That consistency cannot be delivered by architectures built for static client-server models. It requires an intelligent, integrated networking infrastructure capable of scaling horizontally, adapting instantly, and making routing decisions that align with business intent.
Latency is not a technical detail; it is an economic one
The value of AI increasingly lies not in how fast it can learn, but in how quickly it can respond. Inferencing has shifted from a back-office process to a front-line capability. In manufacturing, AI must flag defects in real time. In financial services, risk must be evaluated within milliseconds. In healthcare, diagnostic recommendations must be made without delay. In transportation, decisions must be made faster than any human can react.
In each of these scenarios, latency is not just a matter of experience; it is a matter of cost, safety, trust, and regulatory compliance. Traditional data centre architectures, which rely on backhauling traffic to large, centralised compute facilities, are simply not equipped to deliver the sub-10-millisecond response times these applications demand. Edge inferencing is now a requirement, not a luxury.
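To see why, consider propagation delay alone. Below is a minimal sketch, assuming light in fibre covers roughly 200 km per millisecond (a standard rule of thumb); every real deployment adds queuing, serialisation and processing time on top of this floor:

```python
# Back-of-the-envelope latency budget: propagation delay alone, ignoring
# queuing, serialization and processing, which only add to the total.
# Assumes light in fibre travels at roughly 200,000 km/s (~5 us per km).

FIBRE_SPEED_KM_PER_MS = 200.0  # ~200,000 km/s expressed as km per millisecond

def round_trip_ms(distance_km: float) -> float:
    """Minimum round-trip propagation delay over fibre, in milliseconds."""
    return 2 * distance_km / FIBRE_SPEED_KM_PER_MS

for km in (50, 200, 500, 1000):
    print(f"{km:>5} km away: >= {round_trip_ms(km):.1f} ms round trip")

# 1000 km of backhaul costs >= 10 ms before a single packet is processed,
# which exhausts a sub-10 ms budget on propagation alone.
```

A centralised facility 1,000 km away therefore fails the budget before any compute even begins, which is precisely why inference must move to the edge.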
This introduces profound changes in network design. Not only must AI workloads be deployed closer to the user or device, but the interconnects between these decentralised locations must be fast, intelligent, and lossless. The days of over-provisioning bandwidth as a catch-all solution are over. What is needed now is architectural precision.
The dominant traffic pattern has shifted from north-south (user to cloud) to east-west (service to service). Inside the data centre, inference and training workloads generate massive amounts of lateral traffic between GPUs and memory systems. Between data centres, split inference and collaborative compute mean that workloads must be distributed and recombined in real time. The result is a network environment under continuous and growing strain, one that traditional technologies are ill-suited to handle.
The new rules of data centre performance
Modern AI data centres, often called AI factories, bear little resemblance to their predecessors. Where previous generations focused on storage and transactional compute, AI facilities are built around extremely dense GPU clusters, designed to operate as tightly coupled supercomputers. The performance of these systems is determined not just by processing speed, but by how efficiently data moves between nodes.
Every microsecond matters. GPUs must exchange parameters, synchronise memory states, and share gradient data at extreme speeds. Any congestion, delay or packet loss has a direct impact on job completion time. Worse, it can require restarting training runs, wasting energy, compute cycles and potentially days of progress.
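A rough cost model makes the stakes concrete. The sketch below uses the standard ring all-reduce bound, in which each of N workers moves roughly 2(N−1)/N times the gradient volume per step; the gradient sizes and link speeds are illustrative assumptions, not figures from the white paper:

```python
# Rough cost model for synchronising gradients with ring all-reduce:
# each of N workers moves about 2*(N-1)/N times the gradient volume,
# so step time is bounded below by that volume divided by link bandwidth.
# Illustrative numbers only; real fabrics add latency, congestion and jitter.

def allreduce_seconds(gradient_gb: float, n_gpus: int, link_gbps: float) -> float:
    """Lower bound on ring all-reduce time per step, in seconds."""
    volume_gb = 2 * (n_gpus - 1) / n_gpus * gradient_gb
    return volume_gb * 8 / link_gbps  # GB -> gigabits, divided by Gb/s

# Example: 10 GB of gradients exchanged across 512 GPUs.
for gbps in (100, 400, 800):
    t = allreduce_seconds(10, 512, gbps)
    print(f"{gbps} Gb/s links: ~{t:.2f} s of communication per step")
```

Because this cost is paid on every training step, doubling link bandwidth can shorten a communication-bound job almost proportionally, which is why fabric design now rivals GPU selection in importance.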
Within these environments, new networking models are emerging. Rail-optimised designs reduce hops between devices to the absolute minimum. Load balancing systems use per-packet distribution to ensure even utilisation. High-performance fabrics incorporate congestion management at a hardware level. Legacy CLI-based configuration is giving way to model-driven orchestration built on YANG data models and control-plane technologies such as EVPN over VXLAN.
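The difference between per-flow and per-packet distribution is easy to illustrate. The following toy simulation uses synthetic flow sizes, with a random link choice standing in for a 5-tuple hash; it shows how a few elephant flows skew per-flow hashing while per-packet spraying stays even (real fabrics must also handle the packet reordering that spraying introduces):

```python
# Toy comparison of per-flow ECMP hashing vs per-packet spraying.
# Synthetic traffic: mostly mice (size 1) with a few elephants (size 50).
import random

LINKS = 4
random.seed(7)  # deterministic output for the example
flows = [random.choice([1, 1, 1, 50]) for _ in range(40)]

# Per-flow ECMP: each flow is pinned to one link by a hash of its headers.
per_flow = [0.0] * LINKS
for size in flows:
    per_flow[random.randrange(LINKS)] += size  # random stand-in for the hash

# Per-packet spraying: every flow's bytes are spread evenly across links.
per_packet = [sum(flows) / LINKS] * LINKS

print("per-flow link loads:  ", per_flow)
print("per-packet link loads:", per_packet)
```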
But even these innovations are insufficient unless paired with broader changes across the data centre interconnect. Training a model in one location and serving it in another requires fast, secure and deterministic paths across the WAN. That means integrating optical transport and IP routing into a coherent fabric, capable of scaling terabit flows while maintaining performance guarantees. Segment routing, software-defined overlays, and real-time telemetry are not optional add-ons. They are prerequisites.
From centralisation to continuum
AI is not only changing where workloads run, but how and why they move. The classical model of training in a central cloud and deploying to a fixed location is being replaced by a fluid, usage-aware approach to placement. Enterprises increasingly use split inference, where lightweight tasks are processed on edge devices and more complex reasoning is routed to metro or core nodes. Workloads migrate dynamically based on compute availability, power pricing, data privacy constraints, and real-time performance feedback.
This flexibility is only viable if the network can act as a real-time arbiter, evaluating where a workload should run, moving data accordingly, and guaranteeing that the shift does not disrupt performance or compliance. It is a radically more complex model than simply routing traffic to a cloud endpoint.
Consider the implications of cloud arbitrage at scale. If enterprises begin routing AI workloads to the lowest-cost node at any given moment, based not only on infrastructure pricing but on power availability, latency and environmental conditions, the network becomes the marketplace. And marketplaces cannot function without transparency, trust and speed.
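What that arbitration might look like is easy to sketch. The example below is hypothetical, with invented site names, prices and thresholds; it simply filters candidate sites by a workload's hard constraints and then optimises cost among the survivors:

```python
# Hypothetical sketch of network-as-arbiter placement logic: score
# candidate sites against a workload's hard constraints and pick the
# cheapest site that satisfies them. All fields and values are illustrative.

from dataclasses import dataclass

@dataclass
class Site:
    name: str
    region: str
    latency_ms: float   # measured path latency to the data source
    power_price: float  # price per kWh at this moment
    free_gpus: int

@dataclass
class Workload:
    max_latency_ms: float
    gpus_needed: int
    allowed_regions: set[str]  # sovereignty constraint

def place(workload: Workload, sites: list[Site]) -> Site | None:
    """Return the lowest-power-cost site that meets every hard constraint."""
    feasible = [
        s for s in sites
        if s.region in workload.allowed_regions
        and s.latency_ms <= workload.max_latency_ms
        and s.free_gpus >= workload.gpus_needed
    ]
    return min(feasible, key=lambda s: s.power_price, default=None)

sites = [
    Site("edge-ams", "EU", 3.0, 0.31, 8),
    Site("metro-fra", "EU", 7.0, 0.24, 64),
    Site("core-us", "US", 45.0, 0.11, 512),
]
job = Workload(max_latency_ms=10, gpus_needed=16, allowed_regions={"EU"})
print(place(job, sites))  # -> metro-fra: cheapest EU site inside the budget
```

Note the design choice: sovereignty and latency are hard filters, never traded away, while price is optimised only among compliant options. That ordering is what makes the marketplace trustworthy.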
Edge nodes may become preferred for reasons of sovereignty or energy efficiency. But unless networks can deliver high-speed interconnects between these nodes, the advantage is lost. The result is not only a performance hit, but also an economic one. Idle GPUs waste money. Delayed inference reduces operational effectiveness. Inconsistent latency damages user trust.
Security and sovereignty cannot be bolted on
As AI applications extend into sectors such as defence, healthcare, finance, and public infrastructure, the network must also carry the burden of security and policy enforcement. These are not secondary concerns. In many cases, the legal and reputational risk of data leakage, model corruption or operational sabotage is greater than the cost of performance degradation.
AI workloads are inherently data intensive. They often involve the ingestion, processing and transformation of sensitive, proprietary, or regulated data. Whether that data is medical imagery, financial records, or industrial telemetry, its movement across the network must be tightly controlled.
That means enforcing data localisation rules, encrypting data in transit, ensuring path determinism, and monitoring for anomalies in traffic flow that could indicate compromise. Traditional perimeter-based security models are inadequate. Security must be embedded at every layer of the stack: application, transport, and infrastructure.
Sovereignty adds an additional layer of complexity. Enterprises and governments increasingly demand that AI models and data stay within national boundaries. This cannot be achieved through cloud selection alone. It requires network-level enforcement, including routing constraints, traffic inspection, and policy-aware orchestration. Again, the network becomes the gatekeeper, not just for performance, but for compliance.
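A minimal sketch of that kind of enforcement follows, with hypothetical node names and a static country map standing in for what a real controller would derive from its topology database and segment routing policy:

```python
# Illustrative network-level sovereignty check: before traffic is admitted,
# verify that every hop on the computed path stays inside permitted
# jurisdictions. Node names and the country map are hypothetical.

NODE_COUNTRY = {
    "r1.ber": "DE", "r2.muc": "DE", "r3.ams": "NL", "r4.nyc": "US",
}

def path_is_compliant(path: list[str], allowed: set[str]) -> bool:
    """True only if every hop on the path is in an allowed country."""
    return all(NODE_COUNTRY.get(node) in allowed for node in path)

print(path_is_compliant(["r1.ber", "r2.muc"], {"DE"}))            # True
print(path_is_compliant(["r1.ber", "r3.ams", "r4.nyc"], {"DE"}))  # False
```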
Automation is not optional; it is foundational
None of this can be delivered through manual operations. The scale, complexity and speed of AI-centric infrastructure demand networks that can configure, heal, adapt and optimise themselves.
This is not simply about adopting SDN or deploying a controller. It is about re-architecting networks to be programmable, observable and autonomous. Operators must have complete visibility into flow-level telemetry. Network elements must expose APIs for control and feedback. Policies must be declarative, not imperative. And remediation must be closed loop, driven by intent rather than scripts.
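The shape of such a closed loop can be sketched in a few lines. Everything below is illustrative, with stubbed telemetry and remediation; a production system would consume streaming telemetry (for example over gNMI) and program devices through controller APIs:

```python
# Hedged sketch of intent-driven, closed-loop remediation: the operator
# declares desired state ("this path sees under 5 ms") and a loop
# continuously compares observed telemetry against that intent,
# remediating on drift. Telemetry and reroute functions are stubs.

from dataclasses import dataclass

@dataclass
class Intent:
    path: str
    max_latency_ms: float

def observed_latency_ms(path: str) -> float:
    return 7.2  # stub: would come from flow-level streaming telemetry

def remediate(intent: Intent) -> None:
    print(f"rerouting {intent.path}: intent is < {intent.max_latency_ms} ms")

def reconcile(intents: list[Intent]) -> None:
    """One pass of the closed loop: compare observed state to intent."""
    for intent in intents:
        if observed_latency_ms(intent.path) > intent.max_latency_ms:
            remediate(intent)

reconcile([Intent("edge-ams -> metro-fra", 5.0)])
```

The contrast with scripted operations is the point: the operator states what must be true, and the loop decides how to make it so.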
More importantly, the network itself must become intelligent. AI workloads must be matched by AI infrastructure: networks that understand what the application is doing, what it requires, and how best to serve it. This is not speculative. It is already happening.
Forward-thinking operators are deploying systems that predict congestion before it occurs, reroute traffic dynamically based on model sensitivity, and prioritise inference over background traffic. Others are embedding AI into the network fabric itself, using ML models to optimise buffer sizes, schedule traffic bursts, and detect anomalous behaviour in real time. This is the future of the cloud, and it is one in which the network is not just an enabler but a competitive advantage.
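As a simplified illustration of the predictive principle, the sketch below smooths link utilisation with an exponentially weighted moving average and projects the trend forward to raise a warning before saturation; production systems use far richer models than this:

```python
# Minimal sketch of predictive congestion avoidance: smooth link
# utilisation with an EWMA and project the recent trend a few intervals
# ahead, flagging the link before queues actually build. Thresholds and
# samples are illustrative only.

def ewma(samples: list[float], alpha: float = 0.3) -> float:
    value = samples[0]
    for s in samples[1:]:
        value = alpha * s + (1 - alpha) * value
    return value

def congestion_warning(utilisation_history: list[float],
                       threshold: float = 0.8) -> bool:
    """Warn when the trend, projected three intervals ahead, crosses the bar."""
    smoothed = ewma(utilisation_history)
    slope = utilisation_history[-1] - utilisation_history[-2]
    return smoothed + 3 * slope > threshold

rising = [0.45, 0.52, 0.60, 0.68, 0.77]
print(congestion_warning(rising))  # True: reroute before the link saturates
```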
The network is the new frontier of AI readiness
Enterprises investing in AI cannot afford to treat the network as an afterthought. It is no longer acceptable to invest in training models, building GPU clusters, and deploying intelligent services without addressing the question of how those systems connect.
AI infrastructure is not complete without an AI-ready network, one that spans data centre interiors, interconnects facilities, routes across wide-area links, and reaches users and devices with precision and reliability. This is not an incremental upgrade. It is a foundational shift. And it must be treated with the same urgency and seriousness as the rest of the AI value chain.
Failure to do so will not just limit performance. It will create systemic failure points in systems that enterprises are increasingly relying on to run their operations, serve their customers, and drive competitive differentiation. The cloud is evolving fast. The AI wave is reshaping everything it touches. The only question that matters now is whether the network is ready. Because without it, none of it works.
You can download Nokia’s white paper Network the Cloud: The Critical Role of the Network in Cloud Evolution at https://www.nokia.com/asset/214961/