AI and the data centre revolution


AI is transforming data centre infrastructure, pushing organisations to rethink scalability, cost management, and energy efficiency. Mark Venables explores how hyperconverged infrastructure, hybrid multi-cloud strategies, and automation are reshaping the way enterprises deploy and manage AI workloads.

AI is transforming business operations at an unprecedented pace, placing new demands on IT infrastructure. Traditional data centre models, designed for predictable enterprise workloads, are struggling to keep up with the scale and complexity of AI-driven computing. Scalability, cost management, and energy efficiency are at the forefront of every IT leader’s concerns. The rise of hyperconverged infrastructure (HCI) and hybrid multi-cloud strategies offers a path forward, simplifying operations while optimising resources for AI workloads.

James Sturrock, Director of Systems Engineering at Nutanix, has seen the challenges first-hand. “Data centres are under increasing pressure to accommodate high-performance workloads without excessive power consumption,” he says. “The shift towards AI and other high-compute applications means companies need to rethink their infrastructure strategies. Hyperconverged infrastructure helps address this by integrating compute and storage, removing unnecessary hardware layers, and reducing overall complexity. This leads to better resource utilisation, which in turn translates to lower operational costs and energy savings.”

Hyperconverged infrastructure is emerging as a critical enabler for AI, bringing together storage, compute, and networking into a single software-defined system. By consolidating these traditionally separate components, organisations can reduce their physical footprint, lower energy consumption, and create an infrastructure that is easier to scale. The flexibility of HCI also allows enterprises to deploy resources dynamically, ensuring that AI workloads can be executed efficiently without unnecessary over-provisioning.

Bridging on-premises and cloud for scalability

Hybrid multi-cloud strategies are gaining traction as organisations seek to balance control and flexibility. AI workloads are unpredictable, often requiring bursts of high compute power. Running these workloads entirely on-premises can be costly and inefficient, but relying solely on the cloud presents its own challenges, from vendor lock-in to cost unpredictability.

“Our platform seamlessly integrates with AWS, Azure, and now Google Cloud, offering organisations the flexibility to distribute workloads intelligently,” Sturrock adds. “This enables dynamic workload placement—pushing bursty workloads to the cloud and then pulling the data back when computation is complete. By not running data centres at full capacity continuously, organisations can optimise energy usage, improve efficiency, and achieve significant cost savings.”
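The placement logic Sturrock describes can be sketched in a few lines. This is an illustrative model only, not Nutanix's implementation: the `Workload` class, the capacity threshold, and the placement labels are all assumptions made for the example.

```python
# Illustrative sketch of burst-to-cloud workload placement. The Workload
# class, the spare-capacity threshold, and the labels are hypothetical,
# not part of any Nutanix or cloud-provider API.
from dataclasses import dataclass

@dataclass
class Workload:
    name: str
    gpu_hours: float  # expected GPU time the job needs

ON_PREM_GPU_HOURS_FREE = 100.0  # assumed spare on-premises capacity

def place(workload: Workload, on_prem_free: float = ON_PREM_GPU_HOURS_FREE) -> str:
    """Keep steady workloads on-premises; burst heavy jobs to the cloud."""
    if workload.gpu_hours <= on_prem_free:
        return "on-prem"
    # Demand exceeds local capacity: burst to the cloud, then pull the
    # data back once computation completes.
    return "cloud-burst"

print(place(Workload("nightly-etl", gpu_hours=20)))       # on-prem
print(place(Workload("model-training", gpu_hours=500)))   # cloud-burst
```

The point of the sketch is the shape of the decision: the on-premises estate is sized for the steady state, and only the bursty excess pays cloud rates.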

One of the biggest concerns for enterprises adopting AI is the cost of cloud infrastructure. Running AI models at scale demands vast computing resources, particularly GPU acceleration. In a native cloud environment, businesses often find themselves constrained by rigid instance sizing, leading to inefficiencies. Nutanix aims to address this by enabling a more flexible resource allocation model across hybrid environments, ensuring that AI workloads can be managed dynamically without incurring unnecessary costs.

AI is changing infrastructure demands

AI is not just another workload. It operates differently, with intensive computational needs, large-scale data processing, and a reliance on containerisation. Traditional enterprise IT environments, built around virtual machines, are not optimised for these new demands. “Many organisations know they need to adopt AI but are still figuring out their exact requirements,” Sturrock notes. “What is clear is that AI workloads are increasingly container-based. The industry is shifting away from monolithic virtual machines towards ephemeral workloads that can be spun up and down quickly on demand, ensuring maximum efficiency.”
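The ephemeral pattern Sturrock points to can be illustrated with a minimal sketch: workers exist only for the lifetime of a batch of tasks and release their capacity as soon as the work is done. The function names are invented for the example; a real deployment would use a container orchestrator rather than threads.

```python
# Minimal sketch of the ephemeral-workload pattern: short-lived workers are
# created on demand and torn down when the task finishes, instead of holding
# long-lived VM capacity. All names here are illustrative only.
from concurrent.futures import ThreadPoolExecutor

def inference_task(prompt: str) -> str:
    # Stand-in for a containerised AI inference job.
    return f"result for {prompt!r}"

def run_ephemeral(prompts: list[str]) -> list[str]:
    # The pool exists only for the lifetime of this batch; when the
    # 'with' block exits, all workers are destroyed and capacity is freed.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(inference_task, prompts))

print(run_ephemeral(["query-a", "query-b", "query-c"]))
```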

AI workloads also require different storage architectures. Large language models and generative AI systems process vast amounts of unstructured data, relying heavily on object storage rather than traditional file-based systems. GPU acceleration is another critical factor, as AI inference and training require high-performance compute resources that go beyond standard CPU-based workloads.

Another major shift is the decentralisation of AI processing. While large-scale training happens in centralised data centres, inference and decision-making are increasingly occurring at the edge. Retailers, for example, are using AI for real-time customer interactions, while manufacturers deploy AI-driven quality control systems on production lines. The ability to manage AI workloads across core data centres, cloud, and edge locations is becoming a competitive necessity.

Managing complexity and ensuring security

Deploying AI at scale introduces a new layer of complexity in IT operations. AI infrastructure needs to be highly optimised, balancing computational performance, networking, and storage. Many organisations assume that adding GPUs to existing infrastructure is sufficient, but AI workloads demand a far more comprehensive approach.

“AI infrastructure is fundamentally different from traditional IT infrastructure,” Sturrock explains. “It is highly GPU-intensive because GPUs are optimised for mathematical computations required for AI inferencing. The surge in demand for GPUs has caused prices to skyrocket over the past year, making cost-effective infrastructure planning essential.”

Security is another key consideration. AI models require access to large datasets, often containing sensitive information. Ensuring compliance with data protection regulations while maintaining accessibility for AI training and deployment presents a significant challenge. Organisations must build robust data governance frameworks to protect against security breaches while enabling AI-driven innovation.

Automation is playing an increasing role in simplifying AI operations. With real-time monitoring and predictive analytics, AI-driven infrastructure management can dynamically allocate resources, optimise workloads, and proactively identify potential failures before they impact business operations. Automated workload placement ensures that AI applications run where they are most efficient, balancing cost, performance, and security requirements.
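The predictive side of this can be sketched simply: rather than alerting when a resource limit is breached, extrapolate the recent utilisation trend and flag a breach before it happens. The window size, limit, and horizon below are arbitrary assumptions for illustration, not values from any monitoring product.

```python
# Illustrative sketch of predictive monitoring: flag a resource whose
# utilisation trend is on course to breach a limit before it actually does.
# The limit, window, and horizon are arbitrary assumptions.
def predict_breach(samples: list[float], limit: float = 90.0, horizon: int = 3) -> bool:
    """Linearly extrapolate recent utilisation samples (percent)."""
    if len(samples) < 2:
        return False
    window = samples[-4:]
    # Average step between consecutive samples in the recent window.
    slope = (window[-1] - window[0]) / (len(window) - 1)
    projected = window[-1] + slope * horizon
    return projected >= limit

print(predict_breach([60, 65, 70, 75]))  # True: trend reaches 90% within 3 steps
print(predict_breach([50, 51, 50, 51]))  # False: utilisation is flat
```

An automation layer would act on such a signal by rebalancing workloads or provisioning capacity before users ever see a failure, which is the behaviour described above.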

The future of AI infrastructure

The next phase of AI adoption will see further fragmentation of computing environments. As AI capabilities expand beyond the data centre, edge computing will play a greater role in real-time processing. Whether in autonomous vehicles, industrial automation, or smart cities, AI-powered applications will need infrastructure that is distributed, scalable, and secure.

“AI infrastructure will become increasingly fragmented, with compute resources spread across data centres, cloud environments, and edge locations,” Sturrock says. “A good example is the retail sector, where AI-driven edge computing is being explored for personalised advertising. Supermarkets, for instance, are experimenting with facial recognition to tailor promotions at petrol stations based on a customer’s purchase history. This type of AI deployment must happen at the edge, close to where the data is generated.”

Future advancements in AI will also drive a greater emphasis on infrastructure intelligence. AI will increasingly be embedded within IT management systems, helping organisations automate workload placement, predict resource requirements, and optimise performance. Rather than simply being a tool that requires infrastructure, AI will itself help to manage and refine IT operations.

Nutanix’s Project Beacon is an example of this future direction, aiming to provide a unified management platform that enables organisations to deploy, manage, and optimise workloads across all environments. By abstracting away infrastructure complexity, businesses can focus on AI innovation without getting bogged down in the intricacies of IT operations.

The rapid adoption of AI is forcing enterprises to rethink their infrastructure strategies. Hyperconverged systems, hybrid multi-cloud models, and AI-driven automation are no longer optional; they are essential. As organisations navigate this shift, the key will be balancing flexibility, cost, and performance to ensure AI delivers real business value.
