What autonomous vehicles reveal about the hidden infrastructure powering the next era of AI

Mark Venables

Autonomous vehicles are evolving from isolated software challenges into fully integrated systems powered by physical AI. As end-to-end neural networks, simulation environments and edge compute converge, autonomy becomes a proving ground for the broader infrastructure demands of intelligent machines in the physical world.

Autonomous vehicles are not new. Their promise has circulated for decades, bolstered by occasional bursts of optimism and hindered by layers of complexity. What is new, however, is their context. The emergence of physical AI, as defined by NVIDIA, reframes autonomy not as a standalone capability but as the natural extension of a maturing AI stack, one that connects data centres, simulation environments and edge compute in a closed-loop pipeline. That pipeline is no longer theoretical.

This is not about dashboards with chatbots. This is about machines that perceive, reason and act in the physical world, often with no human in the loop. From tightly synchronised logistics robots in automated warehouses to general-purpose humanoids and fully autonomous vehicles operating in urban environments, the concept of physical AI shifts autonomy from product features to infrastructure challenges. And, as panellists at NVIDIA GTC 2025 made clear, that challenge is neither small nor static.

From moonshots to mature systems

The evolution from AV 1.0 to AV 2.0 is more than a shift in software architecture. AV 1.0 systems relied on multiple discrete deep neural networks, each designed to perform a narrow task such as identifying pedestrians, reading traffic signals, detecting lanes or tracking obstacles, with no shared context between them. AV 2.0 replaces this fragmented approach with unified, end-to-end neural networks that interpret raw sensor inputs, make driving decisions, and control the vehicle in a continuous learning loop. It represents a move from modular perception to full-stack autonomy, and with it, the passing of a certain kind of optimism: the belief that stacking together task-specific models would somehow lead to general intelligence on the road.

In the AV 1.0 era, says Aaraadhya Narra, Senior Product Manager for Autonomous Vehicles at NVIDIA, a single vehicle might have required up to 50 discrete neural networks just to navigate. “You had one network detecting stop signs, another for lane keeping, another for pedestrian tracking,” she explains. “Each had to be trained, tested and validated independently. There was no shared understanding of context or environment.”

By contrast, AV 2.0 uses end-to-end neural architectures that map directly from sensor input to control action. This holistic approach allows the vehicle to see and understand, predict and act with purpose. “Rather than connecting disconnected functions, we are creating unified systems that can interpret the world and coherently make decisions,” Narra adds.
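The structural difference can be shown in a toy sketch. It is purely illustrative and is not NVIDIA's implementation: the names `Control`, `modular_drive` and `end_to_end_drive` are hypothetical, and simple callables stand in for trained networks.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Control:
    steering: float  # hypothetical unit: radians, positive = left
    throttle: float  # hypothetical range: 0.0 (stopped) to 1.0 (full)


Frame = List[float]  # stand-in for raw sensor input


def modular_drive(frame: Frame,
                  detectors: Dict[str, Callable[[Frame], bool]]) -> Control:
    """AV 1.0 style: independent task-specific models, joined by hand-written rules."""
    # Each detector sees the frame in isolation and shares no context with the others.
    findings = {name: detect(frame) for name, detect in detectors.items()}
    if findings.get("stop_sign") or findings.get("pedestrian"):
        return Control(steering=0.0, throttle=0.0)
    return Control(steering=0.0, throttle=0.5)


def end_to_end_drive(frame: Frame,
                     policy: Callable[[Frame], Control]) -> Control:
    """AV 2.0 style: a single learned function maps sensor input to a control action."""
    return policy(frame)
```

In the first function, the decision logic lives in rules written around the detectors. In the second, it lives inside the learned policy itself, which is why training and validation shift from many small models to one large one.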

This architectural change also raises the demands on infrastructure. A level 2 vehicle with basic assisted driving features might require development clusters of 8,000 GPUs. A level 4 or 5 vehicle running a full AV 2.0 stack could require up to 80,000 GPUs during development, ten times as many. The scale is formidable, and the economics of autonomy depend heavily on managing it.

Simulation as a substitute for the world

Building real-world intelligence using real-world data is inefficient, expensive, and, in many cases, impossible. Rare or dangerous scenarios, such as an animal darting into traffic or a piece of debris falling onto the road, cannot be orchestrated safely in physical environments. In a global market, every new city or geography demands new training data, new validation routines, and new regulatory alignment.

This is where NVIDIA’s investment in simulation becomes central. The idea is not simply to test how a vehicle performs in a virtual environment. It is to replicate, with high fidelity, the exact sensor outputs a vehicle would experience in any conceivable condition, from snow-blind highways to congested inner-city streets. Camera artefacts, LiDAR distortions, and radar shadows can all be modelled and manipulated.

“We no longer simulate just the software,” Katie Washabaugh, Product Marketing Manager at NVIDIA, adds. “We simulate the entire physical environment. The textures, the light refraction, and the interactions between objects. If we want to test a vehicle in fog at dusk with light rain and a cyclist approaching from a blind spot, we can do that without ever leaving the lab.”

This level of simulation is powered by World Foundation Models, large-scale AI systems that encode an understanding of motion, force, trajectory and spatial causality. These models allow developers to begin with a base scenario and generate thousands of variations, exposing the AV stack to situations it might otherwise never encounter during real-world training. In that sense, simulation is no longer a validation tool. It is a core part of the training loop, feeding edge cases into the development cycle and accelerating the path to safety.
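The idea of expanding one base scenario into many variations can be sketched in a few lines. This is a plain enumeration of hand-picked conditions, not a World Foundation Model, which would generate physically plausible sensor data rather than labels. The `Scenario` fields and the `generate_variations` helper are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, replace
from itertools import product
from typing import List, Sequence


@dataclass(frozen=True)
class Scenario:
    weather: str      # e.g. "fog", "rain"
    time_of_day: str  # e.g. "dusk", "night"
    hazard: str       # e.g. "cyclist", "debris"


def generate_variations(base: Scenario,
                        weathers: Sequence[str],
                        times: Sequence[str],
                        hazards: Sequence[str]) -> List[Scenario]:
    """Return one scenario per combination of the supplied conditions."""
    return [
        replace(base, weather=w, time_of_day=t, hazard=h)
        for w, t, h in product(weathers, times, hazards)
    ]


base = Scenario(weather="clear", time_of_day="noon", hazard="none")
variations = generate_variations(
    base,
    weathers=["fog", "rain", "snow"],
    times=["dusk", "night"],
    hazards=["cyclist", "debris"],
)
```

Three weather conditions, two times of day and two hazards yield twelve scenarios. Each additional parameter multiplies the count, which is how a single base case grows into thousands of variations.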

Edge intelligence and the road to real time

Training models in a data centre and simulating them in virtual worlds is only part of the story. Ultimately, the real challenge lies in deploying those models at the edge, in the vehicle itself, where every millisecond matters and every decision carries risk. At level 5 autonomy, the vehicle must operate with full independence. There is no fallback to the cloud. This demands inference performance at the edge and a level of redundancy and fail-safety that mirrors aerospace or defence systems.

“The car must reason, adapt and respond entirely on its own,” Narra explains. “Large language models, generative AI agents, even perception and control stacks, everything must run locally. There is no room for latency and no tolerance for error.” NVIDIA’s DRIVE AGX platform is designed to handle this load. It runs the full end-to-end AV stack in real time and supports additional applications beyond autonomy. Generative AI models provide contextual understanding, allowing the vehicle to better interpret ambiguous scenes or navigate uncertain environments. Integrated with LLMs, conversational agents connect drivers to the broader ecosystem, from dealership to diagnostics to entertainment.

These systems are not isolated features. They are part of a unified stack, built to scale with future demands. “We have developed an architecture that spans from the cloud to the car and back again,” Washabaugh notes. “This is not just about autonomy. It is about rethinking how vehicles operate, learn and evolve.”

Data gravity and continuous learning

The technical elegance of AV 2.0 belies a more mundane truth: data gravity is real. As models become more complex and simulation pipelines expand, the volume of data increases exponentially. For every hour of driving, terabytes of sensor data are generated, labelled, simulated, and fed back into the model training loop.

Data management is quickly becoming the limiting factor for companies developing autonomous vehicles. Without a robust data infrastructure spanning collection, curation, labelling, storage and retrieval, the promise of autonomy will remain trapped in the prototype. Here, too, NVIDIA is attempting to reduce friction. Its Cosmos World Foundation Models act as force multipliers, generating synthetic datasets that are both diverse and physically accurate. Its Omniverse platform bridges simulation and development, offering standardised APIs and tooling that integrate with existing engineering workflows. Its AI Enterprise software stack supports model versioning, deployment, and monitoring across the full lifecycle.

However, even the best tools cannot compensate for poor strategy. For executives considering AV deployments, whether in consumer vehicles, logistics fleets or industrial settings, the questions are no longer limited to performance. They are about scale, trust, and ROI. Is the infrastructure future-proof? Are the models auditable and transparent? Can the simulation data stand up to regulatory scrutiny? And can the edge compute platform support five years of feature expansion without hardware obsolescence?

Beyond the vehicle, toward physical AI

Perhaps the most important shift is the conceptual one. Autonomy is no longer the goal; it is a capability embedded in a broader paradigm. In the era of physical AI, the same infrastructure used to develop robotaxis can be applied to drones, factory robots, or agricultural machines. The tools are common, and the principles are transferable.

This is not a vision of universal robots but of a shared architecture. “Whether it is a robot in a warehouse, a humanoid in a hospital or a vehicle on a city street, the core elements remain the same,” Washabaugh concludes. “You need perception, reasoning and action. You need simulation. You need scalable infrastructure. That is what we are building.”

The convergence of cloud, simulation, and edge compute will not eliminate complexity, but it may finally contain it. For senior decision-makers grappling with automation strategies, the opportunity lies not just in deploying AVs but in understanding how physical AI redefines the relationship between data, infrastructure, and action. Autonomy, in that sense, is no longer an endpoint. It is a test case for something much larger.