The future of financial markets is accelerated compute


As AI reshapes the foundations of trading, Nasdaq is deploying accelerated compute to redefine performance at the core of the financial system. Heterogeneous infrastructure is no longer an experiment but a prerequisite for trust, transparency, and real-time intelligence.

In the architecture of modern finance, latency is no longer a tolerable inefficiency. It is a liability. Every microsecond shaved from an execution path tightens spreads, dampens volatility, and reinforces trust in the fairness of the market. The challenge for infrastructure leaders like Nasdaq is that conventional compute models have plateaued. The next wave of performance gains will not come from faster CPUs alone, but from a new paradigm of heterogeneous compute. As AI transforms capital markets, the need for integrated, accelerated infrastructure is no longer theoretical. It is operational.

From electronic trading to AI-native infrastructure

Nasdaq has a history of firsts. In 1971, it became the world’s first electronic stock exchange. Today, it operates 18 markets and powers over 130 more with its technology stack. But this legacy is not a resting point; it is an evolving benchmark. As Nikolai Larbalestier, Senior Vice President of Enterprise Architecture and Performance Engineering, explains, Nasdaq’s ambition is to be “the trusted fabric of the financial system”, and that requires a relentless push toward real-time integrity at global scale.

This ambition now converges with the capabilities of accelerated compute. “What we are seeing with AI and accelerated infrastructure is a foundational shift,” Larbalestier says. “It parallels what Nasdaq did at its founding, moving markets from manual to electronic, but on a new axis: from electronic to intelligent.” According to Larbalestier, this shift is essential not just for performance but for trust. “Trust comes from determinism. And determinism at today’s scale cannot be achieved without embracing a heterogeneous compute model that includes GPU, FPGA, CPU, and cloud elasticity.”

Latency is law in the capital markets

Nasdaq’s trading systems already handle more than 140 billion messages in a single day across its US markets, with consistent throughput above four million messages per second. The infrastructure must be resilient, but it must also be aggressively efficient. Determinism, the consistent delivery of system responses within tightly bounded timing thresholds, is not merely a performance metric; it is a regulatory and reputational necessity.

“Today, we are operating around 20 microseconds from order to acknowledgement,” Larbalestier notes. “And that is round-trip, including matching and response. As volumes rise, those expectations do not flex; they harden.” To maintain this precision at increasing scales, Nasdaq’s infrastructure evolution has centred on three priorities: reducing latency, increasing determinism, and enabling dynamic response to volatility. Accelerated compute sits at the intersection of all three.
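
What “determinism” means in practice is a constraint on the tail of the latency distribution, not on its average. The host-side sketch below is a minimal illustration with synthetic samples (the distribution, its parameters, and all names are assumptions for the example, not Nasdaq measurements): a deterministic system is one where the gap between the median and the 99.9th percentile stays inside the budget.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <random>
#include <vector>

int main() {
    // Synthesise one million round-trip latencies (in microseconds),
    // clustered around the ~20 us order-to-acknowledgement figure quoted
    // in the article. Real samples would come from wire-level timestamps.
    std::mt19937_64 rng(42);
    std::lognormal_distribution<double> latency(std::log(20.0), 0.05);
    std::vector<double> samples(1'000'000);
    for (double& s : samples) s = latency(rng);

    std::sort(samples.begin(), samples.end());
    auto pct = [&](double p) {
        return samples[static_cast<size_t>(p * (samples.size() - 1))];
    };

    // Determinism is a property of the tail, not the median: the spread
    // between p50 and p99.9 is what a "tightly bounded" budget constrains.
    std::printf("p50   = %6.2f us\n", pct(0.50));
    std::printf("p99   = %6.2f us\n", pct(0.99));
    std::printf("p99.9 = %6.2f us\n", pct(0.999));
    return 0;
}
```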

Grace and Grace Hopper redefine what is possible

In its recent evaluation of NVIDIA’s Grace CPU and Grace Hopper Superchip, Nasdaq tested whether these platforms could deliver the latency and throughput required for real-time matching in one of the most demanding workloads in enterprise IT. The results were not just promising; they were definitive.

Grace, with its 72-core Arm-based architecture and monolithic design, demonstrated performance equal to or exceeding contemporary x86 CPUs in transaction throughput. “What was particularly compelling,” Larbalestier says, “was how Grace not only matched x86 for throughput but did so with consistency. That determinism is essential for fairness in our markets.”

More critical still was the performance of Grace Hopper. This CPU-GPU hybrid architecture, interconnected by NVIDIA’s 900GB/s NVLink-C2C, enabled a level of parallelisation and memory coordination previously unavailable in trading environments. By running persistent GPU kernels (code that remains resident on the GPU and continuously polls for new data), Nasdaq was able to accelerate core workloads such as Black-Scholes pricing simulations, reducing kernel execution time by up to 30 per cent.
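
A persistent kernel inverts the usual launch-per-request model: the kernel is launched once and then spins, waiting for work to arrive. The CUDA sketch below is a minimal illustration of the pattern, assuming a platform with coherent concurrent CPU-GPU access to managed memory (such as Grace Hopper); the queue layout, flag protocol, and batch size are invented for the example. Each time the host flips the flag, the resident kernel prices a batch of European calls with the Black-Scholes formula, with no kernel launch on the critical path.

```cpp
#include <cmath>
#include <cstdio>
#include <cuda_runtime.h>

constexpr int kBatch = 4096;

struct WorkQueue {
    // 0 = idle, 1 = batch ready, 2 = results ready, -1 = shut down.
    volatile int state;
    float rate, expiry;
    float spot[kBatch], strike[kBatch], vol[kBatch], price[kBatch];
};

__device__ float normCdf(float x) { return 0.5f * erfcf(-x * 0.7071068f); }

// Launched once; stays resident and polls for batches instead of paying
// a kernel launch per pricing request.
__global__ void persistentPricer(WorkQueue* q) {
    for (;;) {
        if (threadIdx.x == 0)
            while (q->state != 1 && q->state != -1) {}  // spin for work
        __syncthreads();
        if (q->state == -1) return;

        for (int i = threadIdx.x; i < kBatch; i += blockDim.x) {
            float s = q->spot[i], k = q->strike[i], v = q->vol[i];
            float t = q->expiry, r = q->rate;
            float d1 = (logf(s / k) + (r + 0.5f * v * v) * t)
                       / (v * sqrtf(t));
            float d2 = d1 - v * sqrtf(t);
            // Black-Scholes price of a European call.
            q->price[i] = s * normCdf(d1) - k * expf(-r * t) * normCdf(d2);
        }
        __syncthreads();
        if (threadIdx.x == 0) {
            __threadfence_system();  // publish prices before flipping state
            q->state = 2;
        }
    }
}

int main() {
    WorkQueue* q;
    cudaMallocManaged(&q, sizeof(WorkQueue));
    q->state = 0; q->rate = 0.03f; q->expiry = 0.5f;
    for (int i = 0; i < kBatch; ++i) {
        q->spot[i] = 100.0f; q->strike[i] = 90.0f + (i % 21); q->vol[i] = 0.2f;
    }
    // Assumes concurrent CPU/GPU access to managed memory while the
    // kernel runs (true on coherent platforms such as Grace Hopper).
    persistentPricer<<<1, 256>>>(q);

    q->state = 1;             // hand the batch to the resident kernel
    while (q->state != 2) {}  // wait for results: no launch, no stream sync
    std::printf("price[0] = %.4f\n", q->price[0]);

    q->state = -1;            // ask the kernel to exit
    cudaDeviceSynchronize();
    cudaFree(q);
}
```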

“The Grace Hopper changes the nature of compute coordination,” Larbalestier explains. “By sharing pageable memory between CPU and GPU and supporting atomic operations, we can eliminate synchronisation overhead. That means fewer delays, tighter feedback loops, and more intelligent matching behaviour.”
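
In code, sharing memory and supporting atomics means the CPU and GPU can coordinate through ordinary acquire/release operations rather than stream synchronisation. The sketch below uses libcu++’s system-scope atomics and again assumes coherent concurrent access to managed memory; every name in it is illustrative.

```cpp
#include <cuda/atomic>
#include <cstdio>

// A flag both host and device can load/store with acquire/release semantics.
using Flag = cuda::atomic<int, cuda::thread_scope_system>;

__global__ void consumer(Flag* flag, const int* payload, int* result) {
    // The acquire-load pairs with the CPU's release-store, so the payload
    // written by the host is guaranteed visible once the flag reads 1.
    while (flag->load(cuda::memory_order_acquire) != 1) {}
    *result = *payload * 2;
    flag->store(2, cuda::memory_order_release);  // publish the result
}

int main() {
    Flag* flag;
    int* data;  // data[0] = payload, data[1] = result
    cudaMallocManaged(&flag, sizeof(Flag));
    cudaMallocManaged(&data, 2 * sizeof(int));
    new (flag) Flag(0);

    consumer<<<1, 1>>>(flag, &data[0], &data[1]);  // starts spinning

    data[0] = 21;                                  // write payload first...
    flag->store(1, cuda::memory_order_release);    // ...then publish it
    while (flag->load(cuda::memory_order_acquire) != 2) {}
    std::printf("GPU consumed %d, produced %d\n", data[0], data[1]);

    cudaDeviceSynchronize();
    cudaFree(data);
    cudaFree(flag);
}
```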

AI-enhanced matching and real-time digital twins

Nasdaq’s interest in accelerated compute is not limited to throughput tests and simulation benchmarks. It is directly tied to how capital markets will operate in an AI-native world. With platforms like NVIDIA NIM (NVIDIA Inference Microservices), Nasdaq is testing large language models deployed within GPU environments to deliver inference at the edge of the trading system. This could power AI-enhanced order types, risk assessments, or market surveillance alerts, all within the latency budget of the trading core.
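
To make that concrete: NIM containers expose an OpenAI-compatible HTTP API, so a component such as a surveillance service could query a co-located model as sketched below. This is a hedged illustration only; the endpoint, model name, and prompt are placeholders, not details of Nasdaq’s deployment.

```cpp
#include <curl/curl.h>
#include <cstdio>
#include <string>

// libcurl callback that accumulates the HTTP response body.
static size_t sink(char* ptr, size_t size, size_t nmemb, void* out) {
    static_cast<std::string*>(out)->append(ptr, size * nmemb);
    return size * nmemb;
}

int main() {
    curl_global_init(CURL_GLOBAL_DEFAULT);
    CURL* h = curl_easy_init();
    if (!h) return 1;

    // Placeholder model name and prompt; NIM LLM containers accept
    // OpenAI-style chat-completion requests at /v1/chat/completions.
    const char* body = R"({"model":"example-llm",
        "messages":[{"role":"user",
        "content":"Flag anything anomalous in this order flow summary: ..."}]})";

    std::string response;
    curl_slist* hdrs = curl_slist_append(nullptr, "Content-Type: application/json");
    curl_easy_setopt(h, CURLOPT_URL, "http://localhost:8000/v1/chat/completions");
    curl_easy_setopt(h, CURLOPT_HTTPHEADER, hdrs);
    curl_easy_setopt(h, CURLOPT_POSTFIELDS, body);
    curl_easy_setopt(h, CURLOPT_WRITEFUNCTION, sink);
    curl_easy_setopt(h, CURLOPT_WRITEDATA, &response);

    if (curl_easy_perform(h) == CURLE_OK)
        std::printf("%s\n", response.c_str());

    curl_slist_free_all(hdrs);
    curl_easy_cleanup(h);
    curl_global_cleanup();
}
```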

One compelling application is the construction of digital twins of the market: real-time simulations that mirror live market behaviour to test the impact of new order types, volatility events, or regulatory changes before they go live. These models operate under the same constraints as the market they mirror, including ultra-low latency and deterministic compute, making platforms like Grace Hopper not merely well suited, but necessary.

“The idea is to have a continuously learning, continuously synchronised replica of the market that informs decisions and tests policy,” Larbalestier says. “This is not a back-office simulation. It is a live, real-time co-pilot for exchange operations.”

A new age of compute, a new class of trust

In this new operational paradigm, compute is no longer a support function. It is an embedded feature of market logic. Whether optimising complex order calculations in options markets or generating theoretical pricing data feeds for illiquid instruments, GPU-accelerated systems enable the kinds of dynamic analysis and response that human traders and static rule-based engines cannot match.

One particularly high-potential area is the atomic matching of multi-leg orders in options markets. These packages involve sets of interdependent instruments that must be executed simultaneously and atomically to avoid leg risk. Traditional CPU systems struggle to evaluate these combinations at scale. GPU-powered systems, with their parallelism and low-latency memory access, open new frontiers for real-time validation.
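
The fit with GPU hardware is direct: one thread can check each leg against the top of its book, and the package commits only if every check passes. The toy sketch below illustrates that all-or-nothing validation; the order-book representation is invented for the example and bears no relation to a real matching engine’s data structures.

```cpp
#include <cstdio>
#include <cuda_runtime.h>

// Toy representation of one leg of a package order: the limit price it
// must trade at or better, and the current top of the relevant book.
struct Leg {
    float limit;
    float bookTop;   // best offer for a buy leg, best bid for a sell leg
    bool  isBuy;
};

__global__ void matchPackage(const Leg* legs, int n, int* filled) {
    __shared__ int allOk;
    if (threadIdx.x == 0) allOk = 1;
    __syncthreads();

    // One thread per leg: can this leg execute at its limit right now?
    if (threadIdx.x < n) {
        const Leg l = legs[threadIdx.x];
        bool exec = l.isBuy ? (l.bookTop <= l.limit)   // offer at/below limit
                            : (l.bookTop >= l.limit);  // bid at/above limit
        if (!exec) atomicAnd(&allOk, 0);
    }
    __syncthreads();

    // Atomic in the market sense: every leg fills, or none does.
    if (threadIdx.x == 0) *filled = allOk;
}

int main() {
    const int n = 4;
    Leg legs[n] = {{101.0f, 100.5f, true},   // buy leg, executable
                   { 99.0f,  99.2f, false},  // sell leg, executable
                   { 50.0f,  49.9f, true},
                   { 75.0f,  75.0f, false}};
    Leg* d;
    int* filled;
    cudaMalloc(&d, sizeof(legs));
    cudaMallocManaged(&filled, sizeof(int));
    cudaMemcpy(d, legs, sizeof(legs), cudaMemcpyHostToDevice);

    matchPackage<<<1, 32>>>(d, n, filled);
    cudaDeviceSynchronize();
    std::printf("package %s\n", *filled ? "fills atomically" : "rejected");

    cudaFree(d);
    cudaFree(filled);
}
```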

According to Larbalestier, this is part of a broader strategic shift. “We are no longer thinking in terms of static systems with occasional upgrades. We are thinking about compute infrastructure as a living part of the market fabric. That requires us to evaluate every layer, from silicon to protocol, and optimise for a world where AI is not just analysing markets, but participating in them.”

What this means for enterprise AI leaders

The lessons from Nasdaq’s journey are not limited to financial services. They apply to any domain where AI is being embedded into operational cores with real-time constraints. It is a case study of how to adopt accelerated infrastructure without compromising on determinism, trust, or system transparency.

The first lesson is that latency is an architectural issue. It is not solely a function of software tuning. Nasdaq’s embrace of FPGA offloads, kernel bypass networking, and persistent GPU kernels demonstrates the level of hardware-software co-design needed to deliver real-time AI performance.

The second is that AI infrastructure must be heterogeneous. No single processor type can deliver optimal performance across all dimensions. By adopting ARM-based CPUs, Grace Hopper’s shared-memory compute, and NVIDIA’s inference microservices, Nasdaq has shown how modular acceleration unlocks optionality.

And the third lesson is that trust is built through infrastructure. In an AI-enabled market, where decisions are no longer just assisted by algorithms but often made by them, the reliability and transparency of the underlying compute matters as much as the model itself.

As Larbalestier concludes, “AI is not an add-on. It is part of the market. And to make that work, we need infrastructure that is not just fast, but intelligent by design.” Nasdaq’s work with NVIDIA is not about chasing performance records. It is about rewriting the rules of participation in the market. In doing so, it offers a playbook for every enterprise looking to integrate AI not as a capability, but as a function of trust.
