How generative AI is helping filmmakers finally cross the uncanny valley


The creation of lifelike digital humans has long been one of cinema’s most elusive and expensive challenges. Now, advances in real-time generative AI are reshaping production workflows, closing the gap between performance and realism, and offering filmmakers new ways to bypass the uncanny valley altogether.

The art of filmmaking has always revolved around illusion. Whether through camera angles, lighting, makeup, or digital effects, the ability to suspend disbelief has defined the medium since its inception. But perhaps the most enduring illusion, and the most difficult to master, has been the digital human. For decades, visual effects teams have pursued a single goal: to replicate the human face with such fidelity that audiences would forget it was ever artificial. Yet even with millions of dollars and armies of artists, the result often fell into what the industry came to call the uncanny valley.

Now, that paradigm is shifting.

Generative AI is transforming not only how digital characters are rendered but how entire productions are conceived, shot, and completed, dissolving the boundary between the real and the rendered in the process. At the heart of this shift is Metaphysic, the company behind the viral ‘Deepfake Tom Cruise’ videos and, more recently, a full-length feature film directed by Robert Zemeckis, in which four actors were aged and de-aged across decades, live on set, in real time.

This is not another incremental advance. It fundamentally changes how humans are digitally captured, transformed, and experienced on screen. The implications for production, cost, creativity, and control are profound.

The long road to realism

Ed Ulbrich, now President of Production and Chief Content Officer at Metaphysic, has lived through every major transition in digital filmmaking. His career spans 30 years in visual effects, including the launch of Digital Domain alongside James Cameron and Scott Ross in the early 1990s. “When we started Digital Domain, people picketed us,” Ulbrich says. “We were seen as the enemy. The model makers, the matte painters, the practical effects artists, they thought we were trying to replace them with machines. But most of those people retrained and joined the digital revolution.”

What began as a rebellion eventually became the industry standard. Yet, despite the maturing of CGI, one task remained stubbornly resistant to automation: the believable recreation of human faces. Projects like The Curious Case of Benjamin Button marked early attempts at photoreal digital humans. That film, released in 2008, was among the first to use machine learning and computer vision techniques to reconstruct a younger version of Brad Pitt from performance data. But the process was anything but agile.

“On Button, we had over 200 people working for more than two years to produce 51 minutes of screen time,” Ulbrich adds. “And that was just one character. Four different versions of him, each with its own set of assets. It cost tens of millions and was excruciatingly slow.”

The visual fidelity was impressive for its time, but the workflow was rigid. Artists built 3D models from scans, simulated skin, muscle, and blood flow, retargeted performances from motion capture, rendered subsurface lighting effects, and then composited thousands of layers back together. Every stage risked fidelity loss, with likeness and emotional nuance diminishing as assets moved through the pipeline.
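To see why all those hand-offs mattered, consider a toy model: if each stage of the pipeline preserves even 97 per cent of the original likeness, the losses compound with every pass. The retention figure below is purely illustrative, not a measurement from any production.

```python
# Toy model of compounding fidelity loss across a traditional VFX pipeline.
# The 0.97 retention per stage is a hypothetical figure for illustration.
stages = [
    "scan_to_3d_model",
    "skin_muscle_simulation",
    "mocap_retargeting",
    "subsurface_rendering",
    "compositing",
]

likeness = 1.0
for stage in stages:
    likeness *= 0.97  # each hand-off quietly erodes the likeness
    print(f"after {stage}: {likeness:.1%} of the original likeness remains")
```

Run the numbers and roughly 14 per cent of the likeness is gone by the final composite, which is exactly the kind of cumulative drift the pipeline suffered from.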

The cumulative result was often close but not close enough. Human perception is unforgiving; it is trained from birth to detect even the slightest irregularity in facial movement or expression. Enter the uncanny valley. “Every human project I ever worked on went there,” Ulbrich adds. “Into the valley. Some made it out. Most didn’t.”

A TikTok that changed everything

During the pandemic, with production shut down and the industry in stasis, Ulbrich and a group of seasoned VFX veterans began holding informal Zoom sessions, often fuelled by tequila and nostalgia. On one of these evenings, someone shared a TikTok video of what appeared to be Tom Cruise. It was not just a good impression; it was an impossibly accurate one.

The footage was not produced by a major studio. It was created by a young engineer in Belgium and an actor in Los Angeles. There was no professional pipeline, no Hollywood budget, just AI. “It looked better than anything we were doing in studios,” Ulbrich admits. “It wasn’t just accurate. It was expressive. Lifelike. People were arguing whether ILM had done it in secret.”

That video was the first encounter many in the industry had with Metaphysic, the newly formed company behind the effect. And while the deepfake label stuck, what they built was far more advanced. Unlike traditional CGI, which uses 3D models, textures, and rigs, Metaphysic’s process is entirely neural. Trained on ethically sourced datasets with full consent from the actors involved, their models use generative networks to transform and synthesise faces in a single, real-time pass. “It’s not procedural,” Ulbrich says. “It’s data science. And it’s not just a step forward. It’s a step around.”
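What “a single, real-time pass” means in practice is that the entire transformation happens inside one forward pass of a trained network, with no 3D model, rig, or render stage in between. A minimal sketch of that idea follows, using a stand-in PyTorch generator; Metaphysic’s actual architecture and training setup are not public.

```python
# A minimal sketch of single-pass neural face synthesis. The tiny
# encoder-decoder below is a stand-in for a trained generative network;
# Metaphysic's real model is far larger and is not publicly documented.
import torch
import torch.nn as nn

class FaceGenerator(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Conv2d(3, 64, kernel_size=3, padding=1)
        self.decoder = nn.Conv2d(64, 3, kernel_size=3, padding=1)

    def forward(self, frame: torch.Tensor) -> torch.Tensor:
        # One forward pass: encode the performer's face, decode it as the
        # target likeness. No textures, rigs, or compositing layers.
        return torch.sigmoid(self.decoder(torch.relu(self.encoder(frame))))

generator = FaceGenerator().eval()
with torch.no_grad():
    frame = torch.rand(1, 3, 256, 256)   # one camera frame, normalised RGB
    transformed = generator(frame)       # the entire transformation, one pass
```

The contrast with the procedural pipeline is the point: everything that used to be an explicit, hand-tuned stage is folded into learned weights.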

Real-time filmmaking on set

The real test came when director Robert Zemeckis and visual effects supervisor Kevin Baillie were looking for a way to shoot a film that required four lead actors to play themselves across decades, from teenagers to their eighties. The scope far exceeded what had been done on Benjamin Button, and the budget could not accommodate years of post-production.

Zemeckis and Baillie decided to test three AI companies against four top-tier VFX studios. One of those AI companies was Metaphysic. The outcome was decisive. “When I saw that test, it changed everything,” Ulbrich says. “This wasn’t just a new tool. It was a new language for filmmaking.”

Jo Plaete, Metaphysic’s Chief Innovation Officer and VFX Supervisor, led the technology deployment at Pinewood Studios. Working from a GPU-powered mobile lab just off-set, the team installed a live pipeline that processed footage from the cameras and returned fully transformed shots of the actors in real time. “This gave the director a completely new feedback loop,” Plaete explains. “Zemeckis could look at the monitor and see 20-year-old Tom Hanks, not 67-year-old Tom Hanks. And not after six months of post. Live, on the day.”
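Structurally, that feedback loop is simple to describe, even if the engineering behind it is not: capture a frame, transform it, and put it on the monitor before the next frame arrives. The sketch below shows the shape of such a loop; the function names are hypothetical stand-ins, not Metaphysic’s API, and the real system runs on dedicated GPU hardware.

```python
# Shape of a live on-set feedback loop, assuming a 24 fps shoot.
# capture_frame, neural_transform, and show_on_monitor are hypothetical
# stand-ins for the camera feed, the generative model, and the
# director's monitor respectively.
import time

FRAME_BUDGET = 1 / 24  # seconds available per frame at 24 fps (~41.7 ms)

def capture_frame():
    return "raw_frame"            # stand-in for a live camera feed

def neural_transform(frame):
    return f"de_aged({frame})"    # stand-in for the generative model

def show_on_monitor(frame):
    pass                          # stand-in for the video-village display

for _ in range(240):  # ten seconds of footage
    start = time.perf_counter()
    show_on_monitor(neural_transform(capture_frame()))
    if time.perf_counter() - start > FRAME_BUDGET:
        raise RuntimeError("transform too slow for live playback")
```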

This capability transformed every aspect of production. The director of photography could light scenes for the younger versions of the characters. Makeup and hair could be adjusted based on how they affected the neural rendering. Even the actors began using the technology to rehearse with what Metaphysic called the ‘Youth Mirror’, a feedback tool that showed them how their current expressions mapped onto their younger selves. “Tom Hanks could lift his brow and see how that changed the emotional read of young Tom,” Plaete adds. “They took that knowledge back to their trailers, honed their performances, and brought it back to set.”

Control without compromise

What separates Metaphysic’s approach from most AI-driven tools is its balance between automation and creative control. Neural networks, by design, are black boxes. What goes in and what comes out are visible, but the process in between is often opaque. This poses a problem for filmmakers who are used to giving precise notes on performance, lighting, or camera angles.

To address this, Metaphysic developed tools that allow artists to access and manipulate the latent space of the neural models. This means they can control specific features, such as eyelines, expressions, and lighting cues, without degrading the overall output. “We call it neural animation,” Plaete says. “It allows us to make surgical adjustments inside the model without compromising performance fidelity.”
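The “latent space” here is the compressed numerical representation the network builds of a face; nudging that representation along a learned direction changes one attribute while leaving the rest of the performance intact. The sketch below illustrates the principle with random vectors; the ‘eyeline’ direction is a hypothetical stand-in, since Metaphysic has not published how its attribute controls are derived.

```python
# Illustrative latent-space edit ("neural animation" in the article).
# The 512-dimensional latent and the eyeline direction are random
# stand-ins; in a real system both would come from the trained model.
import numpy as np

rng = np.random.default_rng(0)

latent = rng.standard_normal(512)    # encoded face for one frame
eyeline = rng.standard_normal(512)   # learned attribute direction (stand-in)
eyeline /= np.linalg.norm(eyeline)   # normalise to a unit-length direction

# A small step along one direction adjusts a single attribute
# without disturbing the rest of the performance encoding.
adjusted = latent + 0.4 * eyeline
```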

The process also allows for sophisticated compositing, such as blending facial data from older actors into performance capture from younger ones. In one instance, the team used training data from two elderly lookalike actors to supplement Robin Wright’s transformation into her future self. Prosthetics alone could not achieve the desired look; neural modelling could.

Every frame processed by the system passed through several stages (recognition, transformation, and compositing) before being returned to the monitor. The entire round trip added less than a single frame of latency. At this speed, neural rendering was no longer a post-production tool. It had become part of principal photography.
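It is worth pausing on what “less than a single frame of latency” demands. At a standard 24 frames per second, the whole round trip has to fit inside roughly 41.7 milliseconds. The stage timings below are illustrative guesses, not Metaphysic’s measured figures, but they show how tight the budget is.

```python
# Back-of-envelope latency budget for the three stages named above.
# Per-stage timings are illustrative, not measured production figures.
FRAME_MS = 1000 / 24  # ~41.7 ms per frame at 24 fps

stage_ms = {"recognition": 8.0, "transformation": 24.0, "compositing": 6.0}
total = sum(stage_ms.values())

print(f"pipeline total: {total:.1f} ms of a {FRAME_MS:.1f} ms frame budget")
print("fits in one frame" if total < FRAME_MS else "misses the frame budget")
```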

A new production economy

Beyond the creative possibilities, the economic implications are hard to ignore. Producing 51 minutes of photoreal digital humans using traditional CGI might take two years and 200 artists. Metaphysic’s neural pipeline can now deliver the same within the constraints of a regular shoot. “The film wouldn’t have been made without this technology,” Ulbrich explains. “It wasn’t just a creative decision. It was a financial one.”

For studios, this means an opportunity to tell stories that would otherwise be cost-prohibitive. For independent filmmakers, it means access to tools once reserved for blockbuster budgets. It offers actors a controlled, consent-based way to extend their creative range and legacy.

Unlike synthetic avatars or content made without permission, Metaphysic’s work is built around ethical AI practices. Every model is trained using licensed data with full actor consent. If that consent is revoked, the model is retired. “No licence, no consent, no AI,” Ulbrich says. “That is non-negotiable.”

The new cinematic language

A recurring theme in Ulbrich’s account is history repeating itself. When Digital Domain emerged in the early 1990s, the backlash was fierce. The fear then, as now, was job displacement and the erosion of craftsmanship. What followed was the creation of an entirely new industry, one that eventually employed hundreds of thousands of artists worldwide.

The arrival of AI in filmmaking feels similar, but faster. “This is David Fincher’s prediction come true,” Ulbrich says. “He told us back then, ‘You’ll be doing this on a laptop one day.’ At the time, we laughed. Now, it’s real.” What used to take months now happens in milliseconds. What once needed 1,000 artists now takes a fraction of that number. Yet far from replacing human creativity, this new paradigm amplifies it by eliminating friction, reducing guesswork, and restoring spontaneity to an art form long paralysed by process.

The uncanny valley, long considered a digital dead end, may no longer be the obstacle it once was. With generative AI, filmmakers are starting not in the shadows of the valley, but on the far side of it, where likeness, emotion, and storytelling converge in real time. And in that convergence lies not just a technical breakthrough but a cinematic one.
