Do not skip the hard part of AI deployment


Data quality, real-time context, and production infrastructure will define the future of enterprise AI success. To move from prototype to production at scale, enterprises must confront the realities of poor data, brittle systems, and misguided priorities; otherwise, they risk building intelligent systems that fail at the moment of use.

Most enterprises are approaching AI transformation from the wrong angle. The excitement is real, the pressure is mounting, and the prototypes are multiplying. But very few of these efforts ever make it into production. For Dom Couldwell, Head of Field Engineering at DataStax, that gap between ambition and operational reality is where the real work begins, and where most companies stall.

No AI without data

The foundational truth, Couldwell argues, is that no amount of generative capability can compensate for poor-quality data. “GenAI is not going to magically make your data better,” he says. “If your data is poor to start with, that numerical representation is just a representation of poor data. None of it is going to work if you do not have a data strategy in place.”

The challenge is twofold. First, organisations assume their data is good enough. Second, they misunderstand how semantic search and vector databases work. Turning text, images, or documents into a high-dimensional vector space enables similarity searches at scale, but only if the underlying information is structured, clean, and differentiated.
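The differentiation problem is easy to see with a toy cosine-similarity check. The embeddings below are invented three-dimensional stand-ins (real embedding models produce hundreds or thousands of dimensions), but they illustrate why two near-identical manuals are almost indistinguishable in vector space:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Hypothetical embeddings: two near-identical phone manuals vs. a billing FAQ.
manual_a = [0.81, 0.40, 0.12]
manual_b = [0.80, 0.41, 0.13]
billing_faq = [0.10, 0.20, 0.95]

# The two manuals score close to 1.0, so a similarity search cannot
# reliably tell them apart; the FAQ scores much lower.
print(cosine_similarity(manual_a, manual_b))
print(cosine_similarity(manual_a, billing_faq))
```

If every document in a corpus scores near 1.0 against every other, retrieval degenerates into a coin flip, which is exactly the failure mode Couldwell describes.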

“You might have a lot of data, but if it is not well differentiated, you hit different problems,” Couldwell explains. “Imagine a telco with hundreds of phone manuals; vectorising all of them will not help if the content is too similar. The model cannot tell one from the other, and your agents will not get the right answers.”

This is where graph data offers a powerful alternative. While vectors capture semantic similarity, graphs represent relationships more effectively. “If your support content includes links to other documents with key answers, vectorisation alone will miss them,” Couldwell continues. “A graph can map the relationships and let you traverse that context dynamically.”
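Traversing that context dynamically can be as simple as a breadth-first walk over document links. This is a minimal sketch with an invented adjacency map, not any particular graph database's API:

```python
from collections import deque

# Hypothetical support-content graph: each document links to related documents.
links = {
    "setup-guide": ["troubleshooting", "warranty"],
    "troubleshooting": ["firmware-update"],
    "warranty": [],
    "firmware-update": [],
}

def related_context(start, max_hops=2):
    """Breadth-first traversal collecting documents within max_hops links."""
    seen, queue = {start}, deque([(start, 0)])
    while queue:
        doc, hops = queue.popleft()
        if hops == max_hops:
            continue
        for nxt in links.get(doc, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return seen

# Pulls in documents a pure vector lookup on "setup-guide" would miss.
print(sorted(related_context("setup-guide")))
```

Starting from the setup guide, the walk also surfaces the firmware-update document two links away, the kind of key answer Couldwell notes vectorisation alone will miss.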

This becomes particularly important as AI systems move from abstract reasoning to direct business utility. Couldwell notes that many customer-facing implementations fall apart because they are built on the assumption that the model has context. “The LLM might understand language, but if you do not feed it the right data in the right way, it is like asking an expert to guess in the dark,” he adds. “You need structure, and you need relevance.”

Real-time relevance needs more than training data

Organisations must also understand the difference between training data and real-time data. “You typically train a model once. It gives you patterns based on history,” says Couldwell. “But what it cannot tell you is what the customer did last week. That is where real-time augmentation comes in.”

The principle is known as retrieval-augmented generation (RAG). This hybrid approach enables enterprises to inject up-to-date, user-specific data into the AI inference process without requiring retraining. The result is better answers, stronger context, and improved safety. “You might not want to train your model on personal information for privacy reasons anyway,” he adds. “But you still want to use it in the flow of conversation.”
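The mechanics of RAG can be sketched in a few lines: retrieve fresh, user-specific records at inference time and prepend them to the prompt. The retriever and data here are stand-ins, not a specific product's API:

```python
# Hypothetical store of recent, user-specific events kept outside the model.
RECENT_EVENTS = {
    "cust-42": [
        "Ordered a model X handset on Monday",
        "Raised a billing query on Tuesday",
    ],
}

def retrieve(customer_id: str, query: str) -> list[str]:
    # Stand-in for a vector or graph lookup of up-to-date customer records.
    return RECENT_EVENTS.get(customer_id, [])

def build_prompt(customer_id: str, question: str) -> str:
    """Inject retrieved context into the prompt instead of retraining."""
    context = "\n".join(retrieve(customer_id, question))
    return (
        "Answer using ONLY the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )

prompt = build_prompt("cust-42", "What did I order this week?")
# The prompt now carries last week's activity; the base model is untouched
# and the personal data never enters its training set.
```

Because the personal data lives in the retrieval layer rather than the model weights, it can be updated, redacted, or deleted without touching the model.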

This hybrid model is not just about performance. It is also about risk. Companies want to avoid leaking customer data, retraining on sensitive information, or hardcoding business logic into foundation models. “Augmentation lets you keep things flexible,” Couldwell says. “You can change how you respond without retraining the entire model.

“This becomes crucial in domains like customer service, where the information changes frequently, and the tolerance for hallucination is low. You can train a great model, but if it does not know what the customer ordered last week or whether there is a promotion running today, it is going to fail at the moment of truth.”

You cannot scale without infrastructure and honesty

The lure of GenAI often blinds teams to their own infrastructure limitations. “It is very easy to get a prototype working in one region, but if you want global reach, low latency and reliability, you need to stretch your infrastructure,” Couldwell warns. “Where is your application? Where is your data? Do you have replication? What level of consistency do you need?”

Those questions are not new, but GenAI adds weight to them. Vector databases are proliferating, often without clear design principles for maintaining multi-region consistency or handling failures. The result is fragile systems built on experimental foundations. “Think about what you need today and what you will need tomorrow,” Couldwell warns. “That jump to production is where everything starts breaking.”

He is equally blunt about another blind spot: non-determinism. “With GenAI, you will not get the same answer every time,” he says. “Testing becomes harder, monitoring becomes critical. This is not traditional software, and organisations must adapt their thinking fast.”
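One practical adaptation is to replace exact-match tests with property checks: assert what every acceptable answer must contain, not its exact wording. The generator below is a toy stand-in for a non-deterministic model:

```python
import random

def generate_answer(seed=None):
    # Toy stand-in for an LLM: wording varies between calls.
    rng = random.Random(seed)
    openings = ["Your order ships on Tuesday.", "The order is due on Tuesday."]
    return rng.choice(openings)

def passes_contract(answer: str) -> bool:
    """Property check: the answer must state the delivery day and stay short.

    Exact string equality would fail intermittently; properties hold across
    all acceptable phrasings.
    """
    return "Tuesday" in answer and len(answer) < 200

# Sample repeatedly; every sample must satisfy the contract even though
# the exact wording differs from call to call.
assert all(passes_contract(generate_answer()) for _ in range(20))
```

The same idea extends to production monitoring: log each response, score it against the contract, and alert on the failure rate rather than on any single answer.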

Couldwell believes organisations are underestimating the significance of these architectural choices. “Latency, consistency, replication, these are not edge cases,” he adds. “They are your baseline for delivering anything useful. And if your monitoring is not up to scratch, you are flying blind with a system that learns and mutates in real time.”

Focus on impact, not novelty

The industry’s fixation on chatbots is understandable but unhelpful. Couldwell is clear that most chatbots were “rubbish but okay” before LLMs came along, and that remains true. “Yes, they are more empathetic now, but what is the ROI?” he says. “You are going from rubbish to better. That is not always transformative.”

The real power of GenAI lies in unglamorous tasks. “Pattern recognition, automation, back office, this is where the biggest value lies,” Couldwell continues. “It is not about AI creating poetry. It is about washing the dishes. To avoid wasting time on novelty, teams should start with the business problem. People are building cool things because they can. But can your business problem be described on one page? If it can, you probably do not need a multi-agentic autonomous system.”

Couldwell offers a framework to prioritise development efforts. “There are four buckets: traditional applications without AI; simple prompt-response tools; autonomous agents that monitor and act on data; and complex multi-agent systems,” he explains. “The more complex the solution, the higher the cost and maintenance burden. So start with the outcome you need, not the tech you want to play with.”

Balance agility with long-term resilience

Couldwell remains pragmatic about how enterprises can avoid fragmentation and scale GenAI responsibly. A strong centre of excellence, he argues, can play three roles: referee, policeman, and teacher. “You do not want an ivory tower. Focus on enablement. Let developers consume AI as a service without burdening them with training, monitoring, or infrastructure.”

However, flexibility must not come at the cost of architectural integrity. “Composability matters,” Couldwell adds. “If you build something that cannot evolve, you will regret it in six months. Design for plugability, testability, and modularity. Swap in new models. Swap out components. Keep your options open.”

Finally, Couldwell warns against extremes. “You can outsource everything to a black-box vendor, or you can build it all yourself and drown in complexity,” he concludes. “The answer is to understand where your IP lies, what your team is good at, and what you need to own.

“Looking forward, costs will fall, and model diversity will grow. Small language models will matter more. Hybrid models will become the norm. Chips will get faster. However, none of that matters if you cannot reach production. So, it is crucial to be honest about your data. Be honest about your skills. Be honest about the value. AI will not save you from yourself. It will only accelerate what you already have, good or bad.”
