The hidden challenges of data management in AI


AI’s transformative potential hinges on robust data management, yet many organisations struggle with data trust, integration, and governance. As Mark Venables explains, addressing these challenges is critical to unlocking reliable, high-impact AI outcomes and driving meaningful digital transformation.

Artificial intelligence (AI) has quickly become a defining force in digital transformation. From predictive maintenance in manufacturing to personalised customer experiences in retail, AI’s potential to drive efficiency and innovation is undisputed. However, the rush to adopt AI often overlooks its foundational requirement: robust data management. Even the most advanced AI models cannot deliver meaningful results without trusted, high-quality data.

“AI is only as good as the data it relies on. Presently, 80 per cent of data used for decision-making is mistrusted,” Jay Limburn, Chief Product Officer at Ataccama, says. “This mistrust represents a significant barrier for organisations attempting to make confident, data-driven decisions. To unlock the true potential of AI, we need to instil trust in data by ensuring it is accurate, reliable, and complete.”

Data management is no longer a back-office function. It has become a strategic priority, enabling businesses to maximise the value of their data assets. Yet, challenges such as siloed information, poor integration practices, and inadequate governance continue to plague organisations, undermining their ability to realise AI’s potential.

Why data trust matters

“Data trust is all about creating confidence in data to support informed decision-making,” Limburn continues. “For 10 to 20 years, organisations have viewed data as an asset, especially in analytics. Now, they are seeking to unlock that asset’s value fully. Companies like Uber demonstrate this shift; they are not just in the taxi business, they are data-driven companies that disrupt markets by using data to drive decisions.”

Data trust ensures that users can rely on the data they access, enabling confident decision-making and better business outcomes. However, data trust requires more than just accuracy. Organisations must address issues such as completeness, consistency, and transparency. Without these elements, even the most sophisticated AI models will produce unreliable or skewed results.

“Trust in data is the foundation of any successful AI initiative,” Limburn adds. “If you cannot trust the data feeding your AI models, you cannot trust the outputs they generate. This lack of trust extends across industries, from healthcare to financial services. Building confidence in data is a technical challenge and a strategic imperative.”

Breaking down silos

It is rare for companies to have all their data in one place. Data is often scattered across various systems, departments, and locations. “Even when organisations centralise data into platforms like Snowflake or Databricks, the challenge remains: data quality and trust must be established before that data can be fully leveraged,” Limburn explains. “Data silos remain a persistent issue, preventing organisations from fully leveraging their data assets. Siloed information often results in duplicated efforts, inconsistent data sets, and a lack of visibility across the organisation. These challenges are exacerbated as companies adopt more complex IT landscapes, incorporating legacy and modern cloud platforms.”

While centralising data is a standard solution, Limburn emphasises the importance of balancing centralisation with decentralised ownership. “There is a trend toward a ‘data mesh’ model, where data ownership is distributed across business units but governed by a central strategy. This approach balances the need for control with the flexibility required for innovation.”

Breaking down silos also requires robust data integration practices. “Not all data will ever be in one place, so a multi-vendor approach is inevitable,” Limburn says. “Ensuring trust across this complex vendor landscape requires a data management framework prioritising quality and governance.”

Governance and quality as the pace car for AI

“With Gen AI and AI initiatives becoming critical for business transformation, there is an increased focus on data strategies,” Limburn notes. “Governance and quality are non-negotiable prerequisites. They form the foundational layer upon which any AI strategy is built, ensuring that the data feeding into AI systems is accurate, consistent, and reliable.”

Governance is often seen as a compliance requirement, but Limburn argues it is also a strategic enabler. “Governance identifies where data resides, but adding a trust layer ensures that this data is actionable and reliable,” he says. “This enables organisations to go beyond compliance and use data to drive business outcomes.”

Data quality is equally important, as AI models are only as good as the data they consume. “We consolidate all data, allowing organisations to identify what they have and establish a governance foundation,” Limburn explains. “This approach includes measuring accuracy, completeness, and precision, as well as tracking data lineage. Knowing where data comes from adds value, especially when differentiating between reliable sources and undocumented local databases.”
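The metrics Limburn mentions can be made concrete. As a minimal illustration (not Ataccama’s implementation), completeness and format accuracy for a tabular dataset might be sketched in Python; the column names and email pattern are assumptions for the example:

```python
# Sketch of two common data-quality metrics: completeness (how much
# of a field is populated) and format accuracy (how much of the
# populated data matches an expected pattern). Illustrative only.
import re

records = [
    {"id": 1, "email": "ana@example.com", "country": "UK"},
    {"id": 2, "email": "not-an-email",    "country": "UK"},
    {"id": 3, "email": None,              "country": None},
]

def completeness(rows, field):
    """Share of rows where the field is present and non-empty."""
    filled = sum(1 for r in rows if r.get(field) not in (None, ""))
    return filled / len(rows)

EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def format_accuracy(rows, field, pattern):
    """Share of non-empty values that match an expected pattern."""
    values = [r[field] for r in rows if r.get(field)]
    if not values:
        return 0.0
    return sum(1 for v in values if pattern.match(v)) / len(values)

print(completeness(records, "email"))               # 2 of 3 rows filled
print(format_accuracy(records, "email", EMAIL_RE))  # 1 of 2 values valid
```

Scores like these, tracked over time and tied back to each dataset’s lineage, are what separate a documented, reliable source from an undocumented local database.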

Limburn likens governance and quality to a Formula One pace car. “The data strategy is like a pace car that sets the track for the AI strategy to race ahead. Without this foundational layer, AI initiatives will struggle to deliver meaningful results.”

The role of machine learning in data management

Machine learning plays a significant role in automating data management. “Since 2016, we have used ML to streamline repetitive tasks like data classification and quality checks,” Limburn continues. “Recently, we introduced generative AI to automate content generation, including business rules, SQL statements, and documentation.”

Automation enhances efficiency and ensures that data is continuously monitored and improved. By leveraging ML and AI, organisations can identify inconsistencies, flag sensitive information, and proactively address compliance risks. These capabilities are critical in today’s complex regulatory environment, where businesses must navigate stringent data protection laws and standards.
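Flagging sensitive information is one such monitored check. A rule-based sketch of the idea, in Python, might look like the following; the patterns are illustrative assumptions, not a production PII detector or any vendor’s method:

```python
# Minimal sketch of sensitive-data flagging: scan text values for
# patterns that suggest personally identifiable information (PII).
# The two patterns below are illustrative assumptions only.
import re

PII_PATTERNS = {
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "uk_phone": re.compile(r"\b0\d{4}\s?\d{6}\b"),
}

def flag_sensitive(text):
    """Return the names of PII patterns found in a text value."""
    return [name for name, rx in PII_PATTERNS.items() if rx.search(text)]

print(flag_sensitive("Contact: jay@example.com, 01234 567890"))
print(flag_sensitive("No personal data in this field"))
```

In practice, ML-assisted platforms extend this idea with learned classifiers and run it continuously across thousands of databases rather than one string at a time.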

“Generative AI allows us to automate previously manual processes, freeing up data teams to focus on higher-value activities,” Limburn says. “For example, our platform can automatically generate business rules and documentation, reducing the workload for data teams while improving accuracy and consistency.”

The evolving role of the Chief Data Officer

The Chief Data Officer (CDO) role has shifted from purely technical to value-driven. CDOs are now expected to demonstrate how data can drive business outcomes. This requires a blend of business acumen and technical expertise, as well as the ability to align data strategies with organisational goals.

“Today’s CDOs are tasked with navigating increasingly complex data landscapes,” Limburn explains. “They must balance technical requirements with business priorities, ensuring data strategies support operational efficiency and innovation. This shift reflects the growing recognition of data as a strategic asset.

“For non-technical stakeholders like CMOs, data governance must be positioned as a tool that enhances their performance. Collaboration between CDOs and business leaders is essential, ensuring data strategies align with broader organisational objectives.”

A case in point: T-Mobile’s transformation

T-Mobile’s experience demonstrates the power of effective data management. After a significant data breach in 2021, the telecommunications giant partnered with Ataccama to overhaul its data management approach. Implementing Ataccama’s platform enabled T-Mobile to scan and classify sensitive information across thousands of databases, enhancing data trust and compliance.

“Our automated quality checks dramatically reduced data accuracy issues,” says Daniel West, T-Mobile’s Data Management Lead. “This self-improving, closed-loop solution now scans over 22,000 databases and 5,000 applications, encompassing more than 8 petabytes of data. The remarkable results have saved us $350 million by mitigating PII leakage risk and $50 million by eliminating redundant systems.”

Turning chaos into mastery

“The future of AI lies in trust and automation,” Limburn concludes. “AI is now central to every business leader’s agenda, but its success depends on a solid data strategy. Without that foundational layer, any AI initiative will struggle to deliver meaningful results.”

The message is clear for organisations looking to harness AI’s potential: to achieve AI mastery, they must first master their data. By prioritising trust, integration, and governance, businesses can unlock AI’s full potential and drive transformative change in an increasingly data-driven world.
