99% Failure Rate: The Data Quality Problem Nobody Wants to Admit
There's a paradox unfolding across enterprises today. Organizations are deploying conversational AI with noticeably less resistance than they face with traditional BI implementations. Users tend to adopt it more eagerly. And in doing so, they're amplifying every hidden data quality problem that companies postponed addressing.
I've watched this play out across dozens of client engagements. Years of BI migrations taught me that change management in tech is painful: teams resist and cling to old processes. Getting an organization to accept new ways of working requires sustained effort.
But AI is different in one important way: users pick up these tools and start using them far more quickly than they did older BI platforms. People aren't fighting the change the way they typically do with major software implementations. At least initially.
Which creates a specific problem.
The Paradox of Low-Friction Adoption
Adoption isn't the same as success. When these AI initiatives take off and organizations start producing answers from incorrect or inaccurate data, the bad answers sprawl across the business. The AI didn't fail; it succeeded. It distributed unreliable information at an organizational scale and velocity that wasn't possible before.
Traditional BI worked through a filter. A request came in, an analyst handled it, and that analyst understood the context. She knew which definitions were canonical and which were approximations. She caught inconsistencies. The bottleneck protected data quality.
AI removes that protection. A vice president asks a question, the chatbot answers with confidence, a decision gets made. Hours later, another team questions the number. It turns out the underlying metric was calculated differently across systems. The damage spreads through the organization before anyone realizes the foundation was never trustworthy. Data quality matters even more in this environment because there are fewer guardrails and fewer controls in place; the technology is simply too new. The critical point I keep making to leaders is simple: if you don't address data quality early on, it's very hard to walk it back later.
Once users have built the tool into their daily work, rolling it back means disappointing them and admitting the foundation was weaker than represented. The organizational cost of reversal exceeds the cost of doing it right initially.
The Vast Chasm Between Experiment and Production
The numbers reveal how acute this problem is. According to IDC research, out of 400 experiments, 40 make it into pilots, and only four make it into production. That's 1%.
One percent. This isn't a failure of AI technology itself. Organizations pursuing AI early were under pressure to prove ROI or justify continued spending, so they built proofs of concept. Some worked. Many didn't.
I've seen the stall point repeatedly in my work. When organizations try to scale beyond a single use case or department, the data foundation cracks. They've succeeded on pointed use cases or proofs of concept built against a subset of their data or within a particular department, but they need to close the gaps in their foundation before rolling AI out to the enterprise.
Organizations can build impressive AI pilots against clean, well-defined data in a single business unit. Enterprise scale demands something entirely different: data that is consistent across domains, definitions everyone agrees on, governance structures, clear ownership, and visibility into data quality.
What Was Once Optional Is Now Essential
This is why investment priorities shifted so dramatically. A lot of things that used to be nice-to-haves in a data platform have suddenly become must-haves for AI.
Data cataloging and lineage used to be IT infrastructure: tools for troubleshooting, useful but not urgent. Now they're basically a prerequisite for cognitive computing of every kind (AI, ML, LLMs, neural networks, and emerging agentic systems). The reason is that humans navigate ambiguity in ways machines cannot. A human analyst asks clarifying questions, applies judgment, and negotiates agreement on what a metric means.
If you want AI to understand your business, you need clear definitions and metadata in context. You can have a human analyst go around the office and get agreement on a KPI. An AI assistant can't do that.
Machines need that work done upfront, with definitions documented, lineage clear, and consistency enforced. The same applies to data quality observability and semantic documentation. These are foundational capabilities that keep a platform stable. They used to be viewed as operational hygiene: important, but not always tied directly to a business outcome, so not necessarily prioritized. Without them, you can't scale AI safely or confidently across the organization.
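To make that concrete, here is a minimal sketch of what "definitions documented, lineage clear, consistency enforced" can look like before any AI tooling touches the data. The metric, field names, and owner address are illustrative assumptions for this sketch, not a reference to any particular catalog or semantic-layer product:

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class MetricDefinition:
    """One entry in a semantic layer: the documented, machine-readable version
    of what a human analyst would otherwise explain in a hallway conversation."""
    name: str
    definition: str            # plain-language business meaning
    sql_expression: str        # the canonical calculation logic
    source_tables: list[str]   # lineage: where the inputs come from
    owner: str                 # who is accountable for this definition


# Illustrative metric: the kind of number that gets calculated differently
# across systems when nobody has written the definition down.
net_revenue = MetricDefinition(
    name="net_revenue",
    definition="Gross bookings minus refunds and discounts, recognized monthly.",
    sql_expression="SUM(gross_amount) - SUM(refund_amount) - SUM(discount_amount)",
    source_tables=["finance.bookings", "finance.refunds"],
    owner="finance-data@company.example",
)


def readiness_gaps(metric: MetricDefinition) -> list[str]:
    """A basic 'consistency enforced' gate: don't expose a metric to an AI
    interface until definition, lineage, and ownership are all documented."""
    gaps = []
    if not metric.definition.strip():
        gaps.append(f"{metric.name}: missing business definition")
    if not metric.source_tables:
        gaps.append(f"{metric.name}: lineage not documented")
    if not metric.owner:
        gaps.append(f"{metric.name}: no accountable owner")
    return gaps


print(readiness_gaps(net_revenue))  # [] -> safe for machine consumption
```

The point of a gate like this isn't the code; it's that the clarifying questions a human analyst would ask get answered once, in writing, before a chatbot starts answering them badly at scale.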
What a Data Product Actually Is
The organizations that successfully scale AI have one thing in common: they treat data as a product. It’s a phrase we’ve all heard in recent years, but it’s not purely metaphorical or philosophical. It’s a practice, not an abstraction. As Stewart Bond, Research Vice President of Data Intelligence and Integration Software Service at IDC, has noted, this means:
"Treating internal data as if it were something being sold with appropriate service level agreements of quality control and applicability to internal use, especially when it comes to AI."
A data product has three characteristics.
1. Access: "It needs to be accessible within all the appropriate security and privacy classifications that the product exists within. All of those rules and policies need to be in place before that data product could be accessed."
2. Business Value: "It can't just be a table and a database. It needs to be something that is part of a business process that drives some sort of business outcome and business value."
3. Ownership: "Someone in the organization needs to be accountable for that data product from its inception through to its sunsetting and all the version control that needs to happen in between." This is critical. A data product isn't orphaned. It has an owner responsible for its evolution, quality, and eventual retirement.
That accountability matters profoundly.
Product Management Discipline Applied to Data
Organizations with data product discipline have clear domain ownership, versioning and release practices for data, service level agreements defining quality and availability, and semantic layers documenting what metrics mean and the business logic governing them.
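As a rough illustration of that discipline, here is a minimal sketch of a data product "contract" that captures ownership, versioning, and service levels in one place, with a simple observability check against it. The product name, fields, and thresholds are hypothetical examples, not a prescribed schema:

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class DataProductContract:
    """A lightweight contract for one data product: who owns it, what version
    consumers are on, and what service levels they can rely on."""
    name: str
    domain: str                  # e.g., customer, finance, operations
    owner: str                   # accountable from inception to sunsetting
    version: str                 # released (and rolled back) like code
    freshness_sla_hours: int     # maximum acceptable data lag
    completeness_sla_pct: float  # minimum acceptable completeness
    consumers: tuple[str, ...]   # known downstream users, including AI assistants


customer_360 = DataProductContract(
    name="customer_360",
    domain="customer",
    owner="crm-data-team@company.example",
    version="2.3.0",
    freshness_sla_hours=24,
    completeness_sla_pct=99.5,
    consumers=("churn_model", "support_chatbot", "quarterly_board_report"),
)


def sla_breached(observed_lag_hours: float,
                 observed_completeness_pct: float,
                 contract: DataProductContract) -> bool:
    """Compare what monitoring actually observed against what the contract
    promises, so problems route to the owner before they reach consumers."""
    return (observed_lag_hours > contract.freshness_sla_hours
            or observed_completeness_pct < contract.completeness_sla_pct)


print(sla_breached(30, 98.9, customer_360))  # True -> alert the owner, not the VP
```

The specific tooling matters less than the fact that every product has a named owner, a version consumers can pin to, and a measurable promise someone is monitoring.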
When a new AI initiative arrives, these organizations treat it as a feature request against an existing product. They know how to plan for it, resource it, extend existing playbooks. The AI initiative becomes just another feature on top of their existing data product rather than a wholesale rethinking of how they manage their data.
Request-driven organizations operate differently. They build data assets on demand in response to specific requests. They have no product playbook, no versioning practice, no clear ownership. When AI arrives as an imperative, they're forced to build foundational disciplines while delivering the AI capability simultaneously.
What Requires Attention Now: Product Management for Data
For data leaders navigating pressure to "move faster on AI," the job isn't solely making a technology selection; it's also establishing product management discipline for data. Can you deploy a conversational AI interface today without creating data quality catastrophes? If the answer is that you'd need to resolve definitional conflicts first, you're not ready. Do you have explicit ownership of data, defined by domain rather than by report? Who owns customer data, financial data, operational data? Who's accountable when an AI system produces wrong answers?
Have you documented business logic in a semantic layer so both AI and humans understand what metrics mean, their origins, and the logic governing them? Do you have versioning and release practices for data so you can roll back changes the way you would code?
And critically: do you have the organizational discipline to treat data as a product, not a project? With versioning, with SLAs, with clear producers and consumers, with owners accountable end-to-end?
Watch the IDC Webcast
To dive deeper into these topics, check out the IDC webcast with Stewart Bond and DI Squared COO Trey Smith. As a companion, pick up your free copy of the IDC InfoBrief to learn how leaders are handling data management in the new normal.