Your Data Team Is Shipping Faster Than It Can Be Trusted. That’s a Problem.

dbt Labs released its fourth annual State of Analytics Engineering Report this week, and one number stands out: trust in data has become the single highest organisational priority among data professionals, rising from 66% to 83% in a single year. That is the steepest year-on-year increase of any metric the survey has ever recorded.

On the surface, it reads as a straightforward endorsement of data quality as a business priority. The context makes it more uncomfortable. Trust has shot up the agenda not because organisations have developed a new appreciation for clean pipelines, but because the consequences of dirty ones have changed. AI agents are running on top of organisational data. When the data is wrong, the agent does not hesitate. It acts – and 71% of data professionals surveyed are now directly concerned about incorrect or hallucinated outputs reaching stakeholders as a result.

The gap hiding inside the productivity numbers

The data world has embraced AI tooling at pace. 72% of data professionals now prioritise AI-assisted coding in their development workflows – a figure that reflects a genuine shift in how engineering work gets done. Output has increased. Delivery has accelerated. By most surface measures, data teams are performing better than they were twelve months ago.

But the same survey finds that only 24% prioritise AI-assisted pipeline management: the testing, observability, and validation disciplines that determine whether what gets shipped is actually reliable. That gap – between the investment in producing data faster and the investment in governing what gets produced – is where the trust problem lives.

dbt Labs describes the dynamic as acceleration without stabilisation, which compounds risk. It is a precise framing. The issue is not speed. It is speed without the corresponding investment in what makes that speed safe. Persistent obstacles like ambiguous data ownership, cited by 41% of respondents, and poor data quality remain largely unchanged year on year, even as the volume of data being produced and the number of systems consuming it have both grown substantially.

Why this matters more now than it did before

Data quality problems are not new. What is new is the failure mode. A reporting tool or dashboard that surfaces a bad number gets questioned by a human before it drives a decision. That feedback loop – slow, imperfect, but real – has historically been the safety net underneath weak data governance. Agentic AI removes it.

An agent operating autonomously across scheduling, communications, record modification, or downstream workflow triggers does not pause to sanity-check the data it is acting on. Informatica’s CDO Insights 2026 report, drawn from 600 data leaders globally, found that three out of four organisations admit their governance has not kept pace with AI adoption. For agentic AI specifically, half of leaders still cite data quality and retrieval as their biggest operational challenge. The Salesforce Connectivity Report 2026 found that 50% of AI agents currently operate in isolated silos rather than as part of a coherent, integrated system – which means errors do not just propagate, they propagate in ways that are difficult to trace.

The Cloudera and Harvard Business Review Analytic Services study, published in March 2026 and drawing on over 230 senior leaders involved in AI data decisions, quantifies the exposure directly: only 7% of enterprises say their data is completely ready for AI adoption, and more than a quarter report their data is not very – or not at all – ready. That figure has not attracted the attention it deserves, partly because organisations have been focused on AI model selection and infrastructure investment rather than the foundations those choices sit on.

What the returns data actually shows

Gartner’s April 2026 AI report offers the sharpest view of where this lands commercially. Only 39% of technology leaders believe their current AI efforts will improve financial performance. The analysis of what separates organisations that do see returns is unambiguous: they invest up to four times more in data quality, governance, skilled talent, and change management than those that do not. Model selection, infrastructure choices, platform decisions – these are not where the gap opens up. The gap opens up in the foundations.

Informatica found that 57% of organisations admit data reliability is a top barrier to their AI programmes even as investment in those programmes continues to grow. The pattern is worth naming clearly. Organisations are funding AI on top of data infrastructure that they know is not reliable, in the expectation that the AI will somehow compensate. It does not. Gartner’s finding that only 23% of technology leaders feel confident in their organisation’s ability to manage governance and security for AI systems reflects an industry that is, in aggregate, building faster than it is building well.

What ‘agent-ready’ data infrastructure actually requires

The organisations pulling ahead on this are not running separate data quality programmes alongside their AI programmes. They are treating data governance as a prerequisite for agent deployment, not a parallel workstream. The distinction matters because the sequencing matters. An agent that goes into production on ungoverned data creates a remediation problem that is significantly harder to solve than the governance problem that preceded it.

In practice, agent-ready data infrastructure requires four things to be working together. Ownership needs to be unambiguous – every dataset has a named owner accountable for its quality and currency. Lineage needs to be documented so that when an agent acts on data, the provenance of that data is traceable. Validation needs to be embedded in the pipeline, not applied retrospectively when something goes wrong. And observability needs to extend to agent behaviour, not just data movement – so that anomalies in how agents are consuming and acting on data are detectable before they compound.
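As a rough illustration of the first three requirements, the sketch below gates a dataset on ownership, lineage, and recent validation before an agent is allowed to read it. It is a minimal example under stated assumptions, not a reference implementation: the DatasetRecord fields, the readiness_gaps function, and the 24-hour validation window are all hypothetical names and values chosen for the example, and a real data catalogue would expose this metadata in its own way.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass
class DatasetRecord:
    """Hypothetical catalogue entry for a dataset an agent may read."""
    name: str
    owner: Optional[str] = None                                # named, accountable owner
    upstream_sources: List[str] = field(default_factory=list)  # documented lineage
    last_validated_at: Optional[datetime] = None               # last passing pipeline check


def readiness_gaps(ds: DatasetRecord, now: datetime,
                   max_validation_age: timedelta = timedelta(hours=24)) -> List[str]:
    """Return the reasons this dataset should not yet be exposed to an agent."""
    gaps = []
    if not ds.owner:
        gaps.append("no named owner")
    if not ds.upstream_sources:
        gaps.append("lineage not documented")
    if ds.last_validated_at is None:
        gaps.append("never validated in the pipeline")
    elif now - ds.last_validated_at > max_validation_age:
        # The 24-hour default is an assumption; pick a window that suits the data.
        gaps.append("validation is stale")
    return gaps


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    contracts = DatasetRecord(name="supplier_contracts")
    print(readiness_gaps(contracts, now))
```

An empty list means the dataset clears the gate. Anything else is a reason to hold the agent back, which is the practical meaning of treating governance as a prerequisite rather than a parallel workstream.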

None of these are technically exotic. Most data teams understand them. The challenge is that AI adoption timelines have consistently outpaced the maturity of the governance work, and the pressure to ship agents has made it harder, not easier, to insist on the foundational work first. The dbt Labs finding that 83% of data professionals now rank data trust as the top organisational priority suggests the industry has recognised the problem. The 71% concerned about incorrect outputs suggests they are also already living with its consequences.

Q&A: Data Foundations for the Agentic Era

Why has data trust suddenly become the top organisational priority for data teams when it has been a known problem for years?
The problem has not changed – the failure mode has. Bad data in a reporting environment usually gets caught by a human before it drives a decision. Bad data in an agentic environment gets acted on autonomously, often across multiple downstream systems, before anyone knows something has gone wrong. The dbt Labs 2026 State of Analytics Engineering Report found that 71% of data professionals are now directly concerned about incorrect or hallucinated outputs reaching stakeholders. That is not a new sensitivity to data quality. It is a response to a new set of consequences.

What does the gap between 72% AI-assisted coding adoption and 24% AI-assisted pipeline management adoption actually represent in practice?
It represents a data team that is producing more, faster, on foundations that are not being strengthened at the same rate. Coding tools accelerate output. Testing, observability, and validation – the pipeline management disciplines – determine whether that output can be trusted. When investment in the former significantly outpaces investment in the latter, quality issues accumulate quietly. They tend to surface at the worst possible moment, which in 2026 is increasingly when an agent is already in production and already acting on the data.

Only 7% of enterprises say their data is completely ready for AI. What are the other 93% actually missing?
Mostly governance discipline rather than technical capability. The Cloudera and Harvard Business Review study found that the gaps are consistent: unclear data ownership, insufficient lineage documentation, inconsistent quality validation, and access controls that have not been reviewed for a context where agents rather than humans are the primary consumers. Most organisations have the tooling to address these things. What they have not done is treat the governance work as a prerequisite for agent deployment rather than a parallel workstream that can catch up later.

What does it mean in practice for an agent to act on bad data?
Consider an agent managing supplier communications. If the underlying data contains stale contract terms, an incorrect contact record, or an unresolved duplication between two supplier profiles, the agent does not flag an anomaly – it sends the wrong communication to the wrong contact with the wrong terms, potentially triggering a downstream procurement or compliance issue. In a human workflow, the same failure would usually be caught during review. In an agentic workflow, it is already done. The Salesforce Connectivity Report 2026 found that 50% of AI agents operate in isolated silos, meaning these errors are likely to be invisible until they surface somewhere unexpected.
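The supplier scenario above can be sketched as a pre-action check that puts the human review step back in whenever the record looks unsafe. This is an illustrative example only: SupplierRecord, blocking_issues, and send_or_escalate are hypothetical names, the field layout is assumed, and the duplicate test (a case-insensitive match on legal name) is deliberately crude compared with real entity resolution.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple


@dataclass
class SupplierRecord:
    """Hypothetical supplier row; field names are illustrative only."""
    supplier_id: str
    legal_name: str
    contact_email: str
    contract_valid_until: date


def blocking_issues(record: SupplierRecord, all_records: List[SupplierRecord],
                    today: date) -> List[str]:
    """Collect reasons the agent should not act on this record unattended."""
    issues = []
    if record.contract_valid_until < today:
        issues.append("contract terms expired")
    if "@" not in record.contact_email:
        issues.append("contact email missing or malformed")
    # Crude duplicate check: another profile with the same normalised legal name.
    name = record.legal_name.strip().lower()
    if any(r.supplier_id != record.supplier_id
           and r.legal_name.strip().lower() == name
           for r in all_records):
        issues.append("possible duplicate supplier profile")
    return issues


def send_or_escalate(record: SupplierRecord, all_records: List[SupplierRecord],
                     today: date) -> Tuple[str, List[str]]:
    """Decide whether the agent may send, or must hand the case to a person."""
    issues = blocking_issues(record, all_records, today)
    if issues:
        return ("escalate_to_human", issues)
    return ("send", [])
```

The design choice is that the agent never resolves a data problem by itself. It either acts on a record that passes every check or hands the case, with its reasons, to a person.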

Gartner found organisations seeing AI returns invest up to four times more in data foundations. Does that mean smaller organisations cannot compete?
Not necessarily, because the investment is proportional rather than absolute. The Gartner finding is about prioritisation as much as spend. A smaller organisation with a well-governed, modest data estate will outperform a large one with sprawling, poorly governed infrastructure when it comes to deploying agents reliably. Scale creates more surface area for governance failures, not more immunity to them.

What are the four things agent-ready data infrastructure requires, and which is most commonly missing?
Unambiguous ownership, documented lineage, embedded validation, and observability that extends to agent behaviour rather than just data movement. The most commonly missing element, in our experience, is the last one. Most organisations have some form of pipeline monitoring. Far fewer have built the observability layer that tells you how agents are consuming data and whether their behaviour is consistent with what you would expect – which is the layer that makes anomalies detectable before they compound into a production incident.
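To make the observability point concrete, the sketch below flags an agent whose activity in the current period departs sharply from its own recent baseline. It is a simplified example with assumed inputs: is_anomalous is a hypothetical helper, the history is taken to be a list of per-period action counts for one agent, and the z-score threshold of 3.0 is an arbitrary starting point rather than a recommendation.

```python
from statistics import mean, pstdev
from typing import Sequence


def is_anomalous(history: Sequence[float], current: float,
                 threshold: float = 3.0) -> bool:
    """Flag a count that deviates sharply from the agent's own baseline.

    history: recent per-period action counts for one agent (assumed input).
    current: the count observed in the period being checked.
    """
    if len(history) < 2:
        return False  # not enough baseline to judge either way
    baseline = mean(history)
    spread = pstdev(history)
    if spread == 0:
        # A perfectly flat baseline: any change at all is worth a look.
        return current != baseline
    return abs(current - baseline) / spread > threshold


if __name__ == "__main__":
    emails_sent_per_day = [100, 102, 98, 101, 99]
    print(is_anomalous(emails_sent_per_day, 101))
    print(is_anomalous(emails_sent_per_day, 400))
```

A check like this says nothing about whether an individual action was correct. It only surfaces behaviour that is inconsistent with what you would expect, early enough for someone to investigate before it compounds.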

Closing the gap before it closes you

The Gartner data on AI returns is not a commentary on the quality of AI technology. It is a commentary on the quality of the infrastructure underneath it. Organisations that have invested in data foundations are seeing returns. Those that have not are carrying both the cost of the AI investment and the growing liability of what happens when agents act on data that cannot be trusted.

At Vertex Agility, our Data practice is built around exactly this challenge. We help technology leaders move from data infrastructure that stores and delivers to data infrastructure that is governed, trusted, and agent-ready – the kind of foundation where AI investment produces reliable returns rather than expensive uncertainty. If your organisation is accelerating AI deployment on data foundations that have not kept pace, we can help you close that gap before it becomes a production problem.

Get in touch to find out how we can help.