Agentic AI has big trust issues

Enterprises are deploying AI agents at a rapid pace, but serious doubts about agentic AI accuracy suggest potential disaster ahead, according to many experts.

The irony facing AI agents is that they need decision-making autonomy to provide full value, but many AI experts still see them as black boxes, with the reasoning behind their actions invisible to deploying organizations. This lack of decision-making transparency creates a potential roadblock to the full deployment of agents as autonomous tools that drive major efficiencies, they say.

The trust concerns voiced by many AI practitioners don’t seem to be reaching potential users, however, as many organizations have jumped on the agent hype train.

About 57% of B2B companies have already put agents into production, according to a survey released in October by software marketplace G2, and several IT analyst firms expect huge growth in the AI agent market in the coming years. For example, Grand View Research projects a compound annual growth rate of nearly 46% between 2025 and 2030.

Many organizations adopting agents don’t yet grasp how opaque the technology can be without the right safeguards in place, AI experts suggest. And even as guardrails roll out, most current tools aren’t yet sufficient to stop agent misbehavior.

Misunderstood and misused

Widespread misunderstandings about the role and functionality of agents could hold back the technology, says Matan-Paul Shetrit, director of product management at agent-building platform Writer. Many organizations view agents as similar to straightforward API calls, with predictable outputs, when users should treat them more like junior interns, he says.

“Like junior interns, they need certain guardrails, unlike APIs, which are a relatively simple thing to control,” Shetrit adds. “Controlling an intern is actually much harder, because they can knowingly or unknowingly do damage, and they can access or reference pieces of information that they shouldn’t. They can hear Glenda talking to our CIO and hear something that is proprietary information.”

The challenge for AI agent developers and the enterprises using them will be managing the sheer number of agents likely to be deployed, he says.

“You can very easily imagine that an organization of 1,000 people deploys 10,000 agents,” Shetrit contends. “They’re no longer an organization of 1,000 people, they’re an organization of 11,000 ‘people,’ and that’s a very different organization to manage.”

For huge corporations like banks, the agent population could reach 500,000 over time, Shetrit surmises — a situation that would require entirely new approaches to organizational resource management and IT observability and supervision.

“That requires rethinking your whole org structure and the way you do business,” he says. “Until we as an industry solve that, I don’t believe that agent tech is going to be widespread and adopted in a way that delivers on the promise of agents.”

Many organizations deploying agents don’t yet realize there’s a problem that needs to be solved, adds Jon Morra, chief AI officer at advertising technology provider Zefr.

“It’s not well understood in the zeitgeist how many trust issues there are with agents,” Morra says. “The idea of AI agents is still relatively new to people, and a lot of times they’re a solution in need of a problem.”

In many cases, Morra argues, a simpler, more deterministic technology can be deployed instead of an agent. Many organizations deploying the large language models (LLMs) that power agents still appear to lack a basic understanding of the risks, he says.

“People have too much trust in the agents right now, and that’s blowing up in people’s faces,” he says. “I’ve been on a number of calls where people who are using LLMs are like, ‘Jon, have you ever noticed that they get math wrong or sometimes make up stats?’ And I’m like, ‘Yeah, that happens.’”

While many AI experts see faith in agents improving over the long term as AI models improve, Morra believes complete trust will never be warranted because AI will always have the potential to hallucinate.

Workflow friction from autonomy distrust

While Morra and Shetrit believe AI users don’t understand the agent transparency issue, G2’s October research report notes growing trust in agents to perform some tasks, such as autoblocking suspicious IPs or rolling back failed software deployments. Even so, 63% of respondents say their agents need more human supervision than expected. Fewer than half of those surveyed say they trust agents in general to make autonomous decisions, even with guardrails in place, and only 8% are comfortable giving agents total autonomy.

Tim Sanders, chief innovation officer at G2, disagrees with some of the warnings: He sees a lack of trust in agents as more of a problem than a lack of transparency in the technology. While distrust of a new technology is natural, the promise of agents is in their ability to act without human intervention, he says.

The survey shows nearly half of all B2B companies are buying agents but not giving them real autonomy, he notes. “This means human beings are having to evaluate and then approve every action,” Sanders says. “And that seems to defeat the entire purpose of adopting agents for the sake of efficiency, productivity, and velocity.”

This trust gap could be costly to organizations that are too cautious with agents, he contends. “They will miss out on billions of dollars of cost savings because they have too many humans in the loop, creating a bottleneck inside agentic workflows,” Sanders explains. “Trust is hard-earned and easily lost. However, the economic and operational promise of agents is actually pushing growth-minded enterprise leaders to extend trust rather than retreat.”
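To make the bottleneck Sanders describes concrete, here is a minimal sketch of a per-action approval gate, in which anything outside a small pre-agreed allowlist waits on a human reviewer. The action types, allowlist, and queue logic are illustrative assumptions, not drawn from G2 or any vendor’s product.

```python
# Illustrative sketch: a per-action approval gate for an agent's proposed actions.
# Anything outside a narrow allowlist is held for a human reviewer, which is
# where the workflow friction accumulates. All names and rules are hypothetical.
from dataclasses import dataclass


@dataclass
class AgentAction:
    kind: str        # e.g. "block_ip", "rollback_deploy", "send_email"
    target: str
    rationale: str


# Narrow, low-risk actions the organization has agreed the agent may take autonomously.
AUTONOMOUS_ALLOWLIST = {"block_ip", "rollback_deploy"}


def requires_human_approval(action: AgentAction) -> bool:
    """Return True when the action falls outside the agreed guardrails."""
    return action.kind not in AUTONOMOUS_ALLOWLIST


def review_queue(actions: list[AgentAction]) -> None:
    for action in actions:
        if requires_human_approval(action):
            # In practice this would open a ticket or chat prompt and block on a reply.
            print(f"[pending human review] {action.kind} -> {action.target}: {action.rationale}")
        else:
            print(f"[executed autonomously] {action.kind} -> {action.target}")


if __name__ == "__main__":
    review_queue([
        AgentAction("block_ip", "203.0.113.7", "Repeated failed logins"),
        AgentAction("send_email", "all-customers@example.com", "Outage notice"),
    ])
```

In this sketch, the size of the allowlist is effectively the trust dial: the smaller it is, the more actions queue up behind human reviewers.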

Care required

Other AI experts caution enterprise IT leaders to be careful when deploying agents, given the transparency problem AI vendors still need to solve.

Tamsin Deasey-Weinstein, leader of the AI Digital Transformation Task Force for the Cayman Islands, says AI works best with a human in the loop and stringent governance applied, but a lot of AI agents are over-marketed and under-governed.

“Whilst agents are amazing because they take the human out of the loop, this also makes them hugely dangerous,” Deasey-Weinstein says. “We’re selling the prospects of autonomous agents when what we actually have are disasters waiting to happen without stringent guardrails.”

To combat this lack of transparency, she recommends limiting agents’ scope.

“The most trustworthy agents are boringly narrow in their ability,” Deasey-Weinstein says. “The broader and freer rein the agent has, the more that can go wrong with the output. The most trustworthy agents have small, clearly defined jobs and very stringent guardrails.”

She recognizes, however, that deploying highly targeted agents may not be appealing to some users. “This is neither saleable nor attractive to the ever-demanding consumer that wants more work done for less time and skill,” she says. “Just remember, if your AI agent can write every email, touch every document, and hit every API, with no human in the loop, you have something you have no control over. The choice is yours.”
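A rough sketch of the “boringly narrow” agent Deasey-Weinstein describes might look like the following: a small, explicit set of permitted tools and a hard refusal for anything else. The agent’s tools and names here are hypothetical, not taken from any specific framework.

```python
# Illustrative sketch of a narrowly scoped agent: only two permitted tools,
# and any out-of-scope request fails loudly instead of being improvised.
# Tool names and behavior are assumptions for illustration only.
PERMITTED_TOOLS = {
    "invoice_lookup": lambda invoice_id: f"Invoice {invoice_id}: status=paid",
    "invoice_status_summary": lambda customer: f"Summary for {customer}: 0 overdue",
}


def run_tool(tool_name: str, argument: str) -> str:
    """Execute a tool only if it is on the agent's allowlist; refuse everything else."""
    tool = PERMITTED_TOOLS.get(tool_name)
    if tool is None:
        # The guardrail: the agent cannot touch email, documents, or arbitrary APIs.
        raise PermissionError(f"Tool '{tool_name}' is outside this agent's scope")
    return tool(argument)


print(run_tool("invoice_lookup", "INV-1042"))   # within scope
# run_tool("send_email", "all-staff")           # would raise PermissionError
```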

Many AI experts also believe autonomous agents are best deployed to make low-risk decisions. “If a decision affects someone’s freedom, health, education, income, or future, AI should only be assisting,” Deasey-Weinstein says. “Every action has to be explainable, and with AI it is not.”

She recommends frameworks such as the OECD AI Principles and the US NIST AI Risk Management Framework as guides to help organizations understand AI risk.

Observe and orchestrate

Other AI practitioners point to the emerging practice of AI observability as a solution to agent misbehavior, though some caution that observability tools alone may not diagnose an agent’s underlying issues.

Organizations using agents can deploy an orchestration layer that manages lifecycle, context sharing, authentication, and observability, says James Urquhart, field CTO at AI orchestration vendor Kamiwaza AI.

Like Deasey-Weinstein, Urquhart advocates for agents to have limited roles, and he compares orchestration to a referee that can oversee a team of specialist agents. “Don’t use one ‘do-everything’ agent,” he says. “Treat agents like a pit crew and not a Swiss army knife.”
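A rough sketch of that pit-crew pattern might pair narrowly scoped specialist agents with an orchestration layer that routes each task and records every step for observability. The agent names, routing rules, and logging format below are assumptions for illustration, not a description of Kamiwaza AI’s or any other vendor’s product.

```python
# Rough sketch of an orchestration layer ("referee") routing tasks to specialist
# agents and logging each handoff and result for observability. All names,
# routing categories, and behaviors are hypothetical.
import logging
from typing import Callable

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
log = logging.getLogger("agent-control-plane")

# Each specialist agent does one job; none of them is a "do-everything" agent.
SPECIALISTS: dict[str, Callable[[str], str]] = {
    "triage": lambda task: f"triaged: {task}",
    "billing": lambda task: f"billing resolved: {task}",
    "security": lambda task: f"security reviewed: {task}",
}


def orchestrate(category: str, task: str) -> str:
    """Route a task to the matching specialist, logging the handoff and the result."""
    agent = SPECIALISTS.get(category)
    if agent is None:
        log.warning("no specialist for category=%s; escalating to a human", category)
        return "escalated"
    log.info("routing task=%r to specialist=%s", task, category)
    result = agent(task)
    log.info("specialist=%s result=%r", category, result)  # the observability trail
    return result


orchestrate("billing", "duplicate charge on invoice INV-1042")
orchestrate("legal", "review new contract terms")
```

The point of the sketch is the control plane, not the lambdas: the orchestrator is the single place where scope, escalation, and the audit trail live, rather than being scattered across individual agents.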

AI has a trust problem, but it’s an architectural issue, he says.

“Most enterprises today can stand up an agent but very few can explain, constrain, and coordinate a swarm of them,” he adds. “Enterprises are creating more chaos if they don’t have the control plane that makes scale, safety, and governance possible.”