AgentOps: The Agent Governance Challenge

Some reports put the number of organisations with agentic AIs in full production at just 11% (Gartner), with infrastructure and governance identified as the primary blockers rather than AI capabilities. While that number already seems conservative, the challenge around continued governance of agents will cause some further hesitancy.

That gap between what agents can do and what organisations can safely oversee, is quietly becoming the most important operational challenge in enterprise AI. Agents don’t just execute instructions, they reason, decide, call tools, coordinate other agents, and run for hours or days without human intervention. The discipline designed to track and manage them (which we can call AgentOps) is still in its infancy.

Another Gartner stat projects that over 40% of agentic AI projects will be cancelled or fail to reach production by 2027, not because the models aren’t capable, but because enterprises cannot pass the governance, auditability, and cost control requirements that production demands.

We’ll leave the cost argument for another day, but the failure pattern is consistent. Agents that work beautifully in controlled demos can break in production, not because the AI reasoning fails, but because the operational layer around it hasn’t been completely defined. This is a huge challenge for businesses and regulators looking to formalise deployment rules. Enter AgentOps, an emerging set of practices for managing the full lifecycle of AI agents in production.

What AgentOps actually means

The core disciplines of AgentOps are observability, governance, and management. Every tool an agent invokes should be observable and auditable, operators need to understand what happened and why. Something the cloud and various other softwares have successfully done for years. Agents should run under scoped identity with least-privilege permissions, and governance must enforce who can build, deploy and operate agents, and what runtime behaviours are permitted.

In production, agents inevitably encounter new inputs, evolving data and changing systems and a major emerging concern is so-called ‘agent drift’, where agents perform differently than during evaluation due to changes in goals, context, reasoning, or tool interactions.

The question every board should be asking

As AI agents gain access to enterprise systems, be it ERP, CRM, financial workflows, procurement etc, the governance question becomes existential rather than technical. Where traditional enterprise security ensures access control, modern enterprise AI requires governance of actions, context, and execution in real time.

Thankfully, the regulatory environment is catching up fast. The EU AI Act (fully applicable by August 2026), emerging US frameworks, and sector-specific regulations in financial services, healthcare, and other key verticals are all moving toward mandatory auditability of automated decision-making. An AI agent that approved a transaction, rerouted a shipment, or modified a contract cannot simply say it doesn’t remember why it did it.

Getting governance right

Success lies in treating agent deployment not as a technology project but as an operational discipline, with the same rigour applied to release management, incident response, and compliance that enterprises would apply to any other production system.

That means the full logging of every agent action and decision; human-in-the-loop escalation for high-stakes or ambiguous situations; version control for agent prompts and tool permissions; rollback capability when something goes wrong; and regular audits comparing live behaviour against baseline evaluations.

None of this is particularly glamorous. But the difference between the organisations that succeed with agentic AI and those that stall is infrastructure, governance, and auditability, resolved before deployment and not afterwards.

The opportunity hiding inside the gap

The AgentOps gap is not just a risk management problem, it’s set to become a competitive differentiator too. The organisations that build robust operational frameworks for autonomous agents now will compound that advantage over time. Every governed deployment generates data about what works, where agents fail, and how to improve them. That intelligence cannot be easily bolted on later. Enterprises still running AI agent pilots in 2026 are not behind on AI, they are behind on AgentOps, and that’s a harder gap to close.