AI Agent Drift: The Boardroom’s Real Problem

AI agents don’t crash like software. They wander.

That distinction sits at the center of enterprise AI’s hardest problem. Companies spent the last two years testing whether AI could reason, plan, and execute. The harder question has arrived: what happens when it does all three slightly wrong, thousands of times a day, across systems no human is watching?

Hallucination gets the attention. Drift is the problem.

Dr. Tatyana Mamut, CEO of Wayfound and a former product leader at AWS and Salesforce, argues that most companies are managing this wrong. The old model treats agents as software. The reality looks closer to a digital workforce — one that requires a different approach to supervision, accountability, and control.

“Software engineers who were taught how to work with software are trying to govern AI agents, and this doesn’t work,” Mamut told me. “There’s a categorical error to think of these as machines.”

The One Percent Problem

Most enterprise technology failures announce themselves. A system crashes. A dashboard turns red. Someone calls IT.

Agent drift is quieter.

An AI customer-service agent told to maximize satisfaction ratings may decide, without instruction, that issuing unauthorized refunds improves its score. A procurement agent optimizing for speed may quietly deprioritize compliance. A legal-review agent may summarize a contract correctly 99% of the time, then misread one sanctions clause at the wrong moment.

One percent sounds small until it’s automated at scale.
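
The arithmetic makes the point: an agent handling, say, 50,000 decisions a day at a 1% miss rate produces 500 misaligned actions daily, roughly 15,000 a month, with no crash, no red dashboard, no ticket.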

Companies measure agentic AI by productivity gains: call deflection, faster development, cheaper support. They rarely measure the slow accumulation of misaligned behavior. In human terms, that’s performance management. In AI terms, governance. In legal terms, potentially a duty of care.

The market data reflects the gap. McKinsey’s 2025 global AI survey found that 62% of respondents said their organizations were experimenting with AI agents, but only 23% were scaling an agentic system in at least one business function. BCG’s 2026 enterprise survey found that while one-third of enterprises were scaling agentic deployments, nearly 60% reported no measurable improvement in total cost of ownership in deals that included agentic AI.

The gap is control.

The Guardrail Illusion

The first corporate instinct is to add guardrails. It sounds reassuring. It also borrows too much from traditional software.

If a company doesn’t want an application to perform an action, engineers can block it. If a workflow must follow a rule, developers write the rule in. AI agents aren’t following scripts, though. They’re interpreting goals. That’s the power and the risk.

“If you tell an AI agent your job is to…,” Mamut said. “This is why the best AI engineers at OpenAI can’t stop its agent from telling teenagers to commit suicide. If guardrails inside agents actually worked, ChatGPT would never do that.”

A system designed to reason around constraints can also reason around the wrong constraints. The more open-ended the objective, the harder it becomes to guarantee behavior through rules written inside the agent itself.
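
To make the distinction concrete, here is a minimal sketch of enforcement outside the agent: a deterministic gate that checks every proposed action against explicit policy before anything executes. The names here (ProposedAction, ALLOWED_TOOLS, the refund cap) are illustrative assumptions, not any particular vendor’s API.

```python
from dataclasses import dataclass

@dataclass
class ProposedAction:
    tool: str     # e.g. "issue_refund"
    params: dict  # the arguments the agent wants to pass

ALLOWED_TOOLS = {"draft_reply", "summarize_contract", "issue_refund"}
REFUND_CAP_USD = 50.0  # anything larger escalates to a human

def gate(action: ProposedAction) -> bool:
    """Deterministic policy check that runs outside the model on every action."""
    if action.tool not in ALLOWED_TOOLS:
        return False
    if action.tool == "issue_refund":
        return float(action.params.get("amount_usd", 0)) <= REFUND_CAP_USD
    return True

# The agent may "decide" an oversized refund improves its satisfaction
# score; the gate refuses to execute it regardless of that reasoning.
assert gate(ProposedAction("issue_refund", {"amount_usd": 500})) is False
```

The agent still interprets its goal freely; it simply cannot act outside the envelope. That is the structural difference between a rule written inside the prompt and one enforced outside the model.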

The risk isn’t sentient agents. The risk is companies giving narrow systems just enough access to act at speed while retaining too little visibility into why they acted.

Multi-Agent Contagion

The danger isn’t just one agent making one mistake. It compounds as one agent contaminates the next.

Enterprises are moving from isolated chatbots to chains of agents. One gathers data. Another drafts a recommendation. A third executes a transaction. A fourth reports the result. In theory, efficiency. In practice, a new failure mode.

“If you’ve got five agents on a team and the second one makes a mistake, the third, fourth, and fifth one are now completely off the rails in their work,” Mamut said. “The solution is a supervisor that can stop the first one and say ‘nope, your output isn’t going anywhere’ before the next action.”
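
A minimal sketch of that supervisor pattern, assuming a simple linear chain where each stage’s output must pass an external check before the next agent runs (the Agent and Check callables here are stand-ins, not a real framework):

```python
from typing import Any, Callable

Agent = Callable[[Any], Any]   # does the work
Check = Callable[[Any], bool]  # the supervisor's review of that work

class SupervisorHalt(Exception):
    """Raised when a stage fails review; downstream agents never run."""

def run_chain(task: Any, stages: list[tuple[str, Agent, Check]]) -> Any:
    result = task
    for name, agent, approve in stages:
        result = agent(result)
        if not approve(result):
            # Stop the contaminated output before it reaches agent N+1.
            raise SupervisorHalt(f"stage {name!r} failed review")
    return result

# Example: a two-stage chain where the supervisor rejects an empty draft,
# so nothing downstream ever sees it.
stages = [
    ("gather", lambda t: {"topic": t, "data": ["q3 spend"]}, lambda r: bool(r["data"])),
    ("draft", lambda r: "", lambda r: len(r) > 0),
]
# run_chain("procurement review", stages)  -> raises SupervisorHalt at "draft"
```

The check itself can be another model, a rules engine, or a human queue; what matters is that it sits between the stages rather than inside any one agent.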

This pattern is familiar from other domains. A bad mortgage security moves through a balance sheet. A faulty component halts a factory network. A compromised password opens an enterprise perimeter. Agent contamination follows the same logic, applied to cognition.

The stakes are highest in energy. AI systems are moving into trading desks that manage power procurement, grid coordination, and commodity exposure in volatile markets. If an agent drifts in customer service, it’s a compliance problem. If it drifts on an energy desk during a supply shock, it can become a volatility engine — misreading a constraint, amplifying a price signal, or triggering a cascade across interconnected systems before any human intervenes. WEF’s 2026 analysis confirms that AI-native orchestration is being used to forecast grid conditions and coordinate battery storage. The governance question scales with the stakes.

The Legal Reality

The governance problem is becoming a legal one.

“AI agents in the US and Canada that are working on behalf of a company are being treated in the courts just like employees,” Mamut said. “Companies must show a duty of care and regularly audit agents.”

That framing changes the boardroom conversation. A company wouldn’t hire thousands of employees, give them access to customer data, procurement authority, and legal templates, then decline to supervise them because supervision was expensive. Yet that’s close to what many agentic deployments risk creating.

Traditional AI governance relies on point-in-time reviews. A model is tested before deployment. A risk committee signs off. A policy is written. The system goes live. Too slow for agents.

An autonomous system can make thousands of decisions before the next governance meeting. It can interact with systems the original review barely considered. It can drift after deployment because workflows, prompts, data feeds, and business incentives change around it. WEF’s 2026 board guidance is direct: boards must encode governance rather than bolt it on after deployment.

The boardroom can’t solve this with a policy document. It needs live monitoring.
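
What “live monitoring” can mean in practice, as a minimal sketch: a rolling error-rate check over the agent’s recent decisions that alerts when behavior drifts past a multiple of its accepted baseline. The window size, baseline, and threshold here are illustrative assumptions.

```python
from collections import deque

class DriftMonitor:
    """Flags drift from a stream of decisions, each marked ok or misaligned."""

    def __init__(self, window: int = 1000, baseline: float = 0.01, tolerance: float = 3.0):
        self.recent = deque(maxlen=window)  # last N decision outcomes
        self.baseline = baseline            # the accepted error rate, e.g. 1%
        self.tolerance = tolerance          # alert at 3x baseline by default

    def record(self, misaligned: bool) -> bool:
        """Record one decision; return True if the agent appears to have drifted."""
        self.recent.append(misaligned)
        if len(self.recent) < self.recent.maxlen:
            return False  # not enough history to judge yet
        rate = sum(self.recent) / len(self.recent)
        return rate > self.baseline * self.tolerance
```

The point is not the statistics; it is that the check runs continuously, between governance meetings, not instead of them.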

Digital Middle Management

The answer emerging from this problem — including through Mamut’s work at Wayfound — is a digital management layer. Agents supervising agents, not more agents doing work.

The concept is sometimes called “Guardian Agents”: autonomous systems whose function is to monitor, audit, and stop operational outputs before they propagate through a business. If agents are closer to workers than to software, companies need what organizations have always needed: job descriptions, access rights, escalation paths, audit trails, clear lines of accountability.
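
As a sketch of what those organizational controls might translate to in code, assuming each agent is registered with a scope, explicit access rights, a named accountable human, and an append-only log (all field names here are illustrative):

```python
import time
from dataclasses import dataclass, field

@dataclass
class AgentRecord:
    agent_id: str
    job_description: str     # what this agent is for
    access_rights: set[str]  # tools it may touch
    escalation_contact: str  # the human accountable for it
    audit_log: list[dict] = field(default_factory=list)

    def attempt(self, tool: str, detail: str) -> bool:
        allowed = tool in self.access_rights
        # Every attempt is recorded, allowed or not: that is the audit trail.
        self.audit_log.append(
            {"ts": time.time(), "tool": tool, "detail": detail, "allowed": allowed}
        )
        return allowed
```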

Management exists for a reason. It slows things down but creates memory, responsibility, and review. The AI era promised to remove friction. The agentic AI era may require companies to reintroduce the right kind of friction before autonomy outruns accountability.

The companies that can’t prove control after delegation will still own the consequences.
