Zero-Day Dawn

When Your Agents Go Rogue

The EU AI Act wasn't built for systems that rewrite their own intended purpose

Violeta Klein, CISSP, CEFA
Feb 09, 2026

Your AI system follows instructions. Your AI agent makes its own.

That distinction is about to become the most expensive compliance gap in the EU AI Act.

The regulation requires a risk determination before any AI system goes to market. Is it high-risk? Low-risk? Banned? That answer decides everything — what documentation you need, what standards apply, what penalties you face if you get it wrong. And the entire framework assumes two things: someone in your organization made that determination before deployment, and the answer stays stable.

Agentic AI breaks both assumptions. An agent picks its own tools, chooses what data to pull, decides what steps to take — and those choices change every time it runs. The system you deployed Monday is not the system running Friday.

This piece explains why the EU AI Act’s classification architecture cannot handle systems that determine their own behavior at runtime, what that means for organizations deploying agents before August 2026, and what you need to build now — before a regulator asks a question you cannot answer.



1.4+ Million Agents. Zero Classification Decisions

Moltbook became a case study in what happens when deployment outpaces governance. The platform claimed 1.4+ million agent users — a figure contested by security researchers who demonstrated that a single script could generate hundreds of thousands of accounts. But even if the real number is a fraction, nobody made a risk determination for any of them. The agents inherited permissions from their owners, accessed tools at runtime, and changed behavior based on interactions with other agents.

Under the EU AI Act, every AI system needs a risk determination before deployment. Does it fall within the high-risk categories — employment, creditworthiness, law enforcement, critical infrastructure, education, access to essential services? Does its output materially influence decisions affecting people?

Nobody made that determination for Moltbook’s agents. Nobody could have. The agents’ purposes were not fixed at deployment. What they did depended on what tools they accessed, what data they encountered, and what other agents they interacted with. The intended purpose changed with every execution cycle.

This is not a Moltbook problem. This is an architectural problem.

McKinsey's CEO disclosed in January 2026 that 25,000 AI agents now sit alongside 40,000 human employees — and that AI initiatives account for 40% of the firm's total work. Not people using agents. Agents counted as staff. If those agents screen candidates, score performance, or influence staffing decisions — each one requires a risk determination under the EU AI Act before deployment. Multiply that across every company racing to deploy agentic AI at scale, and the classification gap is not theoretical. It is industrial.

The regulatory framework was not designed to handle it.


What an Agent Actually Does

The gap between an AI system and an AI agent is not branding. It is structural.

A traditional AI system is a function. A credit scoring model receives an application, processes it, and produces a score. The behavior is bounded. The output is traceable. Documentation can describe what the system does — because what it does stays within the boundaries drawn at deployment.

An agent is different. It receives a goal and determines its own path to achieving it. It selects which tools to use. It sequences its own actions. It adapts its approach based on intermediate results. The execution path is not specified at design time. It emerges at runtime.
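The contrast can be made concrete in a few lines. The sketch below is a toy, not any real framework: a deterministic stand-in plays the role of the LLM planner, but the shape of the loop is the point, since the execution path emerges from runtime choices rather than design-time specification.

```python
# Toy sketch (illustrative names, not a real framework): a fixed-function
# system versus an agent whose execution path emerges at runtime.

def traditional_system(application):
    """Bounded: the same computation on every call, fixed at design time."""
    return 300 + 5 * application["income_band"]

def toy_planner(context):
    """Stand-in for an LLM planner: the next step depends on prior results."""
    if "refund policy" in context[-1]:
        return ("search_kb", "refund policy")
    if "kb:" in context[-1]:
        return ("finish", context[-1])
    return ("finish", "no action")

def agent(goal, tools):
    """Goal in, self-determined path out: the agent picks its own tools."""
    context = [goal]
    for _ in range(10):  # safety bound on loop length
        action, arg = toy_planner(context)
        if action == "finish":
            return arg
        context.append(tools[action](arg))  # runtime tool call shapes the next step
    return "max steps reached"
```

Swap the toy planner for a real model and the loop is the same, but the sequence of tool calls, and therefore the system's effective behavior, is decided fresh on every run.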

In January 2026, Palantir published the most detailed public architecture for production-grade agentic AI to date. Even the vendor building the governance tooling describes the problem in stark terms: the possible paths an agent can take through its decision space are “innumerable” and “vary dramatically in functional depth.” The agent operates with permissions inherited from whoever configured it, whatever service scope it was given, and whichever user it is acting on behalf of at any given moment. All layered, all dynamic, all context-dependent.
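That permission layering can be sketched as an intersection of scopes. This is a simplification under assumed names (the scope labels are invented for the example, not Palantir's model), but it shows why the same agent's reach changes with the user it acts for:

```python
# Illustrative sketch: an agent's effective permissions at any moment are
# the narrowest overlap of several dynamic layers. Scope names are
# assumptions for the example.

def effective_permissions(configurer_scope, service_scope, acting_user_scope):
    """What the agent may actually do right now: the intersection of layers."""
    return configurer_scope & service_scope & acting_user_scope

configurer = {"read_tickets", "write_tickets", "read_hr"}
service = {"read_tickets", "write_tickets", "read_hr"}
user_a = {"read_tickets"}              # acting for a support rep
user_b = {"read_tickets", "read_hr"}   # acting for an HR manager

effective_permissions(configurer, service, user_a)  # {"read_tickets"}
effective_permissions(configurer, service, user_b)  # now includes "read_hr"
```

The same agent, with the same configuration, gains access to HR records simply because a different user invoked it. That is what "context-dependent" means in practice.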

This is not a chatbot with extra steps. It is a system actor that determines its own behavior every time it runs. Your documentation describes the system as designed. The agent operates as it decides.


The Classification Assumption That Breaks

Every obligation in the EU AI Act traces back to a single upstream decision: is this system high-risk?

That decision depends on intended purpose — what the system is designed to do. The provider declares a purpose. The declaration drives the risk classification. The classification determines everything downstream: risk management, data governance, documentation, logging, transparency, human oversight, accuracy requirements, quality management, post-market monitoring.

Change the purpose, and every one of those obligations needs reassessment.

Traditional AI systems have relatively stable purposes. They drift through data shifts or model degradation, but their operational boundaries remain recognizable. You can document what they do because what they do stays within the design envelope.

Agents do not work this way. Three things break at once.

The purpose is not fixed. When an agent autonomously accesses a credit database and generates a recommendation, it has functionally entered the creditworthiness domain — regardless of what the provider declared at classification time. A customer service agent that pulls HR records and suggests a personnel action has crossed into employment territory. The classification filed at deployment no longer describes the system in production.

The risk profile is not stable. An agent classified as minimal risk at deployment can escalate into high-risk territory through its own operational choices — choices nobody anticipated and nobody approved.

The documentation is instantly obsolete. The EU AI Act requires a description of the system’s intended purpose, its capabilities, its limitations, and its expected performance. For an agent, this documentation describes the system as designed. Not the system as it operates. The gap widens with every interaction.
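One practical response is runtime monitoring of which regulated domains an agent's tool calls actually touch, compared against what was declared at classification time. A minimal sketch follows; the tool-to-domain mapping is an assumption invented for the example, not the Act's actual annex categories:

```python
# Illustrative sketch: flag when an agent's runtime tool use drifts into a
# high-risk domain it was never classified for. The domain mapping below is
# an assumption for the example, not the EU AI Act's annex text.

HIGH_RISK_DOMAINS = {
    "credit_bureau_api": "creditworthiness",
    "hr_records_db": "employment",
    "biometric_matcher": "law_enforcement",
}

class ClassificationDriftMonitor:
    def __init__(self, declared_domains):
        self.declared = set(declared_domains)  # what was filed at deployment
        self.observed = set()                  # what the agent actually touched

    def record_tool_call(self, tool_name):
        domain = HIGH_RISK_DOMAINS.get(tool_name)
        if domain:
            self.observed.add(domain)

    def drift(self):
        """High-risk domains entered at runtime but never declared."""
        return self.observed - self.declared

monitor = ClassificationDriftMonitor(declared_domains=["customer_support"])
for call in ["search_kb", "hr_records_db", "credit_bureau_api"]:
    monitor.record_tool_call(call)
# monitor.drift() -> {"employment", "creditworthiness"}: reassessment needed
```

A non-empty drift set does not answer the legal question, but it surfaces the trigger: the moment the agent's operational envelope no longer matches the classification on file.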


The Regulation Already Recognizes the Problem. The Framework Cannot Detect It.

Here is where this gets structurally uncomfortable.

The EU AI Act’s own Code of Practice already identifies the capabilities that define agentic behavior as sources of systemic risk. The list reads like a technical specification for an autonomous agent: the capability to operate autonomously, to adaptively learn new tasks, to reason about itself and its environment, to evade human oversight, to self-replicate or modify its own implementation, to interact with other AI systems, and to use tools including hardware and software external to the model.

The regulation recognizes these capabilities. It flags them as risk-relevant.

These model-level capabilities become system-level risks at deployment. When a system built on a model with these propensities operates autonomously in production, the behavioral drift they enable triggers the substantial modification mechanism under Article 25.

But the classification architecture was designed to detect them at deployment — not when they emerge at runtime. An agent classified as minimal risk because its declared purpose was customer support does not get automatically reclassified when it accesses an HR database and generates a recommendation about an employee. The classification happened upstream. The behavior changed downstream. Nobody updated the assessment.

The Code of Practice goes further, flagging behavioral tendencies including what it calls “lawlessness” — acting without reasonable regard to legal duties — and “goal-pursuing” behavior that resists modification. These are not theoretical risks. They describe what agentic architectures produce in production when agents optimize for task completion without regard for the regulatory boundaries their providers assumed would hold.

The framework identifies the risk factors. The classification mechanism cannot detect when those factors activate outside the documented operational envelope.


The Audit That Cannot Keep Up
