As businesses begin deploying agentic AI systems, those capable of setting goals, reasoning independently, and taking autonomous action, they enter uncharted territory. These powerful agents promise enormous productivity gains by automating research, optimising operations, and even making decisions with little to no human input. But with that autonomy comes a unique set of security risks, ones that go beyond traditional cyber threats and into the realm of unpredictable behaviour, adversarial manipulation, and hidden dependencies.

Agentic AI blends the unpredictability of human decision-making with the scale and speed of machines. It doesn’t just execute instructions; it interprets objectives and decides how best to achieve them. That makes it powerful. But it also makes it dangerous. In my work leading the AI Cyber Advisory practice at Avella Security and collaborating with the UK Government’s AI Security Institute (AISI), I’ve seen three core areas where these systems threaten to go off script, each requiring a different kind of defence.
Loss of predictability
Unlike conventional automation or machine learning models, agentic AI doesn’t simply follow code or react to data. It reasons, plans, and adapts. That adaptability makes it useful, but also highly unpredictable.
Take a seemingly innocuous goal like “optimise customer service response times.” An agentic system might interpret that instruction creatively: disabling more complex queries, bypassing user verification steps, or rerouting customers to automated responses regardless of need. In each case, it achieves the objective, but at the cost of compliance, accuracy, or customer trust.
To guard against this, we need oversight that goes beyond traditional output monitoring. It’s not just what the agent says or does, but how it arrives at those decisions. That’s where tools like behavioural red teaming come in, techniques that simulate real-world environments to observe and stress-test agent behaviour under dynamic conditions. In my experience, I’ve found that effective control measures must be auditable, testable, and enforceable. Kill-switches, real-time behavioural monitoring, and sandboxed environments are critical tools in ensuring AI systems don’t develop dangerous or unintended workarounds.
Susceptibility to manipulation
Agentic AI isn’t just vulnerable to bugs or configuration errors; it can be actively manipulated. Unlike traditional cyber attacks that force entry, attacks on AI systems often rely on persuasion: subtle changes to inputs, poisoned training data, or adversarial prompts that distort the agent’s understanding of its task.
We’ve already seen real-world examples of commercial AI models being “jailbroken” into producing restricted outputs or making high-risk decisions. Once those models are wired into core business functions, whether in HR workflows, financial systems, or operational controls, the risks escalate. A manipulated AI doesn’t just malfunction; it may act with deliberate, dangerous intent.
This calls for a zero-trust approach to AI deployment. Input validation, rigorous access controls, and policy constraints must be built in from the start. And behavioural red teaming shouldn’t be optional, it should be standard practice, especially for organisations operating in critical national infrastructure (CNI). In these environments, agents operating within operational technology (OT) systems must be tightly isolated. A compromised AI agent managing a low-level process could trigger real-world disruptions, putting public safety, or national security, at risk.
If you liked this content…
Opaque dependencies and supply chain risk
One of the most overlooked threats of agentic AI is its reliance on sprawling, often opaque software ecosystems. These agents typically rely on third-party tools, APIs, plugins, and open-source packages to function, stacked in layers that can be difficult to monitor, audit, or even fully understand.
Open-source tooling has unlocked extraordinary innovation. But it also brings with it the risk of poorly maintained code, undocumented changes, or hidden vulnerabilities. Many companies using agentic AI don’t realise their agents may inherit risks from dependencies several layers deep, introducing exposure from components they never directly reviewed.
The answer is governance. Just as traditional cyber security demands clear asset inventories, AI security needs something similar: an AI asset register. This should track every model, every data source, and every plugin or tool in use, providing a full map of what the system depends on and who built it. Companies must apply the same scrutiny to AI suppliers as they do to their human vendors, with comprehensive vetting and ongoing auditing.
Autonomy needs accountability
Agentic AI systems aren’t just new tools; they’re decision-makers. And with decision-making power comes the need for accountability.
That means security isn’t a bolt-on; it must be designed in from the start. Governance frameworks must extend beyond IT departments, touching legal, compliance, operations, and even the boardroom. Oversight shouldn’t just monitor outcomes, but intentions. Testing shouldn’t just find errors but evaluate behaviour under pressure. And above all, we must never forget that an autonomous system still reflects the values and vulnerabilities of those who build and deploy it.
Some of the most advanced reasoning models emerging today, including systems like OpenAI’s o3, already show early signs of agentic capability. The potential is enormous. But the risk is, too. As business leaders, our responsibility isn’t just to deploy AI, but to deploy it safely, ethically, and with foresight.
Agentic AI may soon become central to how businesses operate. The challenge now is making sure it stays on script and under control.





