What Secure AI Looks Like in the Real World
It’s no longer a theoretical conversation. Most organisations, in one form or another, are now deploying AI systems, whether it’s a machine learning model that helps triage customer tickets, or something more ambitious powering a core product or decision-making tool. The technology is here, and so are the risks.
As cyber security leaders, we don’t have to be data scientists or machine learning engineers. But we do have to understand enough to ask the right questions, influence the right decisions, and collaborate effectively with the teams building and deploying these systems. Security for AI isn’t a bolt-on. It has to be built in, and that means getting involved at every stage of the lifecycle.
A comprehensive framework recently shared within the community sets out four key stages where secure AI development needs attention: secure design, secure development, secure deployment, and secure operation and maintenance. That might sound similar to what we do in traditional application security, and there’s a good reason for that: many of the same principles apply. But AI introduces new attack surfaces, new unknowns, and new types of harm that we’ve only just begun to fully understand.
Designing AI Securely: Shaping the Risk Before It Starts
It’s during the design phase that the trajectory of a system is most easily changed. And yet, this is often where security is least visible. The excitement of the capability tends to take the lead: look what it can do! But that’s exactly where our influence is needed most.
The fundamental question we need to ask early on is: “What is this model capable of doing, and what could go wrong?”
This means going beyond vague worries about bias or data quality and actively mapping risks based on intended function, sensitivity of the data involved, and how the system will be exposed. Will it accept user input? Can it access internal systems? Could it be tricked into disclosing information, or manipulated to behave unpredictably?
This stage is also where the system’s dependencies start to take shape. Are we using pre-trained models? If so, where do they come from? Are there secure practices in place to ensure they haven’t been tampered with, or subtly poisoned to introduce flaws?
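To make that concrete, here’s a minimal sketch (not a complete supply-chain control) of what artefact verification might look like in practice: pin a SHA-256 digest for every approved pre-trained model and refuse to load anything that doesn’t match. The file name and digest below are placeholders.

```python
import hashlib
from pathlib import Path

# Hypothetical allow-list: artefact name -> expected SHA-256 digest,
# pinned at the time the pre-trained model was reviewed and approved.
APPROVED_ARTEFACTS = {
    "sentiment-base-v2.bin": "replace-with-the-digest-recorded-at-approval",
}

def sha256_of(path: Path) -> str:
    """Stream the file through SHA-256 so large model files don't exhaust memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_artefact(path: Path) -> None:
    """Refuse to use any model file that isn't on the approved list or has changed."""
    expected = APPROVED_ARTEFACTS.get(path.name)
    if expected is None:
        raise RuntimeError(f"{path.name} is not on the approved model list")
    if sha256_of(path) != expected:
        raise RuntimeError(f"{path.name} does not match its approved digest")

# Example: verify_artefact(Path("models/sentiment-base-v2.bin"))  # run before loading weights
```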
And of course, privacy has to be on the table. Not just whether personal data is being used, but how traceable or recoverable it might be from model outputs. Model inversion and membership inference are real attack techniques, not science fiction.
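As a small illustration of the latter, below is a deliberately naive confidence-thresholding membership inference test, assuming an attacker who holds candidate records and can query the model for its confidence on them. It isn’t production tooling; the point is how little an attacker needs once a model has memorised parts of its training data.

```python
import numpy as np

def membership_inference_signal(confidences_on_candidates: np.ndarray,
                                threshold: float = 0.95) -> np.ndarray:
    """
    Naive confidence-thresholding membership test.
    confidences_on_candidates: the model's top-class confidence for each
    candidate record the attacker wants to test.
    Returns a boolean guess per record: True = "probably in the training set".
    Models that memorise tend to be systematically more confident on members.
    """
    return confidences_on_candidates >= threshold

# Example with made-up numbers: three candidate records.
guesses = membership_inference_signal(np.array([0.99, 0.62, 0.97]))
print(guesses)  # [ True False  True]
```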
Development: Where Familiar Threats Meet New Code Paths
Once the model architecture is chosen and training begins, things start to feel a bit more familiar to those of us in security. Version control, code review, dependency management, patching: these remain critical. But we also need to think differently about what we’re actually building.
AI development isn’t just about code; it’s about data. And that means data integrity becomes a security consideration. If training data is modified, manipulated, or even just subtly biased by external actors, the resulting model can behave in unsafe ways, and those problems are rarely obvious in static testing.
There’s also the issue of interpretability. Unlike traditional applications where you can trace a logical path through code, AI systems, especially large neural networks, behave more like complex ecosystems than engineered machines. That’s not an excuse for leaving them untested. But it does mean that testing needs to include different forms of validation: adversarial robustness checks, evaluation against edge cases, and clear documentation of assumptions and limitations.
A secure development lifecycle for AI must include a structured approach to how models are trained, how they’re validated, and how outputs are evaluated, not just for accuracy, but for resilience against abuse.
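One hedged sketch of what such a gate could look like: an evaluation step that fails the pipeline if accuracy on clean validation data, or accuracy under small input perturbations, drops below agreed floors. The `predict` interface, the noise model and the thresholds here are assumptions to be tuned per system.

```python
import numpy as np

def evaluation_gate(predict, X_val, y_val,
                    accuracy_floor=0.90, robustness_floor=0.85,
                    noise_scale=0.05, seed=0):
    """
    Minimal release gate: the model must be accurate on clean validation data
    AND keep most of that accuracy when inputs are slightly perturbed.
    `predict` is any callable returning class labels for a batch of inputs.
    """
    rng = np.random.default_rng(seed)
    clean_acc = (predict(X_val) == y_val).mean()

    # Crude robustness probe: Gaussian noise stands in for "inputs we didn't expect".
    perturbed = X_val + rng.normal(0.0, noise_scale, size=X_val.shape)
    robust_acc = (predict(perturbed) == y_val).mean()

    if clean_acc < accuracy_floor:
        raise AssertionError(f"Clean accuracy {clean_acc:.2%} below agreed floor")
    if robust_acc < robustness_floor:
        raise AssertionError(f"Accuracy under perturbation {robust_acc:.2%} below agreed floor")
    return {"clean_accuracy": clean_acc, "robust_accuracy": robust_acc}
```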
Deployment: Trust Doesn’t Stop at Shipping
Too often, security efforts taper off as a system moves into production. But with AI, deployment is when many of the real risks begin. If the model is integrated into a live product or decision system, we need guarantees about who can access it, how it’s monitored, and how its inputs and outputs are controlled.
Access control becomes a layered conversation. Not just “who can call the API”, but “who can fine-tune it, influence the data pipeline, or alter the configuration that controls its behaviour?” This is particularly important if your organisation is deploying foundation models or using third-party inference engines: the attack surface expands dramatically when external services and pre-built components come into play.
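A minimal sketch of that layering, with placeholder group and action names, might look like the following: each capability is a separate, individually reviewable permission rather than one blanket grant of “access to the model”.

```python
# Hypothetical permission model: the point is that "access to the model"
# is several distinct permissions, each owned and reviewed on its own.
PERMISSIONS = {
    "invoke_model": {"product-backend", "support-tooling"},
    "fine_tune_model": {"ml-platform-team"},
    "modify_data_pipeline": {"data-engineering"},
    "change_runtime_config": {"ml-platform-team", "sre-oncall"},
}

def is_allowed(principal_groups: set[str], action: str) -> bool:
    """Allow an action only if the caller belongs to a group granted that action."""
    return bool(principal_groups & PERMISSIONS.get(action, set()))

# A service account that can call the API still can't retrain the model.
print(is_allowed({"product-backend"}, "invoke_model"))     # True
print(is_allowed({"product-backend"}, "fine_tune_model"))  # False
```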
Logging and auditability also become essential. If something does go wrong, will you know? Can you reconstruct what inputs were given and how the system responded? These are the sorts of questions incident response teams need answers to in the moment, not after days of forensic digging.
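For illustration, here is one possible shape of an audit wrapper around inference, assuming a simple `predict` callable: every call gets a request ID, a timestamp, the model version, a hash of the input (rather than the raw input, in case it contains personal data), and the output.

```python
import hashlib
import json
import logging
import time
import uuid

audit_log = logging.getLogger("model_audit")
logging.basicConfig(level=logging.INFO)

MODEL_VERSION = "triage-model-2024-06-01"  # placeholder identifier

def audited_predict(predict, raw_input: str) -> str:
    """Wrap inference so every call leaves a reconstructable trail."""
    request_id = str(uuid.uuid4())
    output = predict(raw_input)
    audit_log.info(json.dumps({
        "request_id": request_id,
        "timestamp": time.time(),
        "model_version": MODEL_VERSION,
        # Hash rather than store the raw input if it may contain personal data.
        "input_sha256": hashlib.sha256(raw_input.encode()).hexdigest(),
        "output": output,
    }))
    return output
```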
And then there’s deployment hygiene. Is the model containerised? Is it running with minimal permissions? Is the data pipeline secure? These are areas where your existing DevSecOps maturity can be a huge advantage, but only if it’s applied with full awareness of the model’s specific quirks and needs.
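As a rough sketch of what “minimal permissions” can mean at start-up, the check below (POSIX-only, with a placeholder model path) refuses to serve if the process is running as root or if the model directory is writable at runtime.

```python
import os
import sys
from pathlib import Path

MODEL_DIR = Path("/opt/models")  # placeholder path for the mounted model artefacts

def startup_hygiene_check() -> None:
    """Refuse to start if basic least-privilege expectations aren't met (POSIX only)."""
    if os.geteuid() == 0:
        sys.exit("Refusing to start: inference service is running as root")
    if os.access(MODEL_DIR, os.W_OK):
        sys.exit("Refusing to start: model directory is writable at runtime")

# Call startup_hygiene_check() before loading the model and opening the port.
```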
Ongoing Operation: Monitoring the Unpredictable
AI systems evolve. Sometimes by design (through continuous learning), sometimes as a side effect of usage (feedback loops, data drift), and sometimes through malicious input. This means that monitoring can’t just be about uptime; it has to be about behaviour.
In traditional software, patching a bug fixes the issue. But in AI, a new pattern in data might subtly skew results. Or a new exploit method might allow prompt injection or model evasion. These are problems that can surface months after deployment, and may not show up until a clever attacker finds the right edge case to exploit.
So the focus here is on establishing baselines for model behaviour and detecting when that behaviour shifts. It’s also about defining thresholds: at what point does performance degradation or anomalous output trigger human review?
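One way to make that measurable, sketched below with placeholder thresholds, is to capture the model’s output score distribution at sign-off and compare live traffic against it with a simple statistic such as the population stability index; if the shift exceeds an agreed threshold, a human looks at it.

```python
import numpy as np

def population_stability_index(baseline: np.ndarray, current: np.ndarray,
                               bins: int = 10) -> float:
    """
    Compare today's output score distribution against the baseline captured
    at sign-off. Higher PSI = bigger shift in behaviour.
    """
    edges = np.histogram_bin_edges(np.concatenate([baseline, current]), bins=bins)
    b = np.histogram(baseline, bins=edges)[0] / len(baseline)
    c = np.histogram(current, bins=edges)[0] / len(current)
    # Avoid division by zero / log(0) for empty bins.
    b, c = np.clip(b, 1e-6, None), np.clip(c, 1e-6, None)
    return float(np.sum((c - b) * np.log(c / b)))

def needs_human_review(baseline, current, threshold: float = 0.2) -> bool:
    """0.2 is a common rule-of-thumb PSI threshold, not a standard; tune it per system."""
    return population_stability_index(baseline, current) > threshold
```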
Cyber security leaders should also consider how AI systems are updated. Is there a change control process? Can updates be rolled back safely? Are there policies around testing patches before release?
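A toy sketch of the rollback piece, with hypothetical version names, is shown below: every promotion is recorded, the live version is always known, and stepping back to the previous approved version is a single, auditable operation.

```python
from dataclasses import dataclass, field

@dataclass
class ModelRegistry:
    """Toy registry: every promotion is recorded, and rollback is one step."""
    history: list[str] = field(default_factory=list)  # approved versions, oldest first

    def promote(self, version: str, approved_by: str) -> None:
        # In practice, change-control checks (tests, sign-off) would gate this step.
        print(f"Promoting {version}, approved by {approved_by}")
        self.history.append(version)

    def live_version(self) -> str:
        return self.history[-1]

    def rollback(self) -> str:
        if len(self.history) < 2:
            raise RuntimeError("No previous version to roll back to")
        retired = self.history.pop()
        print(f"Rolled back from {retired} to {self.history[-1]}")
        return self.history[-1]

registry = ModelRegistry()
registry.promote("triage-v1", approved_by="change-board")
registry.promote("triage-v2", approved_by="change-board")
registry.rollback()  # back to triage-v1
```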
This phase is where collaboration is key. AI and data science teams might own the model’s performance metrics, but security teams have a role in watching for abuse, drift, and unintended consequences.
A Quiet Word of Caution
There’s a growing temptation to treat AI systems as inherently clever and therefore inherently trustworthy. That temptation can be strong, especially when the results look good. But part of our role in cyber security is to carry a bit of scepticism into the conversation.
Just because a model performs well doesn’t mean it’s safe. Just because it’s been fine-tuned doesn’t mean it can’t be tricked. And just because it hasn’t caused a problem yet doesn’t mean it won’t under the right conditions.
This isn’t about being the brakes on innovation; it’s about helping to steer it in the right direction. Our job isn’t to say “no,” but to ask “what if?” and “what then?” at the moments when it’s easiest to assume everything will go to plan.
Final Thoughts: Maturity, Together
What’s clear is that AI security isn’t the job of a single team. It spans product, engineering, data science, compliance, operations and cyber security. But we can lead in the way we always have: by asking the right questions, framing the risks clearly, and building relationships that put secure thinking at the heart of innovation.
The goal isn’t to scare people into inaction, or to write policies nobody reads. The goal is to help teams build AI systems that are not only powerful, but safe, for the people who use them, and the organisations that depend on them.
And to do that, we have to stay involved. From design to deployment. From prototype to production. From model to mission.
