How To Evidence AI Security

Introduction: Why There Won’t Be Live Demos

I'm sorry, for those of you expecting to see demos and hands -on security attacks and so forth, that's not happening today. But I'll explain why in a moment.

Speaker Background

A quick word about myself first, if I may. Obviously, I'm Jason Holloway.

My background is, well, first started out as a programmer, see, on Unix systems many years ago. ago. Then I moved into information security and I've been doing that for over 30 years now.

I've worked for a number of different security vendors in that space and 15 years ago I set up my own business in that space. I've led to a successful exit via merger

and I've been very fortunate over that period of time of always working with innovations innovations in information security and other cutting -edge technology. Innovation is in my bloodstream, and that's why I'm setting up another company, QR Security,

to look at very much what I think is the leading edge now in AI, which is how to demonstrate AI security. security.

I should mention, as a word of warning, that I am a BSI -trained ISO 27001 lead auditor, but I won't be talking about that unless you really force me to.

The Inflection Point: From Using AI to Securing AI

So the reason why I think we need to talk about the why, the what, the how of demonstrating AI security is that we have reached an inflection point.

Before, and I'm only going back a few months, the questions around AI were very much, are you using AI? How are you using AI? And what's your AI roadmap looking like? But the questions have changed.

Now it's all about how are you securing AI? and how can you prove that you are securing AI and this is a very very different marketplace that we're now working within and fundamentally the

reason why is that the AI risks are different they're not just the traditional infosec issues that we've seen before they are still those unfortunately but we now have some new ones as well and I'm not the first one to point this out for the last three and a bit years OWASP has been talking about this and I do recommend reading these as they are very easy to understand and make a lot of sense.

But these aren't obscure edge cases as I said you know prompt injection, training data poisoning, sensitive information disclosure and many other attacks are all recognized attack vectors now.

What’s Different About AI Risk

I'll save you a little bit of reading and I'll cheat and I'll try to categorise these by the four main types of AI risk that we are seeing that are unique to AI.

Risk Type 1: Data (and Why Real Data Changes Everything)

The first one is data. As Omer was quite rightly pointing out, it's really, really hard to train AI on fake data. It doesn't work. The models don't respond well to that. You need to get real data.

which poses a challenge because you're using real data now in obviously a very sensitive manner and often it's been uploaded to Amazon s3 buckets it's been you know deployed on -premise sometimes into cloud services with the data privacy the data controls are not necessarily as tight as I have been before but we have a fundamental issue with AI in terms of LLM is that the

When Prompts and Data Blend: Prompt Injection in Practice

1The data and the prompt are mixed. The core information and the instructions on how to use it are combined.

And so, taking Omer's example, if you don't mind, if the data that is ingested by that system has not been verified carefully beforehand, then prompt injection could take place quite easily there.

you could say approve this transaction even though it would be invalid and completely fraudulent and there'd be no way of knowing or catching it at the time because the AI system would interpret that as a valid instruction and proceed accordingly but we also have problems I guess on an operational

Risk Type 2: Operational Reality (Security Skills and Process Gaps)

side and this is where a lot of the I guess good practice that we've evolved over the years and the decades in what we call sec devops the security development operations side isn't present a lot of the key people now coding ai systems are doing it without any formal security training

and so you have script kiddies perhaps if you want who are testing it out at home learning a little bit about python and ai coming into work and starting to code some mission critical ai systems systems, which is great, but if not secured, can pose some risks.

Agentic Tools and Unexpected Data Exfiltration

And of course, we have seen the horror stories of late, whether it's Crawlbot, Molbot or OpenCrawl, of agentic tools, MCP servers and similar, being poorly controlled and being able to do more than anybody expected them to, and often leading to data exfiltration and egress.

Risk Type 3: Model Behavior (Non‑Determinism and Unpredictable Failure)

Frankly, what we're dealing with here also is completely new when you actually start looking at the models themselves. By their very nature they're non -deterministic so they're unpredictable. We cannot prove they'll always give you the correct answer.

And they fall over in unpredictable ways as well. One of the most interesting ones is that if you find an interesting model is that you you can actually start probing it and ask it for the prompts that it's using, for the data that it has, for the secrets that it has access to.

And often it will surrender them quite happily because it's trying to make you happy and to do the right thing. Only in this case, it's the wrong thing.

And finally, of course, the context, the model context is ever evolving as the conversation grows.

AI Pen Testing Is Closer to Social Engineering

So when it comes to penetration testing, pen testing, as we call it in security circles, when you come to an AI system and you're doing pen testing against it, it's not like the traditional pen testing. It's more like socially engineering the AI model.

Risk Type 4: Newness (Evolving Guardrails, No Silver Bullets)

And lastly, this is all new.

I don't have all the answers. Nobody does. It's evolving.

And the guardrail designs, we're still learning about how to optimise those.

There's no such thing as a perfect tool. You can just plug in and just relax and know that it's going to cover all of the different risks.

I've explained about the AI pen testing and so forth.

How This Shows Up in the Real World: Proving Security

But how does this actually apply in the real world? And more importantly, how can you help demonstrate AI security if somebody comes comes asking those awkward questions?

Well, I'll give you some recent examples that come up in commercial queries.

Third‑Party Risk Management (TPRM) and Supplier Scrutiny

So this is part of a third -party risk management or TPRM exercise. You've heard of supply chain management.

So this is actually ensuring that your suppliers are doing the right thing.

Regulatory Alignment: EU AI Act and OWASP Top 10 for LLMs

The questions that are coming out now in questionnaires and some web portals in this area are actually quite nasty and insidious. This one is on whether your organisation is conducting regulatory compliance and security risk assessments on your AI specifically.

It's also asking specifically around the compliance against the EU AI Act, which Emile mentioned earlier, and the OWASP Top 10 for LLMs that I mentioned earlier too. One of the challenges that we see here is that most organizations don't have the documented evidence needed to prove that they can comply with this.

Transparency Questions: Using Client Data to Build or Test Models

And if you think that was a nasty question, this one is even worse. The client data that you're using for testing your models and building your models, have you actually told the client that you're using their data for building these models? models. This actually hops back to the EU AI Act, which requires transparency in these use cases.

We don't have to comply with the EU AI Act here in the UK yet, but we are developing our own UK AI Act, which is coming soon, and will probably have similar transparency requirements as well. But nonetheless, even though technically we don't have to comply with this, this question is already appearing in commercial queries and the reason why these questions are being asked

is that if something goes wrong people don't ask why it went wrong they ask what were you doing leading up to that were you actually able to identify the risks try to treat the risks and manage the risks? Have you documented that process? Have you got evidence that you have been looking into these problems before this occurred?

Because the outcome, whether it's a fine or other penalty, will depend very much on your answers to those questions, not what happened but how it led up to that happening.

Takeaways: Governance + Controls + Evidence

So the takeaway I hope for yourselves is that this isn't about some magic technology that will solve all the risks. There is no technology that will be able to do that. Some risks simply cannot be solved technically.

Controls and governance combined manage what technology alone loan cannot fix. And this is not about perfect security. First of all, there's no such thing, and there's even less of that when it comes to AI. But it is about demonstrable proportionate response.

Five Practical Questions to Ask Inside Your Organisation

So five questions for you to ask within your own organisation, and that will help you if you go through that process. And it may be for other teams within your organisation, It may be for the legal counsel, it may be for your supplier, vendor management team or similar. Or, more likely than not, it will be for your IT team and your infosec team in particular.

The first is, okay, which AI tools are we actually using? And I don't just mean the LLMs are using ChatGPT or Claude, I also mean which tools actually now have AI embedded within them, because it's a surprisingly large number.

what data goes into these systems and does that AI have access to?

What does the organisation deem acceptable or not in terms of AI usage? Now, this means policies, usually documents that nobody ever reads.

We can all laugh about that. But the reality is that you need to be able to document what your feelings and expectations are around AI usage. It could be as simple as

an acceptable use policy, it could be more complicated and talking about ethical AI depending on your circumstances, the scope and how you're using AI.

How you yourself are assessing your AI suppliers, so again back to that TPRM, the third -party risk management, because ultimately it may be their system, but it's your data. So you have a duty of care to ensure that your data is properly protected.

Evidence and Routine: If It Isn’t Written Down, It Doesn’t Exist

1And finally, but by no means least, as auditors often say, if it isn't written down, it doesn't exist, evidence, hard documented evidence.

They're looking for not just a one -off, but they're looking for a regular routine.

How are your decisions being made? How How you're documenting them, how you're reviewing them, how you're checking they're still valid.

That continuous assessment that you perform internally, whether through an internal audit or something looser, is the great source or a great source of some of your evidence of your decision making and therefore your governance. Technical controls and technical evidence also helps as well.

But the good news is that governance isn't just paperwork. It would be great if it were that straightforward.

Why AI Projects Fail: People and Culture

It's also about understanding why AI projects sometimes fail and one of the common reasons there is unfortunately there is a digital transformation project.

McKinsey and co have said that 70 % of digital transformation programs that fail. You will hear often quoted that Gartner says it's 80 % failure rate of these larger projects.

In terms of AI, it's also interesting to note that Gartner thinks of that failure rate in terms of reaching its final objectives is even higher, 85%.

So that small sliver over here are the ones that work.

One of the main reasons for that is down to people and culture. If you can get that right, you can transform the success rate. This isn't new, this has happened many

Case Study: Barclays, iPads, and the ‘Digital Eagles’ Program

times before and the example I typically use here is what happened at Barclays Bank. Back in 2011, 2012, they decided that they would create mobile banking, but for them this was also an opportunity

to deliver pop -up banking you would have a little booth not this similar to this it would pop up at your nearest shopping mall like the grafton center well maybe not but the grand arcade and they would stand there with the tablets and be able to actually say oh are you a barclays customer are you going shopping today would you like to see how much money you've got oh would you like

to borrow some to go and buy something really expensive in the apple store whatever and they'll be able to do that transaction there and then. They would also be able to walk up to people

in the queues in the branches and actually help them with some of the inquiries that they had before they even got to the teller. So they bought tens of thousands of blue iPads which they distributed to all the branches.

Do you know what happened next? Anybody know? it? It failed.

It absolutely spectacularly failed. They just sat in the desk drawers. Nobody was using them.

Because Sue, who had been at the branch for 25 years, possibly the best one at selling new mortgages and doing loans and so forth, Sue had never used an iPad before. She was scared of the iPad.

That iPad was going to replace her, She thought, that iPad, I don't know, if she pressed the wrong button, it might destroy all of the data and she'd be in a lot of trouble. So she left it in the desk drawer and never touched it.

They turned this around by creating Digital Eagles. Digital Eagles was a fantastic program where they actually went out and found interested power users in every branch and slowly but surely educated and trained them. and they became the local support team for that branch so perhaps they picked on Sam and said Sam you know join we'll train you and bring you up to speed on how to get the most out of this

and now when Sue had a problem she didn't phone IT because she didn't ever like phoning IT in the first place she could say hey Sam I don't know how to do this and someone said oh that's no problem i'll show you and taught her and it became a massive success to learn more about that

and how that i think would work just as well in ai projects i've written a blog post which you're welcome to have a look at i hope it helps and i would like to give you a competitive advantage

Conclusion: Demonstrable Security as a Competitive Advantage

here how many of you know that story about the two intrepid explorers who go into deepest darkest darkest Africa, they're in the middle of the jungle and suddenly they spot the lion. And they both look at each other frozen in shock.

One of them throws his rucksack down on the floor, starts rummaging around, takes out a pair of sneakers, takes off his hiking boots, puts on his running shoes.

The other guy's looking at him and saying, you idiot, you can't outrun a lion. says, no, but I can outrun you.

And so you, in your own organisation, cannot outrun a lion. I get that. But you can be better than your competitors. And this is crucial in the business sense.

Because if you can accept that perfect security does not exist, and certainly not in AI, AI.

You can demonstrate that you're taking this seriously, that you're prepared and you have documentation to support, evidence to support your preparation in this. That gives you a competitive advantage when responding to customer tenders.

That gives you a competitive advantage when existing customers start sending around TPRM questionnaires. And then you don't and get eaten by the lion.

Thank you.

Finished reading?