The way I would frame this talk is that these things are a bit of a trade-off, and what we're trying to do at Portia AI is provide users with features that allow them to decide where they fall on that trade-off.
So just as a bit of context, Portia AI: we're a very early stage startup, just over a year old.
We're based in King's Cross and we are building an open source framework for helping companies in regulated industries build agents for production.
Because of that target, we're focusing on the things that are of particular interest to businesses in those kinds of areas: predictability, visibility, control, authentication, authorization. These things are hard to do with agents, so that's where we're focusing.
As a very quick general context for this demo, I'm not going to spend much time on this. I want to show the demo.
Just for context, the core features of our framework are really two things. There's upfront explicit planning, which produces a structured plan of what the agent is going to do. That differentiates it from other agent frameworks out there, where you might just give a prompt and the agent goes off and you don't really know what's going to happen.
And then we have execution agents which take that plan and execute it and run through the steps.
So we're often thinking about this trade-off between agency and control. And one of our goals, like I mentioned, is just providing features that allow you to decide where you fall on that spectrum.
Depending on your use case, you might want something that's very agentic and has total freedom to do what it wants. Or you might want something that's very locked down, more like a workflow which maybe has some LLM-driven steps in it. That's totally fine as well, and that's really appropriate for some use cases.
So hopefully the example I'm going to show you demonstrates a couple of different ways that you can cut this to explore that trade-off. And then I'll talk very briefly at the end about other ones.
OK, so the example I'm going to show you is a refund agent.
So it's very simple.
The idea is that you get emails into an inbox that contain user refund requests. And you have your refund policy.
Nothing needs to change there: companies have that on their website already; it already exists.
You can provide that to your agents to help them decide whether stuff should be refunded.
So you've got what I've called a reasoning agent here.
And then you can go forward and get a customer service representative to approve that, or maybe reject it. If they approve, you can go on to process the refund.
So the key point for this particular use case, where you might want to think about control, is making sure that a human is checked with before the agent refunds something, because you don't want your agent going wild with your Stripe account, issuing refunds left, right and center because someone prompt-injected your email inbox, right?
Okay, so I think I will start.
I'll just show you our first approach to this.
So for our first approach: like many agent frameworks, we have a concept of tools. The tools are the things that actually do the actions, and the agent is basically our planner up front.
The planner will decide which tools are going to be used to solve this problem, and then go through and execute those tools, providing the right inputs at the right time.
So the first approach is a tool which explicitly asks for human input. In our framework we have a concept called a clarification.
A clarification is a structured object which can be returned from an agent or from a tool; it basically pauses the flow of the agent and then requires a human input before it will continue.
Actually, it's funny that Joshua mentioned MCP, because I'm using MCP in this talk as well. If anyone's using MCP or following it very closely, they released a new spec recently where they've introduced a concept called an elicitation, which they basically copied off of us. It's the same concept: something which asks the user for input to continue on with the flow.
I don't think that anything's really been released around it yet, but it's kind of coming.
So yeah, the first example is implemented with a tool.
Like I just said, the planner will insert this tool into the flow, so that it can be called before the refund is issued.
On the code side, this is what the tool looks like. We've also got an argument schema up here, which is a Pydantic model.
So this tool will get the refund request, the summary, which is what the agent thinks should happen, and then the human decision. And just to jump to the relevant bit, the run method has this multiple choice clarification that's returned.
So if there is no human decision, we want to return a multiple choice clarification where the options are to approve or reject the refund.
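To give a rough idea, the tool looks something like this. This is a sketch rather than the exact code from the demo: the class and field names follow how I remember the open source SDK exposing tools and clarifications, so treat them as approximate.

```python
from pydantic import BaseModel, Field
from portia import MultipleChoiceClarification, Tool, ToolRunContext


class HumanApprovalToolSchema(BaseModel):
    """Inputs the planner provides to the human approval tool."""
    refund_request: str = Field(..., description="The customer's original refund request")
    summary: str = Field(..., description="What the reasoning agent thinks should happen, and why")
    human_decision: str | None = Field(None, description="The customer service rep's decision, once given")


class HumanApprovalTool(Tool[str]):
    """Pauses the plan run and asks a human to approve or reject the refund."""

    id: str = "human_approval_tool"
    name: str = "Human approval tool"
    description: str = "Asks a customer service representative to approve or reject a refund"
    args_schema: type[BaseModel] = HumanApprovalToolSchema
    output_schema: tuple[str, str] = ("str", "The human decision: APPROVED or REJECTED")

    def run(
        self,
        ctx: ToolRunContext,
        refund_request: str,
        summary: str,
        human_decision: str | None = None,
    ) -> str | MultipleChoiceClarification:
        if human_decision is None:
            # No decision yet: return a clarification, which pauses the run
            # until a human picks one of the options.
            return MultipleChoiceClarification(
                plan_run_id=ctx.plan_run.id,  # assumed accessor; may differ by SDK version
                argument_name="human_decision",
                user_guidance=f"Please approve or reject this refund.\n{summary}",
                options=["APPROVED", "REJECTED"],
            )
        return human_decision
```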
And then we have other tools. I defined a tool for the refund reviewer.
That's the reasoning agent in the diagram that I showed you.
Then we have our Portia setup that will execute this.
If I go back to the previous slide: the original reason I did this demo is because we wanted to demo MCP, but I won't talk too much about MCP this time. We were using the Stripe MCP server, which provides us with a bunch of tools out of the box to integrate the payment stuff, so we don't have to write that ourselves, which was great.
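The wiring is roughly the following. Again this is a sketch, not the exact demo code: the MCP registry helper, the way the registry combines with local tools, and the email_body variable are all assumptions on my part, so check the SDK docs for the real signatures.

```python
from portia import Config, McpToolRegistry, Portia

# Tools from the Stripe MCP server (list customers, list payment intents,
# create refund, etc.), combined with the two tools we defined ourselves.
stripe_tools = McpToolRegistry.from_stdio_connection(
    server_name="stripe",
    command="npx",
    args=["-y", "@stripe/mcp", "--tools=all", "--api-key=sk_test_..."],
)
tools = stripe_tools + [RefundReviewerTool(), HumanApprovalTool()]

portia = Portia(config=Config.from_default(), tools=tools)

# Upfront explicit planning: produce a structured plan we can inspect
# (and approve or abort) before anything actually runs.
plan = portia.plan(
    "Review the refund request in the email below against our refund policy, "
    "get human approval, then process the refund in Stripe.\n" + email_body
)
print(plan.pretty_print())

plan_run = portia.run_plan(plan)  # the execution agent runs the steps one at a time
```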
And I think I will show you the execution.
So this is now very small. Can people read that? Yeah? Okay.
So first example.
So I'll just show you actually quickly the refund policy.
So the company is a hoverboard company. They sell genuine, real-life hoverboards, like from Back to the Future.
And I generated a reasonable refund policy for a company like that. But as you can imagine, a hoverboard is quite expensive, and you want to check before refunding someone a million dollars or whatever that would cost.
So for the first example, I'm going to have a request from Marty at Portia Labs who says, I bought a hoverboard last week and it's not working. Can I get a refund?
So that's a pretty reasonable refund request. Hopefully the agent goes through and approves that. Although you never know.
So I can just talk through things that are happening as they're happening. So first what's happened here is the planning agent ran, and it's generated these steps here. So the steps down here, we've got the refund reviewer tool.
Great. That's what we wanted to check first. Like, is this a reasonable thing in the context of that refund policy? That produces an output, which is whether the agent thinks that we should refund, which then goes to the next step, which is the human approval tool step.
So that's that tool that I showed you the code for. This is where we would expect to get that clarification where I'm asked for input on whether I approve of this. And then if I do approve, then we've got, you can see these are the Stripe tools.
So we've got like this Stripe MCP list customers tool, list payment intents tool, refund tool. So if you've used Stripe before, that'll make sense to you. If you haven't, don't worry about it.
And I'm being asked if I want to approve this plan. So this is another mechanism of control we have. Because we've generated a structured plan up front, a user can decide whether they want to execute it or not. I can abort out of here if I don't like it. But I do, so I'm going to go ahead.
So we're now going to execute the steps one at a time. OK, so it's gone through. It's done the check, and it thinks the refund should be approved. And it's showing me this suggestion here, which is that you should approve the refund, and then it's given me its reasoning.
So this is great context for the customer service representative. It's telling them what the reasoning is and how that stacks up against their policy. So it should be approved. So I am going to agree with that. I'm going to type approved. And the agent then gets that response, and it can resume its work.
So if all goes well, it will then go through to start integrating with Stripe, running the Stripe tools and issuing the refund. So it's just checking that I did approve it. OK, great. So that's going ahead now.
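Under the hood, the loop driving that interaction is roughly the following. It's a sketch: the state enum and the resolve/resume method names are how I remember the SDK exposing this, so treat them as assumptions.

```python
from portia import PlanRunState

plan_run = portia.run_plan(plan)

# If a tool returned a clarification, the run pauses until a human responds.
while plan_run.state == PlanRunState.NEED_CLARIFICATION:
    for clarification in plan_run.get_outstanding_clarifications():
        print(clarification.user_guidance)  # the agent's suggestion and reasoning
        response = input(f"Choose one of {clarification.options}: ")
        plan_run = portia.resolve_clarification(clarification, response, plan_run)
    plan_run = portia.resume(plan_run)  # carry on with the remaining steps
```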
So we can see here, there's a lot of text. This is a developer-focused product, as you can probably tell. It's gone through, and it's listing the customers now. And in this flow, I know that there's nothing it needs to interrupt me for before the end, so it should run to completion.
OK, so it's invoking the create refund for $1,000. Who knows what currency that is? And it was approved. Oh, it's 10 pounds.
That's a cheap hoverboard. So it's reviewed and approved based on the policy within 30 days, claim defect, et cetera. Great. So that's all good.
But there are a few issues with this approach. So one thing is LLMs are just non-deterministic.
There is a chance that it won't include my tool in the plan before it issues the refund. And yes, we're checking the plan before we're approving it, but if I'm a customer service representative, I've seen a million plans, I might just click yes anyway and not really check.
So you can't be sure that that tool is actually going to be called.
And that gets especially bad when maybe you've got 100 other tools in your context. It might get confused between tools. It's just a recipe for mistakes.
You also need to prompt engineer things, so you need to prompt engineer your tool, and the way you prompt engineer it might need to change over time as you add other things into the context.
It can be difficult to do, and we don't really want our customers to be spending their entire time prompt engineering. That's something we view as we should be making easy for them if we can.
You need to create a new tool every time you have a different use case. So maybe it's not refunds, but it's something else. Now I need a new tool with new prompt engineering.
I have to go through that flow all over again. And then there is LLM overhead as well. It might not seem like a big deal, and maybe it isn't for this use case, but some of our early customers are very latency sensitive, mainly from just a product experience point of view.
If you want this to be happening in real time for the customer, I mean, you just saw the example, and I've got some caching which sped that up. But in real life, that would take around two minutes to run. And if you're just sitting there at a chat interface, that's kind of bad.
So I want to talk about another approach we have for control, which is execution hooks. For the non-developers in the room: most developers will be familiar with execution hooks, because lots of frameworks use them as a way of extending the framework.
But basically an execution hook is a way that developers can register handlers for events which allow them to say, like, when this event happens, I want to take this action before things continue.
So we added execution hooks after I created this initial demo, and then I went back to it and I said, like, okay, well, I want to do this again, but I want to do it with execution hooks.
We provide execution hooks for various parts of that process, so there is an execution hook that you can register after the plan is generated. Maybe you want to validate the plan.
You can have execution hooks that run before and after each step and before and after each tool call, and that gave me a place to implement something really useful. I can say: before a tool is called, check which tool is being called, and if I want to, raise a clarification at that point.
So if I show the example here, this is really the relevant part.
The rest of the code is pretty much the same. Like, we don't have that tool anymore.
But I have a before-tool-call hook. You could write your own function here, but we've provided this one by default in the SDK because it's so useful.
It's called clarify on tool calls, and it says: before a refund is issued, call this hook, no matter what. So I know now that if the agent decides to issue the refund, I will definitely get a clarification raised.
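In code, that's roughly the following. It's a sketch: clarify_on_tool_calls is the SDK-provided helper I mentioned, but the import path, the before_tool_call argument name and the Stripe tool id are from memory, so treat them as assumptions.

```python
from portia import Config, ExecutionHooks, Portia
from portia.execution_hooks import clarify_on_tool_calls

portia = Portia(
    config=Config.from_default(),
    tools=tools,  # the Stripe MCP tools plus the refund reviewer; no approval tool needed now
    execution_hooks=ExecutionHooks(
        # Before *any* call to the refund tool, pause the run and raise a
        # clarification. This fires regardless of what the planner put in the plan.
        before_tool_call=clarify_on_tool_calls("stripe_create_refund"),
    ),
)

plan_run = portia.run("Handle the refund request in this email:\n" + email_body)
```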
So this is where we're moving away from something that's a bit non-deterministic; really we're just falling back on traditional application development techniques to make sure that the behavior we want is implemented.
And for this particular use case, this is the behavior we want, right? We definitely want to make sure that a human is checked with before the refund goes out.
So let's run this again. It's pretty much the same. Maybe not that interesting to look at it at the same time, and you can't really tell what's going on under the hood. You just kind of have to trust me that it's an execution hook this time.
I guess what I can do this time is reject the request to show that that's a possibility as well. I guess just to show this briefly, we have the plan. Now we don't have a tool call.
There's nothing here that says, check with the human because this is just something that's part of this framework now, right? I can see that the createRefundTool is there, so I know that my execution hook is going to happen.
So if I kick that off, all the previous steps are going to be the same.
And while that's going: this is just one example of the trade-off I'm talking about, between something that's non-deterministic and something that's strictly controlled, and it's one place where you can explore that trade-off.
You might want this to be a tool call, something the agent decides whether it's going to do. For certain other use cases, that will be exactly what you want, and then the tool-calling way of doing this would be the way to go.
Maybe it's not a required input, or you just want the agent to be able to request some help or some additional context from the human, which might be optional. Then the planning agent can dynamically decide at runtime that it's going to check with the human at that point.
Where have we got to? Okay. So we've got the clarification.
Looks a little bit different, but it's pretty much the same thing. So I'm going to say no this time. It's the same request, so I don't know why I would say no.
But there we go. So it says the user rejected the tool call to the tool name, which hasn't rendered correctly.
But you get the idea. It's exited early rather than executing the rest of the plan, and it hasn't issued the refund to the user.
So, other mechanisms for control. I've talked about the trade-off between different ways of eliciting human feedback.
Other things that we are thinking about, one is evals. Obviously, the default answer to any question is really evals.
So for testing and observability, businesses can get confident by having a bunch of examples and validating across different use cases that the agent does what they want. So we have an evals framework to help with that.
Previous plans for few-shot learning. In this context, few-shot learning means including plans that you liked from the past in the context, so that the planning agent can draw on those examples. And LLMs love examples: give them some good examples and you can be pretty sure they're going to draw from them, maybe too much sometimes.
That's really helpful if you're saying: this is how I would expect it to go; you have freedom to change a few things around the edges, but this is generally the flow that I want. And that pairs quite nicely with an evals framework as well.
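To give a feel for it, it looks roughly like this. Another sketch: I'm assuming plan() takes an example_plans argument and that saved plans can be fetched back from storage, so check the SDK for the real API.

```python
# Plans from previous runs that a reviewer was happy with, fed back to the
# planner as few-shot examples so new plans follow the same general shape.
good_plans = [
    portia.storage.get_plan(plan_id)  # assumed accessor for previously saved plans
    for plan_id in previously_approved_plan_ids
]

plan = portia.plan(
    "Handle the refund request in this email:\n" + email_body,
    example_plans=good_plans,
)
```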
And there's an introspection agent, which is another agent I haven't really talked about. I talked about the planning agent and the execution agent, which are really the two main ones. The introspection agent runs at every step and monitors the flow, and you can plug in something of your own there, give it your own prompts, and have it check other things that are going on at the same time.
So yeah, you can play around with all of these different concepts and find the point on that trade-off that really works for your use case, whether you go really hard on the control end or go completely YOLO mode and let the agent do whatever it wants.
Yeah, that's the end of my talk.
I've just put a couple of links here. So one is the SDK if anyone wants to look at how these concepts are implemented. It's open source, so you can have a look at it.
All the good stuff is open source. And the Portia agent examples repo, you can find this example in there. So you can have a look there, and there's a couple of other ones as well if you're interested.
And there's my email address as well.
So that's my talk.