Deconstructing AI Agents: How to Build Workflows that are 3x Cheaper and More Reliable

Introduction

So hi everyone, I'm Guillaume. It's a lot of pressure to be Jasper's favorite speaker, but I think all three talks will be amazing.

Just one word about myself: I'm currently working as a growth manager at Softr, a no-code platform to build business apps. I also create some content on YouTube and LinkedIn about AI app building.

And I also recently launched an AI app called Lukio that does RAG, a bit like NotebookLM, but more API-first.

And yeah, I wanted to come here to talk a bit about efficiency for AI orchestration. Maybe to get started:

Has any one of you already looked into how agents call the tools they have, or how an MCP actually works in the details? Anyone? Yeah, two, three, four

Workflows vs. Agents: A Quick Benchmark

people. Okay. Maybe to introduce the topic: a platform I'm using a lot is n8n, which you might be familiar with. It's an automation platform where you design and map out your different nodes, each of which can be an API call, an app integration, or an AI step.

And here I have two options that do exactly the same thing. The first option is an agent that has what it needs to do available as tools; the second one is an AI workflow. Do you have an idea which one is actually faster, and which one is cheaper? Any guess?

"Workflows are faster and cheaper." Yeah, it's actually the case. Otherwise, that wouldn't be the title of my presentation.

But I ran the tests, and the two do exactly the same thing.

What the Demo Workflow Does

Basically, the workflow takes a database in Softr databases, so a bit like Airtable.

We list all the tables, so we get the schema of each table, and then, table by table, we get all the values. Basically, we have the full schema of the database, and then

Measured Results: Faster and Cheaper

we ask an AI to summarize it and explain what this database could possibly be about. And yeah, in the second case we're almost twice as fast and nine times cheaper. This is related

to the specifics of how AI agents work versus AI workflows. So my goal is really to give you a bit of the curiosity, the interest, to understand the anatomy of AI agents,

Why This Happens: Agents, Tools, and MCP

MCPs, etc., as we start to rely more and more on the autonomy these agents have. Claude Code, for example: you tell it what you want to do and it will call a bunch of tools it has, or that you provided via MCP, etc. You have a question? Yeah.

MCP (Model Context Protocol) in Plain Terms

So MCP stands for Model Context Protocol. Basically, if YouTube had an MCP, for example (maybe it does, I don't know), it would list all the API calls you can make to YouTube.

And you give that to an AI agent, which will then have the list of possible tools and be able to call them for you. And yeah, many people use that in Cursor, Claude Code, Antigravity, etc., because it's very convenient.

The agent has the list of all the tools, maybe up to 100 it can call, and it just does it for you when needed. But actually, there is a lot of efficiency to be gained from really mapping out what you want your AI workflow or AI flow to do.
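To make the "list of tools" idea concrete, here is a minimal sketch of what an MCP server's tools/list response might look like. The tool names, descriptions, and the hypothetical YouTube MCP are assumptions for illustration, not a real server's output:

```python
# Illustrative sketch of the JSON an MCP server could return for a
# "tools/list" request: each tool has a name, a description, and an
# input schema the agent uses to build its calls.
# The YouTube tool names here are made up for this example.
mcp_tools_list = {
    "tools": [
        {
            "name": "youtube_search_videos",
            "description": "Search YouTube videos by keyword.",
            "inputSchema": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        },
        {
            "name": "youtube_get_video",
            "description": "Fetch metadata for one video.",
            "inputSchema": {
                "type": "object",
                "properties": {"video_id": {"type": "string"}},
                "required": ["video_id"],
            },
        },
    ]
}

# The agent reads this list and picks a tool by name when needed.
tool_names = [t["name"] for t in mcp_tools_list["tools"]]
print(tool_names)  # ['youtube_search_videos', 'youtube_get_video']
```

The key point is that this whole list, schemas included, gets injected into the agent's context, which is why a 100-tool MCP is not free.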

And we'll be going through the different problems or challenges that AI agents have.

How Agents Actually Execute Tool Calls

The first one: your agent has access to tools; that's what makes it an agent. It will then have some sort of autonomy to call the tools you have provided. For example, one tool can be "Google Calendar: list my events" and another can be "Google Calendar: create a new event".

So if you tell this agent, "Hey, check if I'm free tomorrow afternoon, and if so, create an event for me," it will first call the first tool to list all your events. Based on that, it will either reply directly to you ("Hey, you're not free; when is a better time?") or call the second tool to actually create that event.
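The calendar loop above can be sketched in a few lines. Everything here is stubbed: `fake_model_decide` stands in for the LLM's second reasoning pass, and the tool functions replace real Google Calendar calls; all names are assumptions for illustration:

```python
# Minimal sketch of the two-tool agent loop described above.
def list_events(date):
    # Stub for "Google Calendar: list my events":
    # pretend the afternoon is already booked.
    return [{"title": "Team sync", "date": date, "slot": "afternoon"}]

def create_event(date, title):
    # Stub for "Google Calendar: create a new event".
    return {"created": True, "date": date, "title": title}

TOOLS = {"list_events": list_events, "create_event": create_event}

def fake_model_decide(events, date):
    # Stands in for the LLM's second pass: if the slot is taken,
    # answer the user directly; otherwise call the second tool.
    if any(e["slot"] == "afternoon" for e in events):
        return {"action": "reply",
                "text": "You're not free. When is a better time?"}
    return {"action": "call_tool", "tool": "create_event",
            "args": {"date": date, "title": "New meeting"}}

def run_agent(date):
    events = TOOLS["list_events"](date)         # first tool call
    decision = fake_model_decide(events, date)  # second model pass
    if decision["action"] == "call_tool":
        return TOOLS[decision["tool"]](**decision["args"])
    return decision["text"]

print(run_agent("2025-06-12"))  # You're not free. When is a better time?
```

Note that even this tiny flow already needs two model passes (plan, then react to the tool result), which matters for the cost point that follows.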

And the system prompt that you give this agent, which explains, for example, "Okay, you are an HR assistant, you do this and that, you have this tool, you have this other tool,"

Hidden Cost #1: System Prompt Repetition

actually gets repeated: whenever there is a tool call and the AI agent has to think again about its next move, it resends the system prompt you provided over and over. So in this

case, we were calling, like, three... actually two tools. The first AI run was figuring out what to do as a first step; it called a tool; then it has to rerun, which is another API call to OpenAI for example, to analyze the result and think about what to do next. So if you have a system prompt that's 2,000 tokens, it will be consumed three times, which has a cost.

Then, of course, if you have a system prompt that explains how to deal with five tools, it dilutes a bit how specialized your agent is, and it also gets repeated. So if the agent has five tasks to do, it will burn the same amount of tokens for each of those five tasks.
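The repetition cost is easy to estimate. A back-of-the-envelope sketch, with a placeholder price (not real vendor pricing):

```python
# A 2,000-token system prompt re-sent on every model pass:
# one initial planning pass plus one pass after each of two tool calls.
system_prompt_tokens = 2_000
model_passes = 3
price_per_1k_input = 0.005  # assumed $/1K input tokens, illustrative only

# Tokens spent on the re-sent copies (beyond the first send).
wasted = system_prompt_tokens * (model_passes - 1)

# Total spend on the system prompt alone across the run.
cost = system_prompt_tokens * model_passes * price_per_1k_input / 1_000

print(wasted, round(cost, 4))  # 4000 0.03
```

So even before any user message or tool output, this agent pays for 6,000 system-prompt tokens per run instead of 2,000, and the gap grows with every extra tool call.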

Hidden Cost #2: Tool Outputs Inflate Context

A third one: when the agent calls a tool, it gets the response of that tool back. So for example, your list of events on your Google Calendar, or here, a sort of Perplexity called Linkup, where you get your answer.

So I cut part of it, but you also get the full list of sources and the extracted text from each source, and this all counts toward the input tokens that you actually use for that

Hidden Cost #3: Model Overkill (and When AI Isn’t Needed)

agent. So that's not optimal. A fourth aspect is that if your agent has, let's say, four tools, but one of the tool calls is extremely simple, you will still use the same, maybe big, model as for all the other tasks. So it's overkill: it's slow and it costs a lot. And there are also some actions that shouldn't involve AI at all, as we

might see. I'll let you appreciate the memes; I'm not good at making them, but I had fun producing those. Yeah, sometimes you would be more efficient, instant, and predictable if you were just using logic and code, but of course it's convenient to just let an AI figure it out so you don't have to orchestrate it yourself. So, quickly, the

An Optimization Playbook

Five Practical Levers to Improve Efficiency

benefits can be speed, reliability, cost, and also understanding what really happens in your AI flow or feature or whatever. So I've prepared five potential solutions, and maybe you can tell me which ones you want to look into in particular.

So the first one is breaking agents into different steps. Instead of having an agent, you will have multiple steps of AI calls.

The second one is to actually route the intent to send the workflow to the most relevant path.

The third one is smarter tools: how to clean the output of a tool, or have smarter tools. The fourth one is how to

convert AI steps into logical steps, to not use AI at all in some cases. And the

last one is kind of replicating the way skills work (you might have heard this term with agents now) with actually very simple AI logic. So,

Anyone can shout the number you want to explore first.

"I like number three, the smarter tools." Yeah, number three. So, smarter tools, okay.

Deep Dive #1: Smarter Tools (Cleaner, Smaller Outputs)

So we saw that the response, the output of the tool call, becomes part of the overall context of the agent. If there are things in it that are not useful, it costs tokens, but it also creates noise. To deal with that, in n8n it's very easy to

Wrap Complexity in Sub-Workflows

build: you can create a sub-workflow, which is something you can call as an agent tool. That's the case here: whenever it's being called, it will arrive here and run a whole process, to finally output, as the tool output, the result of your workflow. And here is the example of a RAG agent that takes the user query and then calls a RAG tool,

so, AI knowledge retrieval from a big database. And we'll give it a try; it will take a minute, but I think that knowledge base is about a biology course, so we can ask "What are animals?" or whatever, and let's see how this works. There are two aspects that we'll ideally be able to observe. The first one is the way I

Batching Sub-Questions to Reduce Tool Calls

configured this sub-workflow as a tool: I actually ask it to input an array of questions for my RAG, which is a way to let the AI break the topic down into multiple sub-questions. It actually didn't do it here because the question was very simple, but if the question were complex, in one tool call you can ask for an array, which is a list of questions, and they will all run together instead of doing three tool calls. So that's one optimization. I

mean, we just logged into this laptop, it's not perfect, so the sub-workflow actually didn't go through.
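The array-of-questions idea can be sketched like this. The tool schema and the `knowledge_search` stub are assumptions for illustration; in the real setup this would be the sub-workflow's input definition and the vector-store query:

```python
# Sketch of batching: the RAG tool accepts an ARRAY of questions,
# so the model can split a complex query into sub-questions in a
# single tool call instead of calling the tool once per sub-question.
rag_tool_schema = {
    "name": "knowledge_search",
    "inputSchema": {
        "type": "object",
        "properties": {
            "questions": {            # array, not a single string
                "type": "array",
                "items": {"type": "string"},
            }
        },
        "required": ["questions"],
    },
}

def knowledge_search(questions):
    # Stub: run every sub-question through the vector store together.
    return [{"question": q, "chunks": [f"chunk for: {q}"]}
            for q in questions]

results = knowledge_search([
    "What defines an animal?",
    "How do animals differ from plants?",
])
print(len(results))  # 2 answers from ONE tool call
```

One tool call instead of three means one fewer round of system-prompt repetition per avoided call, which compounds with the cost issue from earlier.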

Post-Processing: Filter IDs, Aggregate, and Return Only What Matters

But basically the principle here was to split out this array of maybe three sub-questions and then, for each of them, run them through our vector database, which is this RAG approach,

clean the outputs to make sure we only keep the keys that are relevant to us and remove all the IDs, et cetera, that we don't need. And the last step is to aggregate the three responses in a nice format and give them back to the agent, which can work with that. So that's one way to optimize.
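The two post-processing steps (filter keys, then aggregate) can be sketched as below. The field names (`id`, `score`, `metadata`, `text`) are assumptions about what a vector-store hit might contain:

```python
# Sketch: keep only the useful keys from each vector-store hit,
# then aggregate everything into one clean block for the agent.
hits = [
    {"id": "vec_91ab", "score": 0.92, "metadata": {"page": 3},
     "question": "What are animals?", "text": "Animals are multicellular..."},
    {"id": "vec_17cd", "score": 0.88, "metadata": {"page": 9},
     "question": "What are animals?", "text": "They are heterotrophic..."},
]

KEEP = ("question", "text")  # drop ids, scores, metadata: noise + tokens

def clean(hit):
    return {k: hit[k] for k in KEEP}

def aggregate(hits):
    lines = [f"- {h['text']}" for h in map(clean, hits)]
    return "Retrieved context:\n" + "\n".join(lines)

print(aggregate(hits))
```

The agent never sees `vec_91ab` or the scores, so those tokens are never paid for, and the model has less irrelevant material to get distracted by.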

Any other that you would like to check? Yeah, which one?

Deep Dive #2: Intent Routing

"Map and route intent." Yeah, so that's the second one. That's actually an interesting one.

Use a Cheap Router Before Invoking Expensive RAG/Agents

So I have here a simple AI chatbot where you can just ask your question. You know, when you create an AI chatbot, you can plug in a very powerful agent that has, for example, RAG, as we saw earlier. But actually, sometimes the person just says "hi", and you don't need a big model to respond to "hi". You don't need RAG, you don't

need all these tools, you don't need a big system prompt. So here is what I've added, and we can actually try saying "hi" (I hope this works, because otherwise that would be inconvenient). We have here just an intent router,

and I don't know why I cannot... okay, that worked. And I just defined two categories: knowledge retrieval is needed, and knowledge retrieval is not needed.

So I could give it a very small model; I think it's a Mistral that really doesn't cost anything. And it handed the message to an AI

whose system prompt is just "You're a helpful assistant; here is the message." It answers this intent very fast and very cheaply. Whereas if the question were more elaborate, it would go to this tool that queries another knowledge retrieval tool, and then the AI

would craft the final answer. And for "hello, how are you", you don't need that. So if you have more intents, for example informational intent, commercial intent, etc., you can tailor your agent to handle that, with a very cheap router here that costs 0.002 cents to route.
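A minimal sketch of this routing pattern: a cheap classification step decides whether the expensive RAG agent is needed at all. Here `classify_intent` is a keyword stub standing in for the small-model call, and both response paths are placeholders:

```python
# Cheap intent router in front of an expensive RAG agent.
def classify_intent(message):
    # Stub for the tiny, cheap model: in the real workflow this is
    # one small-model call with two allowed labels.
    greetings = {"hi", "hello", "hey", "how are you"}
    if message.lower().strip("!?. ") in greetings:
        return "no_retrieval_needed"
    return "retrieval_needed"

def handle(message):
    if classify_intent(message) == "no_retrieval_needed":
        # Cheap path: small model, tiny system prompt, no tools.
        return "(cheap model) Hello! How can I help?"
    # Expensive path: RAG tool call + final answer crafting (stubbed).
    return "(rag agent) ...retrieved answer..."

print(handle("Hi"))                       # takes the cheap path
print(handle("Explain photosynthesis"))   # takes the RAG path
```

More intent categories (informational, commercial, ...) just mean more labels for the router and more dedicated downstream paths.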

So the router just decides that the "hi" doesn't have to be answered by the complicated RAG agent, but it's not answering the "hi" itself: it hands the message to another agent, one connected to the same cheap model, which then answers. Maybe let's go very quickly through a final one.

Deep Dive #3: Replace AI Steps with Deterministic Logic

Yeah, I think the one about converting AI steps into logical steps is really interesting.

So that's a website on which we've implemented an AI chatbot, and I will just say it actually works in the same way to respond to a "hi", I think. Or actually not this one; that was a couple of years ago.

Example: Email Collection Without an LLM

And I'll just send a couple of messages, because we have set up a way to verify. Yeah, here: after five messages, it asks you for your email.

And here we have a workflow that basically checks the response. If we know that we're expecting your email, there is no need for an AI to check whether there is an at sign and it looks like an email. So that's

the big agent that does all that, and you will see there are many paths where we actually don't use the AI for the response, because we just prepared a response. For example, here (it doesn't open, but it basically says) "Looks like your email is missing",

and we don't need AI for that. So that's another way to optimize your agents.
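The deterministic email branch can be sketched in a few lines: when the workflow knows it just asked for an email, a regex check replaces the LLM call entirely, and the fallback is a canned message. The function name and response shape are assumptions for illustration:

```python
import re

# Simple "looks like an email" check: something@something.tld,
# no whitespace. Deliberately loose, like the at-sign check described.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def handle_email_reply(message):
    """Deterministic branch used when the bot just asked for an email."""
    candidate = message.strip()
    if EMAIL_RE.match(candidate):
        return {"path": "store_email", "email": candidate}
    # Prepared response, zero AI tokens spent.
    return {"path": "reprompt",
            "reply": "Looks like your email is missing."}

print(handle_email_reply("jane@example.com"))
print(handle_email_reply("I'd rather not say"))
```

This path is instant, free, and fully predictable, which is exactly the trade described: code where code suffices, AI only where it's actually needed.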

So I'll be stopping there and we can chat later about the other ways.

Conclusion: Choosing the Right Approach

If we summarize: whenever you need something that works forever, and you're building an AI feature with steps that you can actually plan and whose behavior you understand,

then definitely consider orchestrating these AI steps yourself, with some logic as well, et cetera, to do things properly and to have control over what's happening.

If you need some level of autonomy, for example customer-facing chatbots, where the system has to kind of know what tools to call and in what order, and to adapt based on that,

you can consider agents, but you can optimize their tools, and also make sure that before you surface the user message to the agent, you fetch all the data it needs and provide it directly to that one agent, which will then be in the best position to make all the decisions.

And then if you need occasional assistance building something, you can use these agentic solutions, MCPs, which really abstract away many aspects and are very convenient. But definitely, if you are building something that needs to run forever, there is a lot

of optimization to look into, and it's actually really interesting to understand how it works, what is in a tool call, et cetera. So thank you for listening.
