Right, so this is the third time I've given some version of this presentation, so hopefully it works for what we're trying to do today. But like I said, I'm going to move fast.
So what is Fitch Group? Just so you know where I'm coming from.
What I'm going to talk about today is AI agents: a very basic example of an AI agent, and then how we use agents in finance.
So, Fitch Group. You don't have to read this whole thing; just know that we are composed primarily of Fitch Ratings and Fitch Solutions.
We provide financial data, research, and analytics, and the thing we're probably best known for is credit ratings, which are a very important part of the finance world because a lot of investors use these ratings to inform their investment decisions.
We have a bunch of offices. We're about 5,000 people worldwide, co-headquartered in New York and London.
I have a lot of people in London on my team, so I agree with Josh: they're always asking what the risks are and what the governance looks like, and I'm always saying we need to move faster. Which is fine, because you need that balance to make things work.
All right, so Agents 101, show of hands, who has worked with agents or used AI agents? OK, so also about a third of the room.
Who has no idea what an AI agent is, or has never heard of one? OK, so a couple of you.
All right. An agent, at a very fundamental level, is an autonomous program or system that can perform specific tasks on behalf of a user (that would be you) or of another program. Today I'm going to focus on systems that use large language models to control the flow of the software application. I'll give you some learning resources at the end if you want to do more on your own time. By the way, let me start a timer, because I have a lot to cover and I want to make sure I don't go over time.
Key characteristics, and you'll see all of these in the demo. Agents have autonomy: they can make decisions and operate without continuous user intervention. They can perceive many types and modalities of data. They can take actions.
This is really, really important. To me, probably the most important part of an agent is that it has tools available to it to execute actions. And finally, agents are goal-oriented.
So imagine a self-driving car where you haven't put your destination into the GPS. It doesn't know where you want to go, so it's not going to do anything. You need a goal for the agent to do something.
Components. Remember these, because I'm going to refer to them during the demo. Agents have an action space, which is the set of tools they can use. They have the ability to plan: if you give them a very complex task, you also have to give them the means to break it down into subtasks and to reflect on whether they've planned and executed those subtasks appropriately. And finally, memory: the agent needs to keep track of what it has done so that it knows what to do next.
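To make those three components a bit more concrete, here is a minimal Python sketch of how they might fit together. This is illustrative only, not the code behind the demo, and all the names are made up:

```python
# Minimal sketch of the agent components: action space (tools), memory, and a goal.
# Planning, i.e. deciding which tool to call next, is what the LLM does in the loop shown later.
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class AgentComponents:
    goal: str                                                             # goal-oriented: no goal, no action
    tools: dict[str, Callable[[str], str]] = field(default_factory=dict)  # action space: tools it may call
    memory: list[str] = field(default_factory=list)                       # record of what has been done

    def act(self, tool_name: str, argument: str) -> str:
        """Execute one available tool and remember the result for later steps."""
        result = self.tools[tool_name](argument)
        self.memory.append(f"{tool_name}({argument}) -> {result}")
        return result
```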
All right, so I'm going to talk a little bit about autonomy. Again, I'm going to move really, really fast here.
First, human-driven software processes. You have code where you, the user, the human, have to instruct everything. You have to give the machine every single instruction so that it can work appropriately.
Number two, you have an LLM call. Now we're starting to look at some steps that the AI can do on its own.
The LLM call is just a single step. You also have LLM chains where you can have a series of steps.
You put in an input, you have an LLM, and you might take some other steps, maybe retrieving some data to enhance the input or to ground the LLM in your knowledge base so that it doesn't hallucinate. Then you send that to the LLM, and then you return the response.
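As a rough sketch of that chain shape, here is what a retrieve-then-call chain might look like in Python, using the OpenAI client as an example. The `retrieve` helper and the model name are just placeholders for whatever retrieval setup and model you actually use:

```python
# Sketch of an LLM chain: input -> retrieve context -> LLM call -> response.
from openai import OpenAI

client = OpenAI()

def retrieve(query: str, k: int = 5) -> list[str]:
    # Placeholder: a real chain would look up the k most relevant chunks in a vector store.
    return ["(retrieved document chunk would go here)"]

def chain(user_input: str) -> str:
    context = "\n".join(retrieve(user_input))      # ground the model in your own documents
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {user_input}"
    response = client.chat.completions.create(
        model="gpt-4o",                            # illustrative model name
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```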
Now we're starting to get into agentic behavior. So routers and state machines. I'm not going to talk too much about this because you're going to see it in the demo.
And then finally, a fully autonomous agent. The fully autonomous agent has a series of actions it can take, and the agent itself decides: am I done, or do I need to take additional actions to meet my goal?
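That loop, in sketch form, looks roughly like this. The `decide` function stands in for the LLM choosing the next action, and the whole thing is illustrative rather than a real implementation:

```python
# Sketch of the fully autonomous loop: keep acting until the agent decides the goal is met.
from typing import Callable, NamedTuple

class Decision(NamedTuple):
    finished: bool        # has the agent decided it is done?
    answer: str = ""      # final answer, if finished
    tool: str = ""        # otherwise, which tool to call next
    argument: str = ""    # and with what input

def run_agent(goal: str,
              decide: Callable[[str, list[str]], Decision],   # the LLM picks the next action
              tools: dict[str, Callable[[str], str]],         # the action space
              max_steps: int = 10) -> str:
    memory: list[str] = []
    for _ in range(max_steps):                 # cap the number of steps so it can't loop forever
        decision = decide(goal, memory)        # am I done, or do I need another action?
        if decision.finished:
            return decision.answer
        observation = tools[decision.tool](decision.argument)
        memory.append(f"{decision.tool}({decision.argument}) -> {observation}")
    return "Step limit reached without finishing."
```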
All right, so: demo, basic agent. And this example is based on... wait, sorry, bear with me, where is my... here we go. OK. This example is going to seem really basic, but I promise we're going to see a more interesting one next. I just want to show you the differences between levels of autonomy. So here, under the hood, there's an LLM, and I can query it.
This is a very simple LLM. It doesn't have access to any tools or anything.
For those of you who use ChatGPT regularly, or maybe Anthropic's Claude, know that there's a lot going on under the hood there. Those chats already have agentic behavior, so don't think of it like that. This is pretty much plain LLM input and output.
So this query here is going to be the input. I can ask something like, let me copy this from here, I can say something like: hey, what's the weather like in December in New York City? I'm traveling there, so I would like to know what to pack. If I send that, it's going to take a while to come back because I'm on a hotspot, and it's going to tell me: hey, you know, in December in New York City, blah, blah, blah. It's going to give me a lot of information that's already in its training data.
However, if I ask a question like, what's the weather right now, what do you think it's going to tell me? Well, if you've interacted with some of these models, especially a long time ago before they had a lot of agentic behavior added to them, it's going to say: I'm unable to provide real-time weather updates, et cetera, et cetera. You probably remember using ChatGPT back in the day, when it said: I don't have any knowledge past April 2023, or whatever the training data cutoff was.
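For reference, the plain input-output call underneath looks roughly like this, using the OpenAI Python client as an example; the model name is illustrative and the demo's actual setup may differ:

```python
# Plain LLM input/output: no tools, no retrieval, just the model's training data.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",  # illustrative model name
    messages=[{"role": "user", "content": "What's the weather right now in New York?"}],
)
print(response.choices[0].message.content)
# With no tools attached, the model can only answer from its training data,
# so it typically replies that it can't provide real-time weather.
```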
OK, so what can we do to improve this? So a collaborative agent, not yet fully autonomous, but now I give the LLM some tools. And I'm going to show you the full response so you see what I mean.
So now I can ask: what's the weather in New York? OK, so let's ask this. It's going to fetch some results, and it's going to take a little bit longer, because now it has some tools available, so there's a little bit more happening in the chain. OK: the weather in New York right now is 33 degrees and clear. If you just walked in, you probably know that's accurate; it's about 33 degrees.
And if I show you the full API response, you're going to see that not only do I have an AI message, which is basically the query I sent to my LLM, but now the LLM has a tool available to it: an API that it can query. It takes that question and says, all right, let me call the weather API with the location New York. I get a response, and the response contains a bunch of data about the weather in New York. I pass that data back to my agent, and then my agent creates a final answer, which is what it serves to me as the user.
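Under the hood, that exchange is the standard tool-calling round trip: user question, tool call from the model, tool result, final answer. A rough sketch, assuming a hypothetical `get_weather` helper that wraps whatever weather API you use, might look like this:

```python
# Sketch of one tool-calling round trip: question -> tool call -> tool result -> final answer.
import json
from openai import OpenAI

client = OpenAI()

def get_weather(location: str) -> str:
    return f"33 degrees and clear in {location}"   # stand-in for a real weather API call

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a location",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    },
}]

messages = [{"role": "user", "content": "What's the weather right now in New York?"}]
first = client.chat.completions.create(model="gpt-4o", messages=messages, tools=tools)

# We assume here that the model chose to call the tool rather than answer directly.
call = first.choices[0].message.tool_calls[0]
result = get_weather(json.loads(call.function.arguments)["location"])

# Feed the tool result back so the model can write the final answer for the user.
messages += [first.choices[0].message,
             {"role": "tool", "tool_call_id": call.id, "content": result}]
final = client.chat.completions.create(model="gpt-4o", messages=messages, tools=tools)
print(final.choices[0].message.content)
```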
But I still have to say: hey, what's the weather right now in New York? If I just say, what's the weather right now, what do you think is going to happen? It still gives me the weather, but it gave me the weather for... location, name... I have no idea what this is. Oh, the Philippines. OK. So it gave me the weather for somewhere in the Philippines. How did it choose the Philippines? I have no idea. Again, this is a demo, so I haven't dug too deep into it.
Finally, now I have a fully autonomous agent. I'm giving it an additional tool, and what I'm doing here is using the Chrome location API as a proxy for what a tool would be. Typically this is not something you would see when you're using the web; here I'm just going to show you. It's saying: hey, location, I'm in Maryhill. We are in Maryhill, so close enough.
So now I can ask: what's the weather today? And you're going to see that it gives me the weather for New York, because it knows that I'm in New York. If I show you the question that gets passed to the weather API: because I've already given the agent location data, it didn't ask the API, what's the weather? It asked, what's the weather in New York, even though I didn't ask what's the weather in New York. So now I have an agent that has access to two different tools, and it knows how to use them.
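Continuing the earlier sketch, the only change for this fully autonomous version is a second entry in the tool list, something like a location lookup, so the model can chain the two calls on its own. Again, these are hypothetical names; in the demo the browser's geolocation plays this role:

```python
# Second tool: the model can call this first, then pass the result to get_weather itself.
def get_location() -> str:
    return "New York"   # stand-in for the Chrome geolocation lookup used in the demo

tools.append({
    "type": "function",
    "function": {
        "name": "get_location",
        "description": "Get the user's current city",
        "parameters": {"type": "object", "properties": {}},
    },
})
# Now "What's the weather today?", with no city mentioned, resolves correctly:
# the model calls get_location, then get_weather("New York"), then answers.
```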
So everybody's going to say, OK, cool, but that demo kind of sucks. Can you give me some more interesting stuff? Yes, I can give you some more interesting stuff.
At Fitch, we have internal and external client-facing RAG applications. For those of you who aren't familiar with RAG (retrieval-augmented generation), a RAG application is basically a chatbot that has a specific knowledge base. A lot of companies have them right now, so you've probably interacted with one in the last several months.
OK, now, RAG itself... let me remind myself, do I have some diagrams here? Yeah. LLM chains: this is basically how a RAG application works.
The problem with a RAG application is that there are a lot of components you need to define ahead of time. For example: how many documents do I want to retrieve in order to answer the question the user asked?
Depending on the complexity of the question, the retriever may or may not return enough documents. How can I solve that? I can solve it by using an agent.
And the tool that I give to the agent... well, let me just read you this really quick: you're an AI assistant helping a financial analyst, blah, blah, blah, and here's some context, which is the user question. That would be a simple RAG application.
Now, for the agentic RAG application, I'm saying the same thing: hey, you're an assistant, et cetera, et cetera. But you can do one of two things: you can search for document chunks, or you can return an answer. Only return an answer once you are done searching. That part is very important.
I'm also providing it with what I call a scratch pad. In that scratch pad, it retains its memory: it keeps only the information that it thinks is relevant to answering the question, and it uses the scratch pad to plan its next step.
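Put together, the agentic RAG loop looks roughly like the following sketch. The `search_chunks` and `llm_decide` helpers are hypothetical stand-ins, not our production code:

```python
# Sketch of an agentic RAG loop: the LLM either searches for more chunks or returns an answer,
# and the scratch pad carries only the information it judged relevant so far.
from typing import Callable

def agentic_rag(question: str,
                search_chunks: Callable[[str], list[str]],                # retriever over the knowledge base
                llm_decide: Callable[[str, list[str]], tuple[str, str]],  # ("search", query) or ("answer", text)
                max_steps: int = 8) -> str:
    scratch_pad: list[str] = []                    # memory: relevant facts and the plan so far
    for _ in range(max_steps):
        action, payload = llm_decide(question, scratch_pad)
        if action == "answer":                     # only allowed once it has finished searching
            return payload
        chunks = search_chunks(payload)            # retrieve another batch of document chunks
        scratch_pad.append(f"Search '{payload}' found: {chunks}")
    return "Step limit reached before the agent was ready to answer."
```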
Enough explanation. What does that actually look like? Let me run through this notebook.
What I have provided here are roughly 10 or 11 companies, basically these companies, and for each I have provided quarterly reports. All of these companies are public, so they have to report to the SEC every quarter.
So that's roughly two years of reports, which would be what, eight to twelve reports for each one of these? So there are about 120 documents in my knowledge base. And I'm just going to ask questions.
At first, I'm going to start with a very simple question: what was Meta's total revenue in 2022? OK, so let's run through this. This is the simple RAG; I'm just running the simple RAG.
This doesn't have agentic behavior. And you can see what this is calling: I'm using GPT-4 under the hood. To show you that I'm actually using GPT-4 under the hood, and that this demo is at personal cost to me for all of you, you can see that I'm incurring costs right now as I make calls to the API. All right, so now... you're welcome.
It's about a fraction of a cent per query, so it's not too much. All right. You can see that the simple RAG is telling me: hey, Microsoft's total revenue in 2022 was 198. For some reason it's formatting it in European style, I think. So, 198 billion. And then Meta's, same thing, 116. And you can see it retrieved five documents. Why? Because my retriever is looking for the five most relevant documents.
Great, cool. Same information for Meta. You can Google this and see that it's true. Now, what happens if I start to ask more complex questions?
Here I'm just defining some steps for additional visualization. OK, so here I'm asking the exact same question.
One thing you'll notice is that there are a lot more calls happening to the API. Again, this is a demo; this would be very costly if you had to do it thousands and thousands of times per day with many, many users.
At Fitch, there are a lot of steps we take to make this cheaper, so every single step doesn't necessarily use the most expensive GPT model. If you're interested in any of that, talk to me; there are some presentations I've done elsewhere, available online, about some of the things we do to reduce costs in the actual production application.
OK, and now if I visualize these results... let me show you this. Hold on, it's still doing the third one; I'm getting a bunch of responses from OpenAI. OK, so now you can see that every single step is fully autonomous. I didn't tell the LLM to do this; the LLM decided what it needed to do.
So step one, it's an agent, and it's making a decision; it decided that it needed to search. Step two, it's searching: it's querying and retrieving documents, and here you can see the documents that it retrieved.
In step three, it said, OK, I'm updating my scratch pad. That means I'm only keeping enough information in memory that I need to complete the rest of the steps. Step four, again, I'm making a decision.
Step five, it decided that it has enough information to answer the question, so it's serving the answer right back to me. And it's saying: hey, the total revenue was 198, which is the exact same answer we got previously.
Now, give me one second, agentic RAG number three. Sorry, I forgot to show you this question. There is a third question: what was the total revenue in 2022 for both Meta and Microsoft?
That question is more complex than just, what was the revenue for Meta, or, what was the revenue for Microsoft, separately. You can see that the simple RAG didn't answer the question. It told me: I don't have information on Microsoft's revenue; however, for Meta, the total revenue was whatever. Why didn't it have enough information?
Because it retrieved five documents, and in those five documents it didn't happen to find the information. Now, my agentic RAG, you'll see here, took several more steps. So again, step number one, it's an agent, and it's making a decision. Step number two, it's searching.
You'll notice that most of the documents are Meta documents, because it's searching for Meta first. It updates the scratch pad, makes a decision, and now, this is the important part: it decided, I don't have enough information to return an answer to the user yet, so I'm going to do another search.
And now you'll see that all of the documents it retrieved are Microsoft documents. OK? Let me update my scratch pad, because now I have information for Microsoft. Let me make a decision. Now I have enough information to return the answer to the user, and I can say: the total revenue for Meta was whatever it was, and the total revenue for Microsoft was whatever it was.
OK, I'm about to finish.
What happens if I want to ask which company had the highest revenue out of these three companies? I hope this runs. It might time me out because I've made too many queries in a row.
So while that's running, let me come back here. I just want to give you some takeaways.
Yeah, that's still running, and it's still getting good responses. OK, I hope it continues to run. We'll see the answer in a little bit.
But things to remember. So what are the key differences?
Autonomy is very important. Traditional software does not have autonomy; an agent does.
Software may perceive data, but an important part of an agent is that it actively interprets that data, and it has autonomy over how to interpret it.
Agents can take actions. And again, they have goal-oriented behavior.
Are we finished? Ah, it didn't run. Too many requests.
But come see me after the presentation if you want to know more about this. With that, any questions?
You're going to be here afterwards, right? Yes. Perfect. Just so that we don't run over and have cold pizza at the end.
Perfect. Good. Yes.