LLM at Work - Using Agentic LLMs for Data Processing and Decision Making


Introduction

Hi, my name is Eric. I am one of the directors at Hivemind Technologies.

We're a consultancy focused on helping our clients build cloud data and AI machine learning platforms. And we've got offices in Berlin and in London.

And today I will be talking about agentic large language models for decision making. So that's a bit of a mouthful, isn't it? And I hope I will shed some light on what that means.

Common Uses and Challenges of Large Language Models

I'm going to start with what most of us do when we work with large language models. We've got an application, and we want to use generative AI. So the most obvious thing to do is to use one of those inference endpoints, as we call them, the public APIs, hook them up with our application, call them, and hope that we're going to get some really good results. We're basically asking a very large LLM to give us answers, and we're going to spend a lot of time optimizing our prompts to get good results.
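To make that concrete, a minimal sketch of this naive approach might look like the following, assuming the OpenAI Python client (v1-style API), an API key in the environment, and an example model name:

```python
# Minimal sketch of the naive approach: send a prompt to a hosted inference
# endpoint and hope the answer is good. Assumes the OpenAI Python client
# (v1 style) and that OPENAI_API_KEY is set; the model name is only an example.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",  # example model name
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Classify this support email as billing, technical or other: ..."},
    ],
)

print(response.choices[0].message.content)
```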

Of course, we all know that the price of tokens on those public APIs is in a race to the bottom and getting cheaper and cheaper, but it is still expensive. And a recent survey revealed that a lot of the companies that were early adopters of those API approaches are dissatisfied with the cost-benefit of their implementations, because those large language models tend to break the bank.

The other issue is that latency is also not what we expect. It takes some time to get that answer; we all know from asking ChatGPT how long it takes to generate a response. And if you implement an application that does that, well, latency has to be controlled.

And I mean, that's kind of obvious, because those large language models are huge, they've been trained on petabytes of data, they have to generate that answer, and they're hit a lot. And despite the fact that we're spending billions of dollars on infrastructure to buy ever more Nvidia GPUs, this is still an issue.

And lastly, I mean, this is Germany, right? So privacy is a real concern: there's a big worry about data leakage, about compliance issues, and about GDPR.

So there must be something else.

Typical Applications of LLMs

So if we just take a step back and look at what we want to do with large language models most of the time: we want to do classification. We just want to say, look, I've got this text, can you tell me what it is?

That basically is really useful when you do some, let's say, parsing of text, of documents, legal documents, or emails, or letters, or whatever.

Summarization. We all use that when we want to summarize our meetings. We create a transcript of the team session, and then we ask the large language model to summarize that meeting.

And lastly, we also want to use large language models to do extraction. So basically, read a text and pull out addresses, names, numbers, or other specific patterns.
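To make the extraction case concrete, here is a minimal sketch that asks a model to return structured JSON. The call_llm helper is hypothetical and stands in for whichever inference endpoint you use:

```python
import json

def call_llm(prompt: str) -> str:
    """Hypothetical helper: send the prompt to your LLM endpoint, return the raw text."""
    raise NotImplementedError

EXTRACTION_PROMPT = """Extract all person names, postal addresses and monetary amounts
from the text below. Respond with JSON only, using the keys
"names", "addresses" and "amounts".

Text:
{document}
"""

def extract_entities(document: str) -> dict:
    raw = call_llm(EXTRACTION_PROMPT.format(document=document))
    return json.loads(raw)  # in practice you would validate and repair the JSON
```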

Funnily enough, on the privacy point we discussed earlier, large language models are extremely good at detecting personal information, so they can also be used to fix privacy problems. Two sides of the same coin, right?

Other tasks are retrieval oriented. A lot of people use large language models to generate answers from specific databases, or to query a specific, controlled data set.

And to a certain extent, we're also expecting large language models to have some limited reasoning, especially when we embed them programmatically. We want the large language model to make small decisions that we can, to a certain extent, control.

Of course, the internet, and especially Twitter and LinkedIn and so on, is full of examples where basic logic is violated by large language models. And that's pretty clear, because large language models can't reason like we do. They do not understand logic. They are only very, very good at generating the most likely, most plausible-sounding answers.

Challenges with LLMs

So that brings us to another problem with large language models. It's been discussed a lot, confabulation or hallucination.

Because large language models are probabilistic, they're basically like management consultants: they will generate an answer that sounds right but is not necessarily right. And I can say that because I am one. I'm a consultant.

And another aspect is that large language models internalize their knowledge. Everything a large language model tells you is limited to what it was trained on. If we look at the original ChatGPT, for instance, its knowledge stops in 2021, because the model simply wasn't trained on anything beyond that point.

And specifically, when we want to change that with LLM-centric methods, we have to retrain the model or fine-tune it. And that's complex and expensive.

And of course, as we discussed, privacy is a problem, especially with the public inference APIs, right? Gemini, OpenAI, et cetera, because they can use the submitted queries to train their models. There's the famous Samsung example, where employees pasted source code into such a service without realizing that those inputs could be used for training, and fragments of it later turned up in generated answers.

Solutions and Advanced Approaches

So one way to solve this problem of controlling the results is to use a pattern called RAG, retrieval augmented generation. So the idea behind RAG is to say, OK, we've got a large language model.

And we know that the large language model only has knowledge up to, let's say, 2022. And we want to use a very specific, controlled data set.

So we're going to use an external data source, a vector database or vector store of sorts. We embed the content so that it can be searched, and the large language model is instructed to generate its answers from that store.

This is a very potent approach to controlling results, because the model cannot confabulate or hallucinate as easily: it is instructed to answer from the vector store, which makes the results much better. And it has an added benefit.

You can update the store, the data set, without having to retrain or replace the large language model.

You can use an off-the-shelf large language model, instruct it to use the vector store, limit the answers to that vector store, and basically control your results. That's already a much better approach to building LLM-based applications than the initial naive approach of just talking to an inference endpoint.
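To illustrate the shape of the pattern, here is a minimal RAG sketch with a toy in-memory vector store; the embed and generate functions are hypothetical placeholders for your embedding model and LLM endpoint, so this shows the idea rather than a production setup:

```python
import math

def embed(text: str) -> list[float]:
    """Hypothetical call to an embedding model or endpoint."""
    raise NotImplementedError

def generate(prompt: str) -> str:
    """Hypothetical call to the large language model."""
    raise NotImplementedError

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

class ToyVectorStore:
    def __init__(self):
        self.items = []  # (embedding, text) pairs

    def add(self, text: str) -> None:
        self.items.append((embed(text), text))

    def search(self, query: str, k: int = 3) -> list[str]:
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[0]), reverse=True)
        return [text for _, text in ranked[:k]]

def answer_with_rag(store: ToyVectorStore, question: str) -> str:
    # Retrieve the most relevant chunks and instruct the model to answer only from them.
    context = "\n\n".join(store.search(question))
    prompt = (
        "Answer the question using ONLY the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return generate(prompt)
```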

Retrieval Augmented Generation (RAG)

However, controlled retrieval isn't really enough, because RAG, first of all, is still limited to a single large language model that handles the extraction. And there is no ability to deviate, because essentially, as in the first approach, you have a prompt, you have a large language model, and you have an answer. And you have to hope that the results are fine, because that's the single source, right?

When you use a RAG approach, you still talk to that same LLM, but it's instructed to get its context from the RAG store. There's not much variability in this approach, so there's not much you can do beyond the capabilities of that specific large language model.

So you're back to using a larger model that has some baked-in capabilities, which introduces latency and cost. And it's one size fits all. That large language model has to fulfill all the needs of your application in that context.

Programmatic Approaches and Agents

So now what we're trying to do is go beyond that approach. And this is where we start thinking about programmatic approaches. Programmatic approaches mean that we're using code to build multiple implementations for different purposes within the generative AI application.

And there's a famous framework, LangChain. There are others, but LangChain is quite potent.


And using that framework to build your application gives you a few built-in capabilities. First of all, you don't necessarily have to handcraft prompts to interface with the large language model. You can use templates that are controlled in code and receive variables that you can then customize.
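For example, with LangChain you can define a prompt template in code and fill in the variables at runtime; a minimal sketch (the exact import path may vary between LangChain versions) could look like this:

```python
from langchain_core.prompts import PromptTemplate

# A reusable prompt controlled in code; the placeholders are filled at runtime.
template = PromptTemplate.from_template(
    "Summarize the following {document_type} in at most {max_sentences} sentences:\n\n{text}"
)

prompt = template.format(
    document_type="insurance policy",
    max_sentences=3,
    text="...",  # the document content goes here
)
# `prompt` is now a plain string that can be sent to whichever model or chain you use.
```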

You can implement query endpoints, which is really cool, because beyond RAG, where we basically have to use a vector database, with such a framework you can also implement traditional queries: you execute a SQL query against the enterprise database and integrate those results into the flow of your application.
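As a sketch of that idea, you can run an ordinary SQL query and fold the results into the prompt. This example uses Python's built-in sqlite3; the table, the column names and the generate() call are hypothetical:

```python
import sqlite3

def generate(prompt: str) -> str:
    """Hypothetical call to the large language model."""
    raise NotImplementedError

def answer_from_database(db_path: str, customer_id: int, question: str) -> str:
    # Traditional, deterministic query against the enterprise database...
    conn = sqlite3.connect(db_path)
    rows = conn.execute(
        "SELECT order_id, status, total FROM orders WHERE customer_id = ?",
        (customer_id,),
    ).fetchall()
    conn.close()

    # ...whose results are integrated into the generation flow.
    context = "\n".join(f"order {oid}: {status}, {total} EUR" for oid, status, total in rows)
    prompt = f"Using only this order data:\n{context}\n\nAnswer the question: {question}"
    return generate(prompt)
```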

And lastly, there's one bit that's quite interesting: agents. And this is what we're talking about today. Agents are a building block for massively more complex, but also more potent, applications with large language models at the core.

So what does that specifically mean, and what is an agent? An agent is a building block, a component of code, that uses an embedded large language model as its decision-making process.

So for all of those that are coders, we're used to implementing deterministic programs. So if this is the situation, then we're going to execute this. If that happens, we're going to do this, et cetera. So deterministic.

And with agents, we're flipping that approach on its head because we're using a large language model to decide which flow of execution the program will take. Sounds a bit scary, but it's actually pretty cool.

So what the agent actually does is say: I'm going to get an input, I'm going to parse the input, and then I will decide what to do. And the "what to do" is defined as a set of tools that we give the agent. The agent can have a calculator, for instance, or a query interface against a database.

So the interesting thing with this agent is that you can create a program that can actually do basic arithmetic. Of course, everybody knows how funny it was to ask the earlier versions of ChatGPT a question like "how many animals are 12 horses plus 5 squirrels?" and get some nonsensical answer. With an agent, the agent is capable of parsing the numbers in that query, passing them on to the calculator, which is a tool, computing the result, and giving you the correct answer: 17. That's really cool.

So how does that work? Well, to a certain extent it works a bit like a chatbot: it gets an input, a natural-language query from the user. Then it parses that query and dissects it into decisions, into a plan, a plan that will be executed using the different tools it has at its disposal.

If we go back to the example with the animals, it will determine that there are numbers that need to be added. So it will use the calculator, which is part of the plan, use a generation step as well to phrase the answer, and return the result of the executed plan.

So the interesting bit about an agent is that it is capable of deciding which snippet of code it will execute without having to be deterministically programmed to do it. It generates the most probable outcome from the query that it gets.
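Going back to the calculator example, a toy version of that loop might look like the sketch below; the decide() and generate_answer() functions are hypothetical stand-ins for the embedded large language model:

```python
import re

def decide(query: str) -> dict:
    """Hypothetical LLM call that returns a plan, e.g.
    {"tool": "calculator", "input": "12 + 5"}."""
    raise NotImplementedError

def generate_answer(query: str, tool_result: str) -> str:
    """Hypothetical LLM call that phrases the tool result as a natural-language answer."""
    raise NotImplementedError

def calculator(expression: str) -> str:
    # Tiny, restricted arithmetic evaluator, just for the demo.
    if not re.fullmatch(r"[\d\s+\-*/.]+", expression):
        raise ValueError("unsupported expression")
    return str(eval(expression))

TOOLS = {"calculator": calculator}

def run_agent(query: str) -> str:
    plan = decide(query)                          # the model chooses the flow of execution
    result = TOOLS[plan["tool"]](plan["input"])   # e.g. "12 + 5" -> "17"
    return generate_answer(query, result)         # e.g. "There are 17 animals."
```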

So this is now a very powerful aspect because you can chain these things into a workflow. So a chained execution of agents can create a complex workflow with different outcomes.

You can also have situations where outputs are supervised by other large language models. So you basically have a generation workflow that says: OK, this is the input, we're going to do some calculation, we're going to generate a text. And as part of that execution, you have a supervisory agent that validates whether the output complies with certain rules.

One example could be PII detection. You could have a generator that generates text, and another, supervisory agent that parses the generated text to validate whether it contains, A, leaked information or, B, personal information, and can then mask it away. So you have the ability to create flexible programs with multiple embedded large language models and different choices as well.
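A minimal sketch of that supervisory pattern could look like this; both agent calls are hypothetical placeholders, and a real implementation would of course need stricter validation than simple string replacement:

```python
def generate_text(request: str) -> str:
    """Hypothetical generator agent."""
    raise NotImplementedError

def detect_pii(text: str) -> list[str]:
    """Hypothetical supervisory agent: returns the substrings it flags as
    personal or leaked information (names, addresses, secrets, ...)."""
    raise NotImplementedError

def supervised_generation(request: str) -> str:
    draft = generate_text(request)
    for span in detect_pii(draft):
        draft = draft.replace(span, "[REDACTED]")  # mask flagged spans before returning
    return draft
```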

Use Cases of Agentic LLMs

So when we look at use cases, what can we do with this? One use case we've been working on at Hivemind is using agents to build an automated recruiting advisory tool. What it does is parse the CVs submitted for a role, summarize them, and match them against the role using the agent framework. This greatly reduces the overhead for recruiters.

If you're a recruiter and you have a role to fill and you get 800 applications, normally you just look at the first 50, pick the 10 best, and those are the ones you invite to interview. But those are not necessarily the best applicants out of the 800. With this tool, you parse all 800 applications, pick the best 50, and then choose the 10 you want to invite.

Another use case for agentic implementations is observation. And this is quite interesting, especially in industrial applications, where we're basically looking at IoT: you have a large data set of, let's say, time-based sensor data. If you're in the energy sector and you're observing a wind farm or a power plant, you're getting a lot of time-based data sets.

Well, one of the really big problems in that area is observation. You need a human being to look at the trends and anomalies and all of these aspects and to make decisions on the basis of what they see. You can, to a certain extent, use machine learning to do that, with models that will predict outcomes.

But one thing that you can't really automate that way is alerting and decision-making. And that's something you can do very well with generative AI and agentic large language model applications, because they can make decisions based on the data that's presented and then generate an email, call someone, create an alert, or do things that we would otherwise do manually, with less efficiency and a lot of night shifts.
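As a rough sketch of that idea, recent sensor readings can be folded into a prompt and the model asked whether to raise an alert. The decide_alert() and send_email() helpers are hypothetical, and in practice you would keep hard safety thresholds in plain code:

```python
def decide_alert(prompt: str) -> str:
    """Hypothetical LLM call returning either 'OK' or 'ALERT: <one-line reason>'."""
    raise NotImplementedError

def send_email(to: str, body: str) -> None:
    """Hypothetical notification helper."""
    print(f"email to {to}: {body}")

def check_turbine(readings: list[tuple[str, float]]) -> None:
    # readings: (timestamp, rotor vibration in mm/s), most recent last
    recent = "\n".join(f"{ts}: {value:.2f} mm/s" for ts, value in readings[-20:])
    verdict = decide_alert(
        "You monitor vibration data from a wind turbine. "
        "Given the latest readings, answer 'OK' or 'ALERT: <one-line reason>'.\n" + recent
    )
    if verdict.startswith("ALERT"):
        send_email("on-call@example.com", verdict)
```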

Flagging and marking is another one: essentially leveraging the model's ability to compare and to flag anomalous behavior and anomalous patterns in text. It also greatly decreases the cost compared to traditional machine learning, where you have to continuously train, retrain, and deploy your models.

Document parsing is another one that's kind of obvious. We've talked about that a lot. But in a lot of digital transformation projects, especially in businesses that deal with a lot of paper, take insurance, for instance, you need to parse policy documents and also transform them. Let's assume for a minute that you're an insurer and you've just merged with another one, and you want to create a common database, a common representation of all the policy documents and policy schedules that you've got.

Well, large language models are a much easier and very potent answer to that problem than outsourcing the work to clerical service companies. Another big one is at the cusp of computer vision, for instance in the energy sector, where you're reading meters: somebody takes a picture of a meter, and you want to extract the reading, evaluate the cost, and feed that into your systems. You can use large language models to do that and greatly simplify the process.

And lastly, in legal tech, a profession that is quite resistant to change, large language models can be used for everything that has to do with paralegal work: processing large quantities of documents and summarizing them, especially in the corporate legal field. It's a huge benefit for the profession to leverage large language models. And the same goes for advisory and co-piloting: depending on the profession and the legal field, looking at precedents and other cases is a quite potent approach to simplifying legal work, and that's where large language models can greatly help.

Conclusion

So thank you very much. This was an introduction to agents. I hope you now have a better understanding of what they are.

And I'm happy to take questions.
