OK, thanks for the presentation, Carmen.
So in this first, more theoretical talk, we are going to talk about bank reconciliation with Gen AI. I guess you all know what Gen AI is, right? We are at an AI meetup. Gen AI is just AI that can create content, whether it's text, speech, images, or videos.
But probably many of you don't know what bank reconciliation is. So let's start with a brief definition of this problem.
So, well, my name is Pablo. I work as a data engineer for Embat. It's a Spanish fintech.
And we've developed a software-as-a-service product for finance teams at medium and large companies, right? Embat automates different tasks for these finance teams: mainly payments, treasury and debt management, and accounting processes.
And today we are going to talk about those accounting processes, specifically bank reconciliation. Our customers, and any company in the world, need to reconcile their payments and collections in their ERPs, right?
And as you can imagine, this mapping is not a classical machine learning problem, because we have two different entities which can be related one-to-one, one-to-many, or one-to-zero. One-to-zero can be, for example, taxes or salary payments in your company, which don't have an invoice in your ERP. An example of one-to-many can be an invoice where half is paid when it's issued and the other half 60 days later, or whenever.
And what I mentioned about the data structures is really important, right? We have a lot of information coming from the ERPs on the invoice side. Basically, you know who the beneficiary is, you have the document ID, the payment method.
On the other hand, for the transactions, or at least the information you can get under the PSD2 regulation, you have really few data points. Basically, you have the amount, the date of the transaction, and the description.
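To make that asymmetry concrete, here is a minimal sketch of the two entities. The field names are my own illustration, not Embat's actual schema:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Invoice:
    # Rich, structured data coming from the ERP
    document_id: str
    beneficiary: str
    payment_method: str
    amount: float
    issue_date: date

@dataclass
class BankTransaction:
    # The few data points PSD2 gives you; everything else is
    # buried in the free-text description
    amount: float
    booking_date: date
    description: str
```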
So the problem we face here, as I mentioned, is not a classic machine learning problem, and we need to find other ways to tackle it.
So yes, the main problem we have, and the one we solve with Gen AI, is structuring the few data points we have in the bank transactions. You can imagine that in your bank transaction's description you can have a lot of information, but no structure, right?
You can have, for example, the beneficiary written down, because when you make that payment in your bank's web interface, you just type it. Or you can have the payment method, because many banks include it. And you can also type in the document ID. But this is all just one long text, and you cannot infer a simple structure to get all these data points out.
Basically, the problem we face is structuring this data, so we need to use language processing models.
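Concretely, the task looks something like this (a hypothetical description; the exact format varies by bank):

```python
description = "TRF 12/03 ACME LOGISTICS SL FACT INV-2024-0042 SEPA"

# What we want to pull out of that free text:
structured = {
    "beneficiary": "ACME LOGISTICS SL",
    "document_id": "INV-2024-0042",
    "payment_method": "SEPA transfer",
}
```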
When I say language processing models, you might think of NER, Named Entity Recognition models. And we could use those; in principle there's no need to use LLMs.
But, well, there are a couple of problems. The first of them is that this kind of text is special because it doesn't have a fixed structure; it can depend on the bank.
Actually, we've tested different pre-trained NER models, but they don't work really well. You need a lot of fine-tuning, and you don't achieve really good results; at least we couldn't achieve really good results with them.
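As an illustration of that kind of experiment, here is a minimal sketch using spaCy as the pre-trained NER library (the talk doesn't name the specific models we tried):

```python
import spacy

# A general-purpose pre-trained pipeline; bank descriptions are far
# from the news-style text these models were trained on
nlp = spacy.load("en_core_web_sm")

doc = nlp("TRF 12/03 ACME LOGISTICS SL FACT INV-2024-0042 SEPA")
for ent in doc.ents:
    print(ent.text, ent.label_)
# Out of the box this tends to miss or mislabel the beneficiary and
# document ID, which is why heavy fine-tuning would be needed.
```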
Also, as I mentioned at the beginning, we are a startup. We've been working for two years, and when we started working on this, the data team was one person: me. Now it's a bit bigger.
But we didn't have the capacity to train an in-house model, fine-tuning these NER models and so on. And LLMs are basically plug and play. I say basically; well, there's a lot of work behind it.
And also, when we started working on this, I think it was Q2 last year, the solution we have right now didn't exist. So we iterated through a lot of different solutions until we got here.
We work with Google's PaLM 2 and with batch requests to the model, which went live around Q3 or Q4 last year. That's the main reason we chose an LLM: to structure this data in the transactions. And as I said, we work with Google's PaLM 2 specifically.
There are many reasons. The first of them is that we have all our infrastructure in Google Cloud, but the solution could also work with ChatGPT. Actually, we did a proof of concept with it, and it worked pretty similarly.
Well, Google also has a program for startups that comes with credits, so it's really cheap for us to use PaLM 2 instead of other models. And the main reason is compliance.
As I mentioned, we have all our infrastructure in Google Cloud, so our data doesn't leave our virtual private cloud; it's all within our project. Working with financial data, that's super important.
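As a rough sketch of what those batch requests to PaLM 2 on Vertex AI can look like, assuming the Vertex AI SDK's batch prediction interface and hypothetical bucket paths (the details of our actual pipeline are not in this talk):

```python
from vertexai.preview.language_models import TextGenerationModel

# PaLM 2 for text on Vertex AI
model = TextGenerationModel.from_pretrained("text-bison@001")

# One prompt per transaction, prepared as a JSONL file in Cloud Storage
batch_job = model.batch_predict(
    dataset="gs://my-bucket/transaction-prompts.jsonl",    # hypothetical path
    destination_uri_prefix="gs://my-bucket/predictions",   # hypothetical path
    model_parameters={"temperature": 0.2},
)
```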
So yes, I mentioned that this is not a classic machine learning problem. But it's also not just sending the data to PaLM and saying, OK, match these transactions to operations for me, right? There's some work behind it, obviously.
Here are some results we've had. Actually, we're close to processing one million transactions a month, so we need a model that can scale. And the most important part, getting into the details of how we built the solution, is that we need to be able to process the responses at that volume.
So imagine asking a simple question I was thinking of before. If you ask ChatGPT or PaLM or whatever model for three attributes of different cities in the world: please, ChatGPT, tell me the population, average temperature, and number of sunny days in a year for Madrid. A fair answer could be a long paragraph: the population of Madrid is whatever, the average temperature is such-and-such, and so on. Then you ask the same question, with the same format, for London, and it gives you the temperature first, then the population, then the rest in some other order. You cannot process that data, because it doesn't have a structure. As I mentioned, our problem is that we have unstructured data and we want to structure it; if the response is not structured, we are back at the same point.
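In other words, with invented answers just for illustration:

```python
# Two answers to the same question, each fine as prose,
# each in a different shape:
madrid = "Madrid has a population of X; its average temperature is Y..."
london = "The average temperature in London is Y, and X people live there..."

# No single parsing rule recovers (population, temperature, sunny days)
# from both strings. What we actually want back is a fixed structure:
wanted = {"population": ..., "avg_temperature": ..., "sunny_days": ...}
```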
So here comes LangChain. This is your best friend if you're looking into prompt engineering.
The official definition is the one in point one: if you go to the LangChain website, it says it's a framework for developing applications powered by language models.
Basically, we benefit from it at three key points: points two, three, and four. They have prompt templates depending on the language model you are using, right? Because if you ask the same question or prompt to different language models, they will respond differently. So they've already worked out templates that work better for each model.
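A minimal example of such a prompt template, using the classic pre-1.0 LangChain API (the prompt wording is my own illustration):

```python
from langchain.prompts import PromptTemplate

template = PromptTemplate(
    input_variables=["description"],
    template=(
        "Extract the beneficiary, payment method and document ID "
        "from this bank transaction description:\n{description}"
    ),
)

prompt = template.format(description="TRF ACME LOGISTICS SL INV-2024-0042")
```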
They also have a lot of tools for data input and output. What I was mentioning before about the cities: with LangChain you can specify that your response should be a Python list, or you can translate that to JavaScript, Go, whatever you prefer, and so on. And the same for input, right?
You can pass the model embeddings, arrays, dictionaries, files like CSV files, whatever. LangChain is super useful for that. And finally, it also has tools for managing your pipeline, right?
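Going back to the cities example, the output side works roughly like this (again the classic LangChain API; the schema is my own illustration):

```python
from langchain.output_parsers import ResponseSchema, StructuredOutputParser

parser = StructuredOutputParser.from_response_schemas([
    ResponseSchema(name="population", description="population of the city"),
    ResponseSchema(name="avg_temperature", description="average temperature in Celsius"),
    ResponseSchema(name="sunny_days", description="sunny days per year"),
])

# Appended to the prompt so the model answers in a fixed JSON format
format_instructions = parser.get_format_instructions()

# The raw model response is then parsed back into a Python dict:
# result = parser.parse(llm_response)
```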
So with this solution, as I mentioned before, you can get 90% of responses, but you need to treat the data, and at the end, in our case, we end up in the 40 to 70 percent range. There are a lot of processes behind it, and in some cases you need to call the model twice for the same transaction: the first call narrows down the solution, or gives three or four options, and then you do a second call to decide which of them is the best one.
So it has all the tools to create your data pipeline: to call different models, get your data from your database, and then store it somewhere else.
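A simplified sketch of that two-call pattern with the classic LangChain API; the prompts and the exact flow are my assumptions, not our production pipeline:

```python
from langchain.chains import LLMChain, SimpleSequentialChain
from langchain.llms import VertexAI
from langchain.prompts import PromptTemplate

llm = VertexAI(model_name="text-bison@001")  # PaLM 2 on Vertex AI

# First call: propose a few candidate invoices for a transaction
propose = LLMChain(
    llm=llm,
    prompt=PromptTemplate(
        input_variables=["transaction"],
        template=("Given this bank transaction, list up to 4 candidate "
                  "invoice IDs with a short reason for each:\n{transaction}"),
    ),
)

# Second call: pick the best candidate from the first call's output
decide = LLMChain(
    llm=llm,
    prompt=PromptTemplate(
        input_variables=["candidates"],
        template=("From these candidates, return only the single best "
                  "invoice ID:\n{candidates}"),
    ),
)

pipeline = SimpleSequentialChain(chains=[propose, decide])
# best_match = pipeline.run("2024-03-12  -1250.00 EUR  TRF ACME ...")
```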
This is basically how we productionized the solution and how we got to improving our automated accounting by 50%.
LangChain also has other tools, for serving the solution as a REST API and so on. We don't use that part, but you can find the information on their website. They have a bunch of tools.
Actually, when we started working with it, they just had these template things and the data input and output tools. The other day, while working on the slides, I went to the website and it was like, wow, this is a whole different tool. So, yes, I think we are on time. Thank you very much; we have five minutes for Q&A.