LLMs Beyond OpenAI - Diego Perez

Introduction

So hello, everyone. Thank you for coming. My name is Diego Perez, as Carmen said. First of all, thank you to MindStone for having us here today.

To kick off, I want to do a couple of questions to see what the audience feels. I want to know how many of you have heard before of Clarity AI?

Oh. More than what I expected.

And second, how many of you are familiar with rack systems in LLMs? Okay, cool.

So, as Carmen was saying, I'm Diego Perez. I'm lead machine learning engineer at Clarity AI. I'm leading the efforts in generative AI or everything that has to do with customer-facing generative AI.

About Clarity AI

So we are... I'm going to be technical, but first, as many of you don't know anything about Clarity, I think it's interesting to have a little bit of context on what Clarity is and what we do to understand what technical challenges are we solving. Then very quickly, what do we do with AI at Clarity AI? And then we are going to go technical with the generative AI deep dive.

Company Overview

So first, intro to Clarity AI. We are a sustainability tech startup, startup or maybe scale up already. We are 300 people with offices in Madrid, New York, London, Paris, Dubai. So pretty much people all around the globe.

Remote first company with, I think I said it, more than 300 employees. And what we do is basically sustainability tech for all segments.

Mission and Services

First we started with investors and we want to get the societal point of view to the investment markets, to investors. That means funds, any asset manager. we'll be able to see another framework beyond financial framework to understand how the companies are behaving on the societal point of view.

Then also for companies and even lately launching for customers and users, so you would be able to see the carbon footprint of whatever you're buying and so on and so forth. So that's about clarity.

AI and Machine Learning at Clarity AI

And we're going to be talking about generative AI today, but we do a lot more of AI. So first, I will start with, we have like this division, quant, NLP, and then gen AI.

Quantitative Models

On quant, we do kind of classical machine learning with reliability models in the sustainability business. As you can understand, the data is very spread out because everything are estimations. even when you see these are that many tons of CO2 one company is emitting we have many providers and there are orders of magnitude even difference between one provider and the other so we have the reliability models to get the most reliable data we have estimation models whenever the data is not there we go and estimate and causal inference

Natural Language Processing

On the NLP side, we've been doing NLP for seven years already, so we are not new with this wave of LLMs. And we were using LLMs well before OpenAI kicked off or was mainstream. We do one metric extraction. That means we get PDFs, reports from companies that are publicly available. And with LLMs, we extract data from there.

like tons of CO2 that the company is emitting, we do corporate sustainability, report parsing and also news engine and more importantly on the GNI side, we use generative LLMs to leverage all the data that we have extracted with LLMs, with data providers, with our news engines and we use this as an interface to interact with all the clarity knowledge. So very quickly this is what I meant when I was saying extracting data with LLMs you have a report we go there and using a NLP we are able to extract and even you know get the banding boxes or where the data or the evidence was found on the other side on LLMs we have been using BERT models for long like LLMs to detect whether a company is committing to a policy or not and also to detect how the news are talking about the company meaning like if you know volkswagen diesel gate or facebook and bridge analytica whenever the news came up we are able to detect this in real time and assign a severity to this controversy and the end users will see this i'm going to the generative ai um i guess uh we were uh

Generative AI and User Interface

trying to improve the workflow. The idea is that we have many different clients, many different, a wide variety of end users that can go from investors to end user. So we were planning to have this kind of everything interface to access all of the Clarity's knowledge. And this will be, you know, will enable us to clarify the concepts, interpret data, condense information, so on and so forth.

Introduction to Rack Systems

So I'm going to go directly to a brief intro to rack systems for those of you that are not familiar with what a rack system means. So a rack system, this is something I got from Deep Learning AI to give you an overview. But this is what a rack system means.

What is a Large Language Model?

We are talking about LLMs. An LLM is a large language model. More precisely, we are talking about generative LLMs, not encoders.

Challenges with Generative LLMs

And this kind of chat GPT style models have mainly two problems, right? The first one is hallucinations, they make up information as if it was true, but it isn't true. And the second one is timeliness, because the model was trained on February 2023, or September, or whenever. So we want to have information up to date, and we want to have factual information.

Implementing the RAC System

RAC system is you have the private knowledge base, so all the knowledge from a company, in this case Clarity AI's knowledge, what we do is put this knowledge into a way that the LLM can consume it and this means for the unstructured data we are using an embedding vector database. And for the structured data, we are using APIs to access the data. So we are giving this piece to the LLM.

So we have this embedded model. We store all the information in a vector database, and then we feed this into LLM. So as you can see here, the first step is you have all the information. You split it. You put it machine-readable and ready for LLMs.

and you do all this splitting and store this information and we can retrieve that and then whenever a user enters a query or chats with the chatbot we go to this storage and we retrieve the most relevant pieces of information that are relevant to the query and we feed that to the LLM to answer the query just to give you a little bit of the problems that this can bring and it's not as easy as it seems it's like for example if you have this retrieval and it's a random example that you are going to retrieve information about a mushroom when you are doing a recipe and if you ask this question and you get the most similar ones you would get these two pieces but you are leaving behind the one that is saying that it's poisonous right so you want to ensure that the info the not only the relevant information that are similar to the query but you want to maximize the entropy so you make sure that the information that you need is there.

The Generative AI Platform

So with that we've built a generative AI platform that is able to give support to all of the Clarity AI's generative AI use cases and it's not only the chatbot but also automatic report generation. For example, client needs to fill up a usual report to comply with some legislation, so we do this automatically with LLMs.

Assistant Orchestrator Engine

and what we've built very high level is we have this assistant orchestrator engine that is able to determine which the appropriate agent to call and it's kind of a mixture of experts so we get the external input to the orchestrator and it will decide where to go right and this is This has three main advantages for us and its specialization. First of all, we can define pretty well experts that can be fine-tuned for a specific domain. What an expert means?

Integration and Scalability

An expert is basically a combination of a LLM, specific prompt, specific set of tools, and documents that it can go for to retrieve information. This gives us flexibility to integrate new experts and scale up horizontally with more and more experts. And it's scalable by definition, right?

System Workflow

So this is how it looks like more in depth. and the flow goes as follows, right? So a user go to the Clarity web app and we will see it in just a moment in live demo.

We have an API and we have this assistant manager or the orchestrator, which basically is not deciding which expert to call. What's doing is polymorphism. It understand which is the more suitable scheme that he has to put on and it will go and you know play the role of an ESG risk expert climate and so on and so forth we have many experts and we have split it in generation experts and it's this is kind of if you were expert in ESG risk so ESG means environmental social and government and governance in the sustainability business

Query Processing and Expert Systems

So we have well-defined, as if you were talking to an expert, that will fit your needs. We have the unstructured piece that was the rag that I was showing you before, and then the retrieval experts that are the structure for the rag. And what we are doing is we get the query. We decide which expert to pick or play the role.

We do the planning of the query. and then get the information needed and build up the answer. And this is the more in-depth flow of the whole system.

Guardrails and Query Writer

And as I was saying, there's the query, and there is something important that I haven't mentioned before, and it's the guardrails. if you are trying to fool us to, you know, playing like you are an expert in, you know, telling the LLM, you are an expert in Y or Z and, you know, get information that is out of your... out of your zone right so if you can access this kind of information within the clarity ai platform we have these rails to if it's not compliant go directly to the end and if it's compliant um we go to a query writer

And what this does is, usually as users we are lazy, right? So we quickly write a query and we don't want to be very specific. So what we are doing is learning from all the queries and understand what the user wants when the query is really lazy. And we rewrite this query to pick the expert.

Query Planning and Response Generation

Then we do the polymorphism, we put the hat of the expert, and go to a query planner that what it's doing is planning all the tasks that it has to do to answer the question I may have to go to a document retrieve the information from this PDF that I was talking before and I may have to go to an internal endpoint to retrieve data so we do all of this in a loop and go to monitoring and then when we have all the information needed we go to the response generator

Live Demonstration

And with that, I will show you this in live because we are kind of running out of time. So hopefully you see, OK, this is the Clarity platform in production. And this is kind of what we've done. This is doing everything that I just showed you under the scene.

Platform Interface and Query Understanding

First, very silly question, what's doing under the scenes is understanding what the user wants and then plan the query and with that the task that it has to do is go to the endpoint that is giving you the overall score that is the one that you can see right here and this is not really useful but you can go to specific company and say is the company this one So for this, we have to go and I put this query, you know, deliberately because it's a complex one.

We have to go to the annual report. We have to go to the report that is publicly available in this company's web page. And we have to go to Clarity's API to get the data. And what it's doing is answering you after retrieving the data from the company publicly available report that you can just go here.

and it will download it for you so you can go check the information that we just mentioned there and also with the clarity API to get this data and also This one I just made up. Probably, I don't know if it will work or not.

But yeah, basically it's going to the, as I was showing here, what's doing is understanding the query is under the policies. We rewrite the query, we go to the polymorphism, understand that is the ESG risk or the climate that has to understand. we in this planner go to the vector database with all the documents splitted already and retrieve the information and go to the end user.

Conclusion and Q&A

And with that, do you have any questions? I guess we are on time, right? Cool.