Yeah, I guess first I should introduce myself, because I guess it's really important: I'm a mathematician, I've got two PhDs, one in mathematics and a second in applied mathematics, and about nine years ago I started working on AI applications, implementing AI, so I'm an AI programmer. I've worked with companies like CaixaBank, Banco Sabadell, and other big companies here in Spain, and during the last six or seven years I've collaborated with the huge research community here in Catalonia to develop the Catalan large language models, which actually run here at the Barcelona Supercomputing Center. I don't know if you have been there; it's a great place, I'd really recommend it, they have free excursions there. About 1,400 people work there, and the goal is to create a Catalan-speaking language model. So I'm a person who specializes in natural language processing; that's what we call the field of models that can speak. We do natural language processing, which was born well before large language models.
Those were the techniques we used before. But with the rise of generative AI, it's now all about language models: how they work and how we can use them. And we see that they are really quite powerful and we can do a lot of stuff with them. I'm really grateful for the two presentations we listened to before; they gave us some background.
And I guess I'm not the best speaker, because, honestly, a technical person cannot do that, but I do have answers to the questions you asked before, at least on the technical side: how it works and how it protects your data. So if you would like to repeat your questions after the talk, I'm really excited to answer them. And I would like to speak about the validation of AI agents, which is actually the main concern: how can we validate that an agent processes the data correctly, and that the answer is trustworthy and simply correct, that nothing is omitted and nothing is invented, and so on.
So first: we have 15 minutes, and I could talk about this for hours. It's a huge topic, yeah, it's a huge topic, so we'll try to focus on generative AI, and I would like to give you just the main idea of how large language models work.
The idea is that they are just a kind of probabilistic tool for predicting the next word, nothing more than that. There are different classes of large language models, and the base language model is not the same thing as ChatGPT: ChatGPT is powered by a language model, the base model, whose only task in life, the only thing it has learned to do, is to generate the next word, better or worse. That's why it's probabilistic. It's somewhat like your phone's text suggestions: when you're writing something, three suggested words appear above the keyboard, and you can select one of them or type whatever you want. And when you see that a chat can be precise, balanced, or creative, that setting is exactly about which word it will normally choose: in the middle it's balanced, and at the extremes it always leans one way or the other. That is the only difference between these modes. But the main idea remains the same: it just probabilistically selects the next word. So you should not expect anything more than that from the base model. It's a mathematical, probabilistic tool, which obviously can generate whatever comes out, and because of the randomness, you can ask ten times and get ten slightly different answers.
And this is not a bug, and it's not some early-days limitation where we should expect better performance in the future; it is, as we say, the architecture of these models. They will not get better at this, because it's their nature: they do it by nature and they will always do it by nature. You can look at the paper called "Impossibility of Automated Hallucination Detection", which shows that you cannot create an agent that automatically checks itself for wrong answers. It's scientifically proven that this is not possible. And we have already seen a lot of harm, and there's a lot we didn't see, because companies prefer not to share their AI failures. For example, there was big news about a government that had to resign because they put an unverified AI model into production, and it was a huge scandal. So it's a real problem.
So, just so you know how we can basically make a language model work better for us: ChatGPT, which is powered by a language model, tries to be a chat assistant that can speak to you, that can remember the conversation and some data from your conversation; it doesn't only generate words, it maintains a conversation, which is slightly different. You also have other models that do excellent math, or that are very good at problems on the human genome. So there are different, task-oriented AIs. How do they achieve high performance in these particular applications?
We basically have three methods here. The main language model comes from what we call pre-training: you go to a huge supercomputing center, you take data from all over the world, and you get your huge model. This part we don't touch, because very few centers can really produce large language models, and usually a company has no realistic way to create its own, because the amount of data that any company, even the biggest, can have is really small compared with the data needed for pre-training. The data needed for pre-training is essentially the whole internet: you take all of the internet, you put it into the model, and that is the pre-trained model. After pre-training, the model can speak and can generate relevant text. Supervised fine-tuning and reinforcement learning are the two things normally done after you have the base model, to get better performance.
And it's really, really similar to school; we can compare it with school. The pre-trained model is what we call a stochastic parrot: you have a parrot, and it can just repeat what it heard from you. That is what it learned from the internet; it can repeat it in a way that somehow seems like conscious speech. Then supervised fine-tuning is when you have a teacher who explains the solution to you, and you learn the method, how to act in different situations. Afterwards, if someone asks you to solve a mathematical problem and the method was explained to you before, you will know how to do it. That is called supervised fine-tuning, and it's all guided learning. This is what OpenAI actually does: they have a lot of specialists who demonstrate the methods for doing different things, so that the model can learn methods no one ever wrote down explicitly. How do you pour water into a glass? It's a physical action we perform because we somehow know how from experience, but that experience is not explicit in the data, so they need someone to spell it out: you take the glass, then you take a second glass, which is empty, and so on. So that is roughly how it works.
Then you have reinforcement learning, and it's cool. Same example: you have a lot of tasks, you really have a book of exercises, and at the end of the book you have the answers. You don't know the method, but you have the task and the answer. You apply your own method, you get an answer, then you go to the end of the book and compare your answer with the correct one. That is how you improve your methods by yourself, without needing a teacher: you can create a method, or adjust your method, based on the answer. It's called reinforcement learning, and it's a method the Chinese models use heavily compared with OpenAI, because having a human explain the method is very expensive, while reinforcement learning is very cheap. That is the difference between the approaches, and that is why we have different chats, DeepSeek and OpenAI's chat, with different performance and different fine-tuning costs: basically they use different methods to get better performance out of the chat itself.
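The "answers at the back of the book" idea can be sketched very loosely like this (purely illustrative: the exercises and candidate methods are invented, and real RL training of language models is far more involved). The learner is never shown a method, only graded against known answers, and it keeps whichever candidate scores best.

```python
# Exercises with answers at the back of the book: here, "double the input".
EXERCISES = [(1, 2), (3, 6), (5, 10), (7, 14)]

# Candidate "methods" the learner can try; no teacher says which one is right.
CANDIDATE_METHODS = {
    "add_one": lambda x: x + 1,
    "double": lambda x: x * 2,
    "square": lambda x: x * x,
}

def score(method):
    """Reward = fraction of exercises where our answer matches the book's."""
    return sum(method(q) == a for q, a in EXERCISES) / len(EXERCISES)

def learn_from_answers():
    """Keep the candidate method with the highest reward: a trivial
    stand-in for policy improvement in real reinforcement learning."""
    return max(CANDIDATE_METHODS, key=lambda name: score(CANDIDATE_METHODS[name]))

best = learn_from_answers()
print(best, score(CANDIDATE_METHODS[best]))  # "double" scores 1.0
```

The teacher-written demonstrations of supervised fine-tuning are replaced by a cheap automatic check against the answer key, which is the cost difference the talk is pointing at.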
Why am I explaining all of that? We basically don't have access to any of this; we just use whatever model we prefer, ChatGPT or Claude; I'll use Claude for the example. The main difference now is that we can actually influence the performance of the model, and that comes after all this model mastery: you take the model and then you can give it tools. What do I mean by giving it tools? It's the famous Model Context Protocol that was introduced before. It sounds like a difficult concept, but it's really easy to understand, because the models communicate with us using prompts: we give instructions and they give us answers.
I have five minutes left.
So I'll go through it very fast.
You have this interface to the models where you can only give instructions, you can only write text, and it's really, really difficult to write those texts correctly and in a standardized way such that the model always gives you a coherent answer. So Anthropic introduced, and OpenAI adopted, the Model Context Protocol: a standardized protocol for communicating with language models that always gives you answers in the same shape. With this protocol it's really easy, and really robust, to plug in your documents, your Notion, your Gmail, or whatever, because it's not a raw API call, and it's not a database connection through SQL, a complex language; it's just a protocol, kind of like a prompt, but standardized. And the models we have now from OpenAI and Anthropic, and I guess all the current models, are trained to use tools, as in all the previous examples we saw; they have it inside their capabilities, they know how to use this protocol to call different tools, you just need to specify the tools for them. Okay, it's great, and about the previous example my only comment is: it looks great, you should use it, you should try Claude. The thing is that when you say "send the email", you have zero control over what is really going on. It's a completely autonomous agent that can use the tools and these protocols, but you basically understand nothing; it's a black box for you.
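For concreteness, tool calling looks roughly like this (a simplified stand-in assuming a toy JSON format; this is not the real MCP wire format, and the tool names are made up). The model emits a standardized, structured call instead of free-form text, and the host looks the tool up and runs it.

```python
import json

# A tiny tool registry. Real MCP servers describe tools with JSON schemas;
# the names and shapes below are simplified stand-ins.
TOOLS = {
    "send_email": {
        "description": "Send an email to a recipient.",
        "handler": lambda args: f"sent to {args['to']}: {args['subject']}",
    },
    "read_document": {
        "description": "Read a document by name.",
        "handler": lambda args: f"contents of {args['name']}",
    },
}

def handle_tool_call(message: str) -> str:
    """Parse the model's structured call (here: JSON), look up the named
    tool in the registry, and run its handler with the given arguments."""
    call = json.loads(message)
    tool = TOOLS[call["tool"]]
    return tool["handler"](call["arguments"])

# What a tool-trained model might emit for "email Anna about the meeting":
model_output = '{"tool": "send_email", "arguments": {"to": "anna@example.com", "subject": "meeting"}}'
print(handle_tool_call(model_output))  # sent to anna@example.com: meeting
```

Because the call is standardized rather than free text, the host always knows exactly which tool ran with which arguments, which is what makes the protocol robust.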
How can you affect that? You can create your own agent, or your own workflow, which uses this automation or this language-model call at the moments where you decide. In this case you have some question; the agent can understand the question, can call the database or do whatever you want, and then structure the information and generate your answer.
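Such a flow can be sketched as code like this (all the component names and the toy database are invented; the platforms draw the same thing as boxes and arrows). Each step is a separate component, so its input and output can be inspected at any point.

```python
def understand_question(question: str) -> str:
    """Hypothetical component: extract what the user is asking about."""
    return question.lower().rstrip("?")

def call_database(topic: str) -> list:
    """Hypothetical component: look the topic up in a toy 'database'."""
    db = {"office hours": ["Mon 9-12", "Thu 14-17"]}
    return db.get(topic, [])

def generate_answer(records: list) -> str:
    """Hypothetical component: turn the records into an answer."""
    return "Our office hours: " + ", ".join(records) if records else "No data found."

def run_flow(question: str) -> dict:
    """Run the components in order, keeping every intermediate value
    so each point of the flow can be checked later."""
    topic = understand_question(question)
    records = call_database(topic)
    answer = generate_answer(records)
    return {"topic": topic, "records": records, "answer": answer}

result = run_flow("Office hours?")
print(result["answer"])  # Our office hours: Mon 9-12, Thu 14-17
```

Compare this with "send the email" on an autonomous agent: here every intermediate value is yours to look at.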
What is the difference? The difference is that this flow is yours, and you can get the input and the output of each component the flow contains. This way you can control each point of the model's reasoning, and I guess this is the main clue to validating agents, because here you can actually validate what is going on. I'm going to show you really quickly what it looks like. You can find different platforms where you can do this really easily, where you can create your agents really, really fast, with no coding expertise at all, just with a drag-and-drop interface with different components: some components are plain automation, and some components are AI calls with tools. So if you need to generate an email, you use any of these base models I mentioned, which can use tools and can generate whatever you need; you select any model you want, you give it the system prompt it will use, and then you're good to go, you have your flow. Then the important stuff, probably two things. First, you can validate it automatically: you can provide your validation set, saying "here is the set of questions it has to answer like this, please validate it for me", and this can always be validated automatically; there are tools for that, and metrics, and you can get those metrics. Second, you can monitor it. What I mean by monitoring is that you can get the history of how it was working, and then you have the traces. Is that what I showed you?
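A minimal sketch of what such automatic validation does under the hood (the agent function and the question set here are hypothetical; real platforms ship this machinery): run the agent over a fixed set of questions with expected answers and report a simple accuracy metric.

```python
def toy_agent(question: str) -> str:
    """Hypothetical stand-in for the workflow's language-model call."""
    canned = {"capital of France?": "Paris", "2+2?": "4"}
    return canned.get(question, "I don't know")

def validate(agent, validation_set):
    """Run the agent over (question, expected answer) pairs and
    compute a simple accuracy metric, one row per question."""
    results = []
    for question, expected in validation_set:
        answer = agent(question)
        results.append((question, answer, answer == expected))
    accuracy = sum(ok for _, _, ok in results) / len(results)
    return accuracy, results

accuracy, results = validate(toy_agent, [
    ("capital of France?", "Paris"),
    ("2+2?", "4"),
    ("capital of Spain?", "Madrid"),
])
print(round(accuracy, 2))  # 0.67: two of the three questions answered correctly
```

The per-question rows are exactly the kind of record a platform's trace view shows you: for each component, what went in and what came out.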
You have access to the input and the output of each component, and then you have the main component, which you probably need to know about for validation: the human in the loop. That's the concept. Everything I'm saying, all the concepts I'm introducing to you, is the current vocabulary used on any of these platforms.
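The human-in-the-loop box described next can be sketched as a simple gate (the quality score and the threshold here are invented stand-ins; on a real platform you configure this graphically rather than in code).

```python
REVIEW_QUEUE = []  # drafts waiting for a human to validate
SENT = []          # emails that went out automatically

def quality_score(email: str) -> float:
    """Invented stand-in confidence check: 'has a greeting and is long enough'."""
    return 1.0 if email.startswith("Dear") and len(email) > 20 else 0.4

def human_in_the_loop(email: str, threshold: float = 0.8) -> str:
    """Send automatically only when quality is high enough;
    otherwise stop the process and route the draft to a human."""
    if quality_score(email) >= threshold:
        SENT.append(email)
        return "sent"
    REVIEW_QUEUE.append(email)
    return "waiting for human review"

print(human_in_the_loop("Dear Anna, the report is attached. Best, Bob"))  # sent
print(human_in_the_loop("yo"))  # low quality: goes to the review queue
```

The point is the branch: below the threshold, nothing leaves your account until a person has looked at it.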
So the human in the loop is the box that allows you to stop the process and go to a human for validation if the agent is not sure that the quality is sufficient to send the email or process your data automatically. How does it work in practice? You generate some email; if it's not a sufficiently good email, it goes to your inbox, it shows you there that you have to validate it, and then the email will be sent from your account. And I'm out of time, yeah, I'm finished.
Thank you.