Hello everyone, my name is Kyle Burke.
I'm head of data and AI at a company called Bondex.
We're building social economic networks for the crypto space.
I worked in crypto in 2017 for a company called ConsenSys,
which means I love working on the bleeding edge of technology.
And so now I find myself building AI agents and AI systems.
And so, Langchain is something that we use at our organization.
All the data and AI members are quite familiar with it.
I think it's a super powerful tool in open source technology that really helps you work with generative AI, large language models.
And we're going to go through a really practical example of what we can do with it.
So the example that I'm going to provide is a home maintenance assistant.
So imagine in a not too distant future, you know, we have an agent that helps us manage the home.
And the reason I came up with this is because I recently bought an air purifier and it came with a manual.
i was like man who wants to read manuals these days um i did read it but then i also you know
as you'll see did some interesting things with it so uh first why are we talking about lang chain
why is it so special i think it's really interesting because it creates modularized
components that you can use and reuse in your general ai setups so for example if we wanted
to switch between and if i need to make this bigger please just let me know we can easily
switch between an open AI model to a Gemini model to an anthropic model and
it's just one line of code so I'm sure you guys have seen Gemini just released
I believe it was three recently that has a lot of improvements you know like
super high output scores if I wanted to switch from you know GPT -4 to one of the
Gemini models all I need to do is change one line of code and everything else
works that's incredibly powerful and it's not something that you get if
if you're working with OpenAI API directly.
The same thing could be said for vector stores,
which we'll get into a little bit later.
FAISS is something that works locally,
so it'll be running on my computer.
Pinecone is something that's on the cloud.
So as you shift over and your code base gets more mature
and you want to be able to serve in a production capacity,
you might change the vector store
and you know that it works
because all of the helper functions and modules
work the same across many different vector stores.
So, I'm actually going to go through this thing live, and this is a very engineering
way to do it, and that's because that's the type of person that I am.
But imagine that most of the time when you see text on the screen, this is you interacting
with a chat bot, and there's a UI, everyone's familiar with chat GPT, but it won't exactly
look that way here.
And I'm going to try and do all this live, so we'll see.
sometimes there's hallucinations and things break but like I don't know I
find that kind of exciting and fun so I'm going to restart the kernel and we
are going to run this live so I'm just doing a bunch of imports here this is
how technology works behind the scenes this is the manual to my air purifier
we're going to load it into this notebook and I'm going to print it out
so you can see that yes the text from this thing is actually the text of the manual and then we'll
be able to use it later on hopefully as these cells run they don't take too long but if they
start to then we could just switch to something else so lang chain in its simplest form like
what does it do it's really just a way that you can prompt generative ai systems so here's an
example where we're just setting a system prompt today is the day's date you're a helpful
maintenance assistance and I need you to read appliance manuals extract the
maintenance tasks from them and then help me maintain a schedule for keeping
up to date with those things here would be the message that I'm sending to the
generative AI system in the little chat window hey here's the manual for this
appliance manual and then the manual text so line chain is really cool
because you can use variables right so like imagine it wasn't an air purifier
fire and it was my HVAC system and I just dropped the manual text in there it would be able to
function in the same sort of way right so I'm defining the the message where I'm setting today's
date I'm stating what the appliance is and then I'm putting the manual text and again the manual
text we created up here here's the first 500 characters of it so if I were to run this
I'm actually you know live hitting the LLM model right here which is GPT -4 and hopefully within a
couple of seconds I'll get a response back where it tells me what are the maintenance steps that
are required of the system and judging by how long this is taking we may need to skip a couple of
these so here it gives me like a nice little printout of you know here's your maintenance
tasks I'm gonna expand this so that it's a little bit bigger gives me four things that I need to do
here's your maintenance schedule you know over the next few months of things
that you need to do here's some tips that go along with it pros and cons like
this thing is actually like quite nice it's super low code it took like 15
lines of codes I get like really nice output it's a reusable prompt so again I
can enter different manuals different product lines and I should be able to
get some nice output like that but like the output is good for a chat interface
interface, but what can I really do with that?
If I wanted to integrate this into an application
where if I have my home assistant, maybe I want it to schedule notifications so that I get notified
on my phone when, oh, hey, I need to change the filter.
It's been two weeks.
I don't want to have
to remember when's the last time I've changed it every time.
So having a system that can keep that
up for me.
So this is where structured output comes in.
So structured output is when you request
something of a generative AI system and you tell it what is the shape of the data that you would
like back so instead of just saying hey like give me the text of the things that we want
we we give it like what is the structure that we want so we want for each maintenance schedule item
we want the task we want how often it happens we want the date of like when we're creating it
and when was the last time that it was completed and then any extra notes that I need to to go
along with that.
And we are then saying, hey, take the same maintenance prompt as before.
So this is where the modularity comes in.
I don't need to redefine anything.
But instead,
just give it to me with structured output of this thing, where there's a list of items
that it can run.
Again, I'm passing in some variables.
And then when I run this, we will
see a printout that's in this general shape of how everything looks.
OK.
So it's giving
me the four things.
You need to clean the pre -filter, you need to do the pollutions
in there, blah blah blah.
I'm printing this out so that it's like in a string, but like
I would be able to use this to then, you know, store in a database or like perform some functions
like hey, it gave me four things, why don't I add calendar notifications, right, so that
I get notified of when they're happening on my phone.
An example of like, you know, a
pretty straightforward like database might work like I could write to a text
file right so if I run this it's gonna save it to here boom I have this text
file and then I could perform updates on it so let's say for example here's the
maintenance schedule for this appliance schedule please provide me with a
numbered list of tasks that I need to complete as of today's date so I'm gonna
build this prompt i'm going to pass in the chat model i'm not using structured outputs this time
i'm just saying like hey i want to interact with you but i'm changing the date to be three months
from now so imagine in three months i haven't necessarily done anything and i'm like oh hey
what am i supposed to do and this thing's going to respond to me in plain text hey you need to
clean the pre -filter and you need to clean the pollution sensor where is it getting this
information from it's getting this information from this text file i'm asking it hey here's
the schedule and the llm can figure out what are the things that i haven't completed based on
today's date right imagine this is today's date we're living in the future
here are the things that you haven't done yet then i would be able to say
okay i was able to complete the cleaning of the pre -filter replacement of the hepa filter
again I'm just building you know like the chain of messages and then I'm gonna ask for structured
output of the maintenance schedule so I'm saying hey basically can you just update this last
completed date so that you know like I can then save it in the text file and I know that it'll
work properly and so if I run this we should see that this one has entered last completed
to date for the pre -filter, which is here, and then the HEPA filter, which is here.
I did not
complete the pollution sensor, nor did I complete the deodorization filter.
Pretty sweet, huh?
Like
I got, you know, somewhat of a database going where I can manage the tasks that go along with
this and be able to update the system as is.
But like, you know, it's cool if I like just want to
to do the maintenance of the thing but like what if what if I like don't know how to complete some
of the tasks on it right like it just says yeah remove the dusk with a vacuum cleaner or like use
a use a I don't know I guess all these are fairly straightforward but like what if there's some
confusion around the thing that needs to be done right I have an HVAC system that's like inherently
a bit more complex and I don't know how to do those tasks well this is where vector storage
and something called retrieval augmented generation comes in it's somewhat of a
complex topic essentially all words or parts of words are reduced down to
numbers where you can imagine the similarity between blue and red is much
more similar than turtle and table like it uses the similarity of those words to
then get information from source documents and then return it back to you
So, again, we're going to use a very practical example of this.
So, oh, I'm realizing that I actually deleted a cell, which is not good.
But let's say, for example, we have this retriever, right, which is the, oh, man, I actually, like, totally removed something from here.
This is what you get when you do things live, people.
um i'll take that as a compliment um and this is going to be tough because it's not here anymore
what can i do what can i do um that's the thing i want to do uh no i'm gonna have to roll back
to something oh this is brutal um okay we just went backwards in time
forgive me but I'm gonna pop this open again and we are at here okay so this is what we needed
maybe I could just copy this and go forward in time again and then we'll be okay okay thanks
for sticking with me people I think we got it now okay retrieval augmented generation we are
loading this manual again right I'm actually gonna run this because this is gonna take a
bit of time we are loading this manual we are creating documents um we are splitting them into
chunks so this is because vector storage doesn't work if you give it you know the lord of the rings
novel it's you know much better in like bite -sized chunks so can find relevant things so we say hey
split it you know by like a thousand tokens and have them overlap a little bit so that you you
know if information is cut off you you can get like two sources of it that are relative to each
other we use open AI embeddings again this vector storage is local so it's
just happening on my machine and then we're saying return us three documents
that are related to whatever question we ask so the prompt here we're saying can
you please answer the question based on only the following context the context
comes from the vector store retriever the question will be there and then
we're like asking the generative AI system to give us an answer so the
The question is, what is the warranty period?
Warranty period is two years.
It answered it, right?
And we can actually look to see what documents the retriever
used to get that information by, again, invoking the retriever.
And we can see here warranty warranted for two years,
warranty card limited to two years,
and document three, some other stuff about warranty.
So it's pulling directly from the source
in order to figure out where the information is coming from.
Again, super incredibly powerful system.
But the only thing that it can do
is answer questions based on documents.
It can't necessarily go off script very well.
It can't do any of the maintenance tasks
that I wanted to do before.
Would anyone care to guess what the solution is
is for this agentic AI with tools so I'm initializing a database in the
background as a bunch of code we don't necessarily need to go into it but we're
gonna go through some logs so that you guys can see how this thing is
interacting with it we have a bunch of functions that are being run that are
related to the tools themselves so I have a couple things that are happening
one is I just need you to record a maintenance event that's the thing that we saw earlier right
like create a record of a thing that needs to happen at some point in the future or read all
the maintenance events that need to happen those are all stored in the database update a maintenance
record right so like I completed it and then perform a search of the manual this is the vector
store thing that we saw previously so I can take this tools list include all those four things I
I have a system prompt, which is slightly more complicated
because there's a lot more happening in this.
And then all I have to do is say,
hey, I wanna create an agent.
I'm gonna use that LLM model.
Here's the tools that I'm providing you.
Here's the system prompt.
And then everything else just like kind of works.
Again, we're doing this live, so it might not be perfect,
but we're going for it.
I'm saying, hey, here's the manual.
Can you create a bunch of items for me?
Here's the database.
database is currently empty if I run this it's gonna take some time to
process and a couple of these just so you we get them loaded up yep so this
thing is processing right now great so I can see that it's actually hitting my
database and it's inserting it's performed a couple inserts and this is
the message that I would get back when I'm in that like chat experience I've
successfully created these maintenance tasks for you one two three four you can refer to these
tasks later if you would like if i go to my database and i refresh i don't know if i need
to like make this a little bit bigger i actually don't know if i can there we go great i have four
things here they're all associated with my appliance they have my four tasks i have the
date that it was scheduled and then the date that it's due and i can use this to interact with my
system so I'll make this smaller let's say and I'm three months from now what
are the maintenance tasks that I need to complete well what's it gonna do it's
gonna grab stuff from my database and then it's going to return to me a
readable like understanding of what I need to do I need to do these two things
I need to clean the pre -filter and I need to clean the pollution sensor you
know we can also look into the like what happened on the inside itself it
receives my message it then pulls all the related information it's actually
here from my database you can see tool call so it called the tool read all
maintenance got all the information and then that gets spit back to me kind of
of like interpreted by the llm so here's a more like complex use case what if today in three
months i tell it hey i clean the pre -filter but like note that the water was the best way to clean
it and you know i had to let it dry before reinserting it back in also like i have a
question like there's a filter alarm like does it make some noise when i press it like very complex
query that's happening right it's not super straightforward but you know
similar ways we're gonna like look through all the steps that are happening
it's finding the maintenance records it's inserting one it's updating a
maintenance record and then it's telling me hey I marked this thing is complete
we can see also in the list of messages that it received my message it performs
form some lookup.
I updated the maintenance record.
It then got documents from the source
material, used those to then create a specific response to me.
And if I go to the database,
we can see this clean pre -filter was marked as done with some notes that I provided it.
And then it created a new one for me that I still need to do in some future time.
Pretty cool,
huh so this is like you know something that we can imagine the future looking
like for all of these things that we need to interact with you know we're
gonna have agents with specific tools to help us do those things