Hi, I wanted to talk about MCP servers.
I'm Alex, and I've built a bunch of MCP servers.
And before I get started, though, I'm curious,
who here has connected their ChatGPT,
Claude, let's say, Claude Desktop, Claude Code,
to a third-party tool using either a connector or an MCP server?
Anybody?
Okay, quite a few.
Has anybody actually built an MCP server?
Okay, a couple.
Well, I mean, this talk will be relevant for you, but I think it'll be relevant
for everybody, because my background is as an institutional investor.
So I recently joined a
firm called Balyasny Asset Management, and I lead the internal build-out of our software.
So
we build different tools for our investment team to help them make
better-informed investment decisions.
Before that, I worked for a technology company called
RavenPack, where I was working on AI tooling with many of the largest banks and
institutional investors.
And then before that, for the past decade, I've spent most of my career
actually as an investor.
And I think what's amazing nowadays is I didn't come from a technical
background.
I don't have an engineering degree.
But so long as you are sufficiently curious and
persistent, you can climb the curve so much faster
than you ever could before.
I tried to kind of make a pivot into technology
maybe back in 2017 and was not successful with it.
And then, I mean, lately it's just been phenomenal
the types of things that you're building
or that you can build.
And so what I want to talk about now is
we're at a really, really interesting point in history.
And it's interesting to think that language models
have been out now for three years.
I think ChatGPT launched November 2022.
But if we think about the basis of the internet that we are on today,
I mean, it's really the hypertext transfer protocol,
which was the groundwork for that.
And that actually took 10 years of iteration
and developing of that framework
for how do we pass information between us
before we really had a wide adoption of the internet that we know today.
And I think what's fascinating
is we're at a very similar moment in history right now.
Although ChatGPT and language models have now been out for three years, MCP, which is the Model Context Protocol, was introduced only a year ago. We're only one year into that journey, and already in that one year you've seen tremendous development in the protocol for how these systems interact with each other.
So if you're not familiar with what the Model Context Protocol is, I'll explain it a little further, but in short, it is how we get agents and language models to interact with third-party tools. That could be APIs, it could be interacting with the user in a structured way, or other resources that we might have.
And this is the basis for the next level of intelligence that I think we're seeing in the world today.
If you look at adoption of
interoperability and agent frameworks, MCP has gone absolutely parabolic in terms of the stars that you're seeing on GitHub.
Also, same thing with weekly downloads.
So the big question is, what is it? I talked a little bit about that at a high level,
but oftentimes, when people are using it, I get
looks like this, where people kind of know what it is, but don't really know what it is.
So let's
get into it.
So in short, within the MCP protocol, you have effectively your agents or your servers,
which could be your ChatGPT, could be Claude, could be Microsoft Copilot.
There's a whole range
of MCP clients.
Sorry, those are clients.
And then what you have on the other side of that is
providers of data, resources, other things, so these are data providers, APIs, and essentially
in between you can build these things called MCP servers which will take the data from
whatever API endpoint you could have, maybe it's Google Flights, maybe it's Airbnb, maybe
it's something like that, organize it and feed it to the language model in a way that
the language model can call it and get it back to them.
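To make the client, server, and provider roles concrete, here is a minimal sketch of what sits in the middle. It assumes MCP's JSON-RPC 2.0 framing and its `tools/call` method; the `fake_flights_api` provider, the `search_flights` tool name, and the data it returns are hypothetical stand-ins, and a real server would be built on an MCP SDK rather than hand-rolled like this.

```python
import json

# Hypothetical stand-in for a third-party data provider (not a real API).
def fake_flights_api(origin: str, destination: str) -> list[dict]:
    return [
        {"carrier": "XX", "from": origin, "to": destination, "price_usd": 420},
        {"carrier": "YY", "from": origin, "to": destination, "price_usd": 385},
    ]

# The server's tool table: tool name -> Python callable.
TOOLS = {"search_flights": fake_flights_api}

def handle_request(raw: str) -> str:
    """Handle one JSON-RPC 2.0 message the way a server answers tools/call."""
    request = json.loads(raw)
    if request.get("method") != "tools/call":
        error = {"code": -32601, "message": "Method not found"}
        return json.dumps({"jsonrpc": "2.0", "id": request.get("id"), "error": error})
    params = request["params"]
    rows = TOOLS[params["name"]](**params["arguments"])
    # Organize the raw rows into compact text the language model can read.
    lines = [
        f'{r["carrier"]}: {r["from"]}->{r["to"]} ${r["price_usd"]}'
        for r in sorted(rows, key=lambda r: r["price_usd"])
    ]
    result = {"content": [{"type": "text", "text": "\n".join(lines)}], "isError": False}
    return json.dumps({"jsonrpc": "2.0", "id": request["id"], "result": result})

# What a client (ChatGPT, Claude, ...) would send on the model's behalf.
request = json.dumps({
    "jsonrpc": "2.0", "id": 1, "method": "tools/call",
    "params": {"name": "search_flights",
               "arguments": {"origin": "JFK", "destination": "LHR"}},
})
response = json.loads(handle_request(request))
print(response["result"]["content"][0]["text"])
```

The point of the sketch is the middle step: the server does not just proxy the provider, it reshapes the response into something short and ordered that the model can use directly.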
Now in terms of what this looks like, if you've ever used either ChatGPT or Claude, we
can just take a look, so I'm going to flip over to a demo.
What we're going to do is use an MCP server to produce
an investment-grade report.
It takes about eight minutes to run,
so I'm going to get started on this early.
So if I come here, hit plus, there's
various different connectors.
You can connect it to Google Drive.
I've actually built another one for Google Sheets,
which is very interesting, but I'm not
going to have time to show it.
But this is one that I built for the company I previously
worked for.
Now, nothing here is anything that we
do within Balyasny.
None of this is investment advice from the AI or me.
But this is something
that I've done personally prior to joining and something that I was going to run through.
So there are different prompts I've built in here that are sort of pre-canned
reports that we could generate for any company we like.
Now, to keep it simple, and because I
know the timeframe of generating it, I'm going to go with an earnings preview.
For the next three days, I know the companies reporting earnings, at least the large U.S.
companies, include Micron, which is a large, let's say, memory service provider, Carnival Cruises, and Nike.
Can I just do a quick poll and see which one we would like to run?
So Micron, Nike, Carnival Cruises.
All right, we're going to go with Nike.
So I'm going to put NKE into the prompt.
And this is a prompt that I built out previously and that is stored in my MCP server.
I'll get into how that works. You can see here
there's a longer prompt that's being fed in, and then what I'm also going to do is give it a little
more instruction
on what I'd like it to do,
which is build a detailed report using the big data
tools, that is the server that I'm using, and then render the final report in a PDF
using the equity report builder skill.
Skills are really cool.
They only launched maybe two months ago within the Claude framework, and
a skill is effectively like MCP but specific to Claude, and it has some advantages.
Great.
I'm going to hit run, and this is going to get started, and we're going to come back
to this in a little bit and see what it produces at the end of this.
In the meantime, let's talk about the protocol itself.
So the core parts of the MCP protocol consist of tools, resources, and prompts.
Now prompts are basically what I just showed.
You can have pre-existing prompts that you store within the server, so that whenever any
user interacts with your MCP server, they can just click one and run it without needing
to type it out.
Tools are how we interact with third-party APIs, gather data, and return that to the
agent in a way that's useful, so the agent can produce something worthwhile.
And then resources is effectively a remote file system where you can store other content
that might be useful for the agent to be able to crawl through and figure out.
So this could be instruction manuals, it could be guides, it could be other things that you
might want to have sitting on the back of your server that your agent could then
interact with, read, understand, and then do something a little bit more complex.
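As a rough illustration of the other two primitives, prompts and resources, here is a minimal in-memory sketch. The prompt name, the `docs://` URIs, and the resource text are all hypothetical, and plain dictionaries stand in for what an MCP SDK would register and expose over the protocol.

```python
from string import Template

# Prompts: reusable templates a user can pick instead of typing a long prompt.
PROMPTS = {
    "earnings_preview": Template(
        "Write an earnings preview for $ticker covering estimates, "
        "prior-quarter results, and key metrics to watch."
    ),
}

# Resources: read-only content addressed by URI, such as guides or API docs.
RESOURCES = {
    "docs://api/search": "POST /search takes a 'query' string and returns text chunks.",
    "docs://guides/reports": "Reports should cite a data source for every figure.",
}

def get_prompt(name: str, **arguments: str) -> str:
    """Fill a stored prompt template with the user's arguments."""
    return PROMPTS[name].substitute(**arguments)

def list_resources() -> list[str]:
    """Let the agent discover what it can read."""
    return sorted(RESOURCES)

def read_resource(uri: str) -> str:
    """Return the content behind one URI."""
    return RESOURCES[uri]

print(get_prompt("earnings_preview", ticker="NKE"))
print(list_resources())
```

The user only supplies the ticker; the long instruction text lives on the server, which is why every user of the server gets the same pre-built prompt.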
Now in terms of the servers I built, the first one I built was just on resources.
So what that did was basically it went to our API documentation, took all the documentation,
and then helped coding agents understand context so they could, you know, essentially write
calls to our APIs more efficiently.
The second was just tool calling.
Most MCPs you see are primarily using tool calling.
And then the third basically includes all four of these things, the fourth being elicitations.
Now, elicitations is a very new part of the MCP framework.
It was only added in June this year.
The two additions were basically elicitations and sampling.
Elicitations allow you to structure one of these servers so that it asks questions back to the user in a structured way.
And what that enables you to do is to build this very human-in-the-loop feedback
process where if, let's say, it wants to do a check, it will check with the user first,
and then the user can say yes, no, okay, I like that, I don't like that.
And if we have time, I'll show.
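For reference, a structured question of this kind can be sketched as a plain message. This assumes the `elicitation/create` request shape and the `accept` / `decline` / `cancel` reply actions from the MCP specification; the `confirmed` field and both helper functions are hypothetical names chosen for illustration.

```python
def build_elicitation(request_id: int, message: str,
                      properties: dict, required: list[str]) -> dict:
    """Build the request a server sends to ask the user a structured question."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "elicitation/create",
        "params": {
            "message": message,
            # A flat schema of simple fields the client can render as a form.
            "requestedSchema": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }

def interpret_reply(reply: dict) -> str:
    """Map the user's reply onto what the server should do next."""
    result = reply["result"]
    if result["action"] != "accept":
        return "stop"  # the user declined or cancelled
    return "proceed" if result["content"]["confirmed"] else "revise"

question = build_elicitation(
    7,
    "I found Nike in the consumer goods sector. Is this the right company?",
    {"confirmed": {"type": "boolean", "title": "Right company?"}},
    ["confirmed"],
)
print(question["method"])
print(interpret_reply({"result": {"action": "accept", "content": {"confirmed": True}}}))
```

Because the reply comes back as typed fields rather than free text, the server can branch on it deterministically instead of asking the model to interpret the answer.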
So if you're building an MCP server, which at the moment is just one of us, but I hope
is more and more, because honestly, it's a lot easier than it sounds.
And actually, there are skills within Claude that allow you to do this in just a couple
clicks, so you can spin up a server yourself.
But when thinking about how to spin up a server, there's some big considerations.
And so the first, and I think the most naive way that I often see people building these
things is single responsibility, which is, okay, for us, we have 30 endpoints, right,
30 API endpoints, and so we just build one tool on top of each endpoint, and that's easy,
right?
It's really not that simple.
The reason for that is because when you present
an agent with many tools, what happens is the performance degrades logarithmically.
And particularly after you get over 40 or 50 tools that your agent has access to,
it really struggles to pick the right tool.
And you see no improvement with iterations.
So if
an agent tries again and again, it'll still get it wrong.
And you're also seeing some
systems begin to limit the number of tools that an agent might have access to,
to offset or prevent this particular issue.
The second approach, which is very novel, we've just seen this approach in the past
maybe two or three months, is what's called the layered MCP approach.
And effectively what this is, so there's a company called Block, and what they did is
they built an MCP server that only had three tool calls.
With the first tool call, the agent would be able to look over and read the
documentation for the Block API.
The second one is it would decide on the right tool
and then configure the parameters
for the particular API call.
And then the third, it would go call that.
And what this enabled them to do was basically
have over 200 endpoints that a single MCP server
with three tools could figure out how to call.
And I think that's a really, really interesting thing.
There's a couple of really good articles out there about it.
One is on Cloudflare.
If you are interested in this topic,
I'd say check out the blog post called Code Mode
on Cloudflare's blog.
But the problem with that is essentially
for any call that the agent wants to make to the server,
you've basically got to do three back-and-forths
to get to what you want.
And you're also having the agent write code
that is kind of thrown away in the moment.
And so it consumes a lot more tokens
when you're building a layered MCP server as opposed to a single-responsibility one.
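The three-tool layering described above can be sketched roughly as follows. This is an illustration of the pattern only, not Block's implementation: the endpoint catalog, the endpoint names, and the three function names are all hypothetical.

```python
# Hypothetical endpoint catalog; a real one could hold hundreds of entries.
ENDPOINTS = {
    "get_customer": {
        "doc": "Fetch a customer by id.",
        "params": ["customer_id"],
        "call": lambda customer_id: {"id": customer_id, "name": "Example Co"},
    },
    "list_payments": {
        "doc": "List payments for a customer.",
        "params": ["customer_id", "limit"],
        "call": lambda customer_id, limit: [
            {"customer": customer_id, "n": i} for i in range(limit)
        ],
    },
}

def read_docs(keyword: str = "") -> list[dict]:
    """Tool 1: let the agent browse the endpoint documentation."""
    return [
        {"endpoint": name, "doc": spec["doc"], "params": spec["params"]}
        for name, spec in ENDPOINTS.items()
        if keyword.lower() in spec["doc"].lower()
    ]

def plan_call(endpoint: str, arguments: dict) -> dict:
    """Tool 2: check the chosen endpoint and parameters before anything runs."""
    if endpoint not in ENDPOINTS:
        return {"ok": False, "error": f"unknown endpoint: {endpoint}"}
    expected = ENDPOINTS[endpoint]["params"]
    missing = [p for p in expected if p not in arguments]
    extra = [a for a in arguments if a not in expected]
    if missing or extra:
        return {"ok": False, "error": f"missing={missing} unexpected={extra}"}
    return {"ok": True, "endpoint": endpoint, "arguments": arguments}

def execute_call(plan: dict):
    """Tool 3: run a validated plan against the underlying API."""
    if not plan.get("ok"):
        raise ValueError(plan.get("error", "invalid plan"))
    return ENDPOINTS[plan["endpoint"]]["call"](**plan["arguments"])

# The three round trips an agent would make for a single answer.
docs = read_docs("payments")
plan = plan_call("list_payments", {"customer_id": "c1", "limit": 2})
print(execute_call(plan))
```

The agent only ever sees three tool definitions regardless of how large the catalog grows, and the cost shows up in the last three lines: every answer takes three calls instead of one.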
And I think having built many of these things, what I find is it's important to strike a
balance between these two ends of the spectrum in terms of how you think about your MCP structure.
And I think one of the things that I've found works best for us and what we build is this:
as developers, you often think, okay, we have these endpoints, now we've got to
build MCP tools.
But what I've found works much better is to think about the user experiences
that you want to have and build down to that, right?
So instead of having a single API call that calls a single tool, you might have, okay,
for me, I want to get a comp sheet, you know, and that might require us to get the market
cap and the enterprise value and the EBITDA for five or six different companies.
That might require a lot of back and forth API calls for us to make.
But that is still a single user story, it's a single experience, so we would wrap that
into a single tool.
And you see this with how GitHub has structured their MCP server: they have a single
tool that will create a branch, commit to the branch, and then push to that branch, all
in a single tool.
So it's wrapped up an entire user experience that GitHub might want to have into one interaction
with the MCP.
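The comp sheet example can be sketched as one tool that hides several provider calls. The tickers and figures below are placeholders rather than real market data, and `fetch_metric` is a hypothetical stand-in for whatever API calls a real server would make.

```python
# Placeholder figures for made-up tickers; not real market data.
FAKE_FUNDAMENTALS = {
    "AAA": {"market_cap": 100.0, "enterprise_value": 120.0, "ebitda": 10.0},
    "BBB": {"market_cap": 50.0, "enterprise_value": 45.0, "ebitda": 9.0},
}

def fetch_metric(ticker: str, metric: str) -> float:
    """Stand-in for one API call to a data provider."""
    return FAKE_FUNDAMENTALS[ticker][metric]

def comp_sheet(tickers: list[str]) -> list[dict]:
    """One tool for one user story: makes every call the comp sheet needs."""
    rows = []
    for ticker in tickers:
        row = {"ticker": ticker}
        for metric in ("market_cap", "enterprise_value", "ebitda"):
            row[metric] = fetch_metric(ticker, metric)
        # Derived column computed server-side so the model does no arithmetic.
        row["ev_to_ebitda"] = round(row["enterprise_value"] / row["ebitda"], 1)
        rows.append(row)
    return rows

for row in comp_sheet(["AAA", "BBB"]):
    print(row)
```

From the agent's side this is a single tool call with a list of tickers; the six provider calls and the ratio calculation never appear in its context.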
Now, let's check out how we did with our demo.
OK, so we're looking at a PDF of an earnings preview
that we've just generated live.
This PDF is, let's just download it
so it's a little easier to see.
And before we get to the actual report itself,
it's good to go back and check the trace of what actually
happened in this process from when we gave it the prompt
that we wanted to run.
And then what did the agent actually go do?
So first, it understood the specs of the prompt.
Then, let's see, can everyone see that?
Should I zoom in a bit?
Yeah, great.
So then it ran.
So essentially, what I did here is this particular server
that I'm running this with, the one that I've built for it,
consists of three tools.
So that's as opposed to the 30 endpoints I mentioned earlier.
The first of the three tools is effectively find companies.
So whenever I mention a company, it
will go look into our knowledge graph
that we have in the big data server.
It'll find the ID of that company.
From there, we can use that ID
to then fetch other pieces of information.
So the first tool is just checking the knowledge graph
for the particular company.
The second one is then getting structured data, right?
Oftentimes in these reports, we want the stock price,
we want the market cap,
we want the recent balance sheet and income statement,
all the different little bits of information
that go into a relevant report, that keep it grounded
and keep the agent from hallucinating.
So the second is basically getting structured data.
And then the third is running search.
And so a core part of what we provided at RavenPack was a search API for finance.
So we had taken hundreds of millions of documents across news and filings and transcripts, and
we had organized that into a vector store that we allowed institutional investors to
basically query and search for.
So then what you can do with search is you can have an agent that can say,
okay, I'm looking for the latest developments for Nike.
I'm looking for the latest developments for these things.
And so just using those three tools that were each built on many different endpoints,
we were able to put this together in this way.
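Here is a rough sketch of how those three tools chain together. The company ID is the one shown in the demo; everything else (the in-memory knowledge graph, the placeholder data fields, the document chunks, and the function names) is hypothetical. Simple keyword matching stands in for the vector search described above.

```python
from typing import Optional

KNOWLEDGE_GRAPH = {"nike": "D64C6D"}  # name -> entity ID (ID from the demo)

# Placeholder values; a real tool would fill these from the data API.
STRUCTURED = {"D64C6D": {"price": "<from data API>", "market_cap": "<from data API>"}}

DOCUMENTS = [
    {"entity_id": "D64C6D", "text": "Placeholder transcript chunk about guidance."},
    {"entity_id": "D64C6D", "text": "Placeholder news chunk about tariffs."},
    {"entity_id": "OTHER", "text": "Placeholder chunk about guidance elsewhere."},
]

def find_companies(name: str) -> Optional[str]:
    """Tool 1: resolve a company mention to its knowledge-graph ID."""
    return KNOWLEDGE_GRAPH.get(name.strip().lower())

def get_structured_data(entity_id: str) -> dict:
    """Tool 2: fetch grounded figures for that ID."""
    return STRUCTURED[entity_id]

def search(entity_id: str, query: str) -> list[str]:
    """Tool 3: return text chunks about the entity (keyword match, not vectors)."""
    words = query.lower().split()
    return [
        d["text"] for d in DOCUMENTS
        if d["entity_id"] == entity_id and any(w in d["text"].lower() for w in words)
    ]

def gather_context(company: str, queries: list[str]) -> dict:
    """Roughly the sequence the agent followed before writing the report."""
    entity_id = find_companies(company)
    if entity_id is None:
        raise LookupError(f"no company found for {company!r}")
    chunks = [chunk for q in queries for chunk in search(entity_id, q)]
    return {"entity_id": entity_id, "data": get_structured_data(entity_id), "chunks": chunks}

context = gather_context("Nike", ["guidance", "tariffs"])
print(context["entity_id"], len(context["chunks"]))
```

The ID returned by the first tool is what keeps the later calls scoped to the right company, which is why the chunk about another company never reaches the report.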
So first, we can see that if we check it out, okay, it
found Nike, and the ID here is D64C6D,
which is the ID that they have on the back end of their system.
Then once it found it, it decided, okay, I'm going to go get this data, right?
So it went
and got the current price, the market cap, all this other data that we might want to feed into
the report.
And if we jump to Google, just to check that this is right, let's just go Nike stock
price: 66.86, 66.78.
I mean, we're pretty much bang on, you know, maybe 10 cents off.
And in terms of
the range, we've got that as well, right?
And we know these are right because it's calling an API
as opposed to just generating it from within the model.
Then once it found that, it then went off
and did a number of steps here.
So it's put them all into the single, oh, this is just
generating the report.
There should be some searches that it did.
Oh, eight steps are all
mixed in here as well.
Great.
Yeah, okay.
So get company tear sheet.
This is the data call that
it did, then it ran a search, so let's see the search that it ran: okay, Nike earnings
guidance outlook CEO turnaround strategy initiatives.
So clearly here there's something about a CEO turnaround that it found from reading
the documents, and then it went and got other pieces of information from the transcript,
from the filings, from other pieces of relevant information in our repository.
It then did another search, did another search, so it did a series of five or six
different searches for different pieces of information, and these are all different chunks
that our search API returned back to the agent.
Once it got all that information, it organized it into a report and then used a final skill
to then build out the PDF.
So here we have an exact summary of what's going on ahead of earnings.
It's saying the turnaround under Hill is showing early signs of progress, right?
So I guess this is the new CEO.
It's giving us detailed metrics for how it's performed in China.
Revenue estimates.
Then it goes through forward estimates: what are the expectations that analysts are currently
modeling for the next few quarters? Then it gives you prior-quarter performance: okay, this is how
it performed versus expectations and what we saw in terms of the growth rate. And then if we keep
going, it'll give us latest developments. So we've got leadership restructuring, they've got a new, I guess,
"Win Now" strategy, and there are big questions right now about the impact of tariffs on the company.
And then key metrics to watch, where it talks about wholesale momentum, their margins, inventory levels, Q3 guidance.
And so this is basically touching upon a lot of the key issues here that a sell-side analyst or a buy-side analyst might want to look into.
And from my experience, I wouldn't say this is exactly at the level of a professional investor yet.
But this is also something that we built six months ago, and the tech has come a long way since.
So, with that, I can flip over and also show just a quick thing on how elicitations work,
which let's just do this.
So elicitations are exactly the same thing.
So this is just a different client, right?
So this is just Cursor that I'm using instead of Claude.
And if I say find me the company ID for Nike.
The reason I'm doing it in a different client is the new feature of elicitations within
the MCP protocol is only supported by some clients.
So Claude, ChatGPT, other widely available clients don't yet support the elicitations
feature, but GitHub Copilot, Cursor, and many others do.
And what that essentially does, for the same query that we ran here, the same tool call, is this.
What it'll do first is say: great, I found Nike, this company in the consumer goods sector.
Here's the ID.
Is this the right one?
And I can say, no, that's not the right one.
I actually meant
something different.
And then it'll say, okay, did you mean any of these other companies?
So here
it's talking about this NKE, which I guess is because I could have typed NKE.
It's talking about
NKE Austria, in the industrials sector, for bearings, etc.
So, right, so this is a simple demo,
but the idea being that I've architected this
in a way that, okay, first check with the user
if this is the right piece of information before proceeding.
Once you proceed, okay, if they say no,
so you're kind of building a decision tree here.
If they say no, then give them some more information.
If they say yes, give them this information.
And you can keep building these layers upon layers upon layers,
and then when you put that in front of a
user, you can build something where it's not just that the AI went off and built something
for you, it's that you built it in a very guided way with the AI as kind of a co-pilot
with you.
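The confirm-then-fall-back decision tree from this demo can be sketched as below. The `confirm` and `choose` callbacks are hypothetical stand-ins for elicitation round trips to the user, and the second candidate's ID is a placeholder.

```python
from typing import Callable, Optional

# Ranked matches as a company-lookup tool might return them.
RANKED = [
    {"id": "D64C6D", "name": "Nike", "sector": "Consumer Goods"},
    {"id": "PLACEHOLDER-ID", "name": "NKE Austria", "sector": "Industrials"},
]

def resolve_company(
    ranked: list[dict],
    confirm: Callable[[str], bool],
    choose: Callable[[str, list[str]], Optional[str]],
) -> Optional[dict]:
    """Check the best match with the user; if rejected, offer the alternatives."""
    best, alternatives = ranked[0], ranked[1:]
    question = (
        f"I found {best['name']} ({best['sector']}), ID {best['id']}. "
        "Is this the right one?"
    )
    if confirm(question):
        return best  # "yes" branch: proceed with this company
    if not alternatives:
        return None
    # "no" branch: give the user more options to pick from.
    picked = choose("Did you mean one of these?", [c["name"] for c in alternatives])
    return next((c for c in alternatives if c["name"] == picked), None)

# Simulated user who rejects the first match and picks the alternative.
result = resolve_company(RANKED, lambda q: False, lambda q, options: options[0])
print(result["name"])
```

Each further layer of the tree is another callback at a branch point, so the flow stays explicit in server code rather than being left to the model to improvise.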
So now if I say, okay, yes, that's it, so great, then it continues.
Great.
And that is the basis for the demo.
And yeah, with that, if you want to connect or chat about the topic, happy to take questions.