Integrating an open-source AI financial analyst into a financial terminal

Introduction

Hi, everyone. Thank you, Winston, for having me. Thank you, Nathan, for the invite.

I'm from OpenBB; we build a financial terminal. And today, I'm going to show you how we incorporated an open-source AI financial analyst into that terminal.

And most of what you're going to see today has been open-sourced, so any of you can go to GitHub and play around with the project.

The Genesis of OpenBB

So first of all, let me introduce OpenBB. It started as a side project of mine; I began building it in my living room back in London while I had a full-time job. Basically, what I wanted was access to a ton of financial data, and I wanted full control over what I did with that data.

I basically wanted to streamline my entire investment research workflow. So for me, it was really important to access all of this financial data in one place and to standardize the way I access it, so that I could run my entire investment research workflow regardless of the company I was looking at.

So I worked on this side project for two months, doing everything in Python end to end. After that, I was happy with the overall architecture of the product.

And so I made it open source, because my background is in engineering, not finance, and I expected people with more of a financial background to contribute to the project. When I open-sourced it, it immediately went viral. VentureBeat wrote about us.

We were trending on Reddit and on Hacker News, and we really rose in popularity on GitHub. Today we have almost 27,000 stars on GitHub, and we are one of the top products in the finance and investment research category. And that's how we started.

Overview of the OpenBB Platform

And what I'm going to talk about today is really based on the platform that we've built.

So what is the OpenBB platform? Last night, I was working on a schematic to abstract away all of the other products we have and explain in simple terms what the platform does. In simple terms, you have multiple data vendors. They all have APIs with different ways of calling data, different symbology, and different types of outputs being returned. What we do is aggregate all of those within the platform.

And so at the beginning, when I started implementing this, I was interested in equity and crypto, so I started pulling that data. But as soon as I open-sourced it, the community, which is what these GitHub contributors represent, started adding their own extensions: fixed income, forex, options data. The platform grew a lot, and at some point we had over 100 different data vendors standardized within the platform. That meant that if you wanted to access, say, the income statement of a company, you would do something like, you know, from openbb import obb, and then the command would be obb.equity.fundamental.income.
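As a rough illustration, assuming the OpenBB Platform package is installed, that standardized call looks something like this (the symbol and provider below are just placeholders):

```python
# Minimal sketch of the standardized access pattern described above.
# Assumes `pip install openbb` plus at least one data-provider extension.
from openbb import obb

# Same dot-path regardless of which vendor ultimately serves the data:
# category -> subcategory -> endpoint.
income = obb.equity.fundamental.income(symbol="AAPL", provider="yfinance")

print(income)  # a standardized result object, the same shape for every provider
```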

And so you could access a vast range of data through that architecture. Then, where we had relationships with data vendors, we paid for premium API keys and incorporated those into our enterprise product, which I'm going to talk about a bit later in this talk. And so this is what happens behind the scenes.

So when you use OpenBB, you'll do something like this: you call equity, which is the category you're in, then a subcategory, and then the endpoint, the function you're going to call. Here you can see the symbol parameter, which is the same regardless of the data provider, and the interval is the same across all of them too. Then you select the data provider you're interested in. What we do is route to the right data vendor and transform these parameters into something that the vendor recognizes. For instance, some data vendors don't have an input parameter called symbol; it might be called ticker. So we transform that.
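A hedged sketch of that routing, assuming two provider extensions are installed; the point is that symbol and interval stay the same while only the provider argument changes:

```python
# Sketch of provider routing: standardized parameters in, vendor-specific call out.
# Provider names are illustrative; use whichever extensions and keys you have.
from openbb import obb

for provider in ("fmp", "polygon"):
    data = obb.equity.price.historical(
        symbol="TSLA",      # always `symbol`, even if the vendor's API calls it `ticker`
        interval="1d",      # standardized interval string
        provider=provider,  # the platform maps these params onto the vendor's API
    )
    print(provider, data.to_df().shape)
```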

We call the data vendor's endpoint, get the data, and then we transform the data. So the same way we have standardized terminology at the input level, we have it at the output level. What you end up with is something we call the OBBject. That's not a mistake; it's OpenBB plus object. It's an object that holds all the data output by any functionality, and it has methods attached to it: to_df for a DataFrame, to_json if you want a JSON format, to_string if you want a string. Another one we added is to_chart, so if you want a chart directly in your notebook or from a Python script, that's an extension we've built, and it produces a chart immediately for that specific data. One important thing here is that we have this core foundation for processing data with this transform, extract, transform architecture. Users can then add new datasets on top, even custom ones they keep private, because the foundation of the platform is really the architecture that has been built. And as I mentioned, it's fully open source.
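For example, a minimal sketch of those OBBject methods (to_df is the one I'd rely on; the charting call is per the talk and depends on the charting extension being installed):

```python
# Sketch of the OBBject convenience methods described above.
from openbb import obb

result = obb.equity.price.historical(symbol="AAPL", provider="yfinance")

df = result.to_df()  # pandas DataFrame with the standardized output schema
print(df.head())

# Per the talk, a charting extension adds a to_chart-style method that renders
# the same data as a figure; the exact call may vary by version, e.g.:
# result.charting.show()
```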

Incorporating AI into OpenBB

Now, during the entire craze around ChatGPT, we started thinking: OK, we spent so much time on the platform making it easy for humans to consume, but the next step is making it easy for machines to use, in this case agents and large language models.

So for instance, whereas I would access an income statement using this methodology with to_df at the end, that would be a lot of data for a large language model to process. And the docstrings of the functions are really important for a large language model doing function calling, so it understands which parameters are available and what the variable types are.

And so we did quite a few things to make it easy for agents to use the platform as well. For instance, one of the things we added to the OBBject was a to_llm method, which returns the DataFrame as JSON. We stringify it to make it easier to pass around and to use fewer tokens of context.
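As a rough sketch of the intent (the to_llm name comes from the talk; the compact-JSON fallback below is my own approximation, not necessarily the exact implementation):

```python
# Sketch: getting a token-cheap, stringified JSON representation for an LLM.
import json
from openbb import obb

result = obb.equity.fundamental.income(symbol="TSLA", provider="yfinance")

# to_llm (as described in the talk) returns the data as stringified JSON.
# An equivalent, if your installed version differs, is roughly:
compact = json.dumps(result.to_df().to_dict(orient="records"), default=str)
print(compact[:300])  # far fewer tokens than a pretty-printed table
```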

And then for the docstrings, we went through all of the docstrings of the functionalities and made sure that the signature, the parameters, and the description were really, really good. That improved the accuracy of agents leveraging the platform a lot.
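To make the point concrete, here's an illustrative (non-OpenBB) function whose signature and docstring are what a function-calling LLM effectively sees once a tool schema is generated from it:

```python
# Illustrative only: the signature + docstring become the tool schema shown to the LLM,
# so precise parameter names, types, and descriptions directly affect call accuracy.
def fundamental_income(symbol: str, period: str = "annual", limit: int = 5) -> list[dict]:
    """Get the income statement for a company.

    Parameters
    ----------
    symbol : str
        Ticker symbol, e.g. "TSLA".
    period : str
        Reporting period: "annual" or "quarter".
    limit : int
        Number of most recent statements to return.
    """
    raise NotImplementedError  # placeholder; a real tool would fetch the data
```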

Architecture of the AI-Powered Financial Analyst

And so we created an AI-powered financial analyst, whose architecture I'm going to show you today. We actually talked about it at Open Core Summit six or seven months ago now, and this is what the architecture looks like. I'm not going to go too deep into it; we're just going to go through it quickly, because there's an entire video about it that you can check out afterwards.

But basically, you've got the platform here, and what we do is convert each of the functionalities into tools. Those tools, basically the function plus its signature, are put into an embedding store. So say you're looking for fundamental income: we put that into the embedding store. The idea is to tell an agent later on, look, you have access to all of these tools; which ones are the most relevant given the user's prompt?
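A minimal sketch of that "functions become tools in an embedding store" step, using sentence-transformers as a stand-in for whatever embedding model is actually used; the tool names and descriptions are illustrative:

```python
# Sketch: embed each tool's name + description so an agent can later retrieve
# only the most relevant tools for a given subtask.
from sentence_transformers import SentenceTransformer

tools = {
    "equity.fundamental.income": "Income statement for a company: revenue, net income, margins.",
    "equity.profile": "Company profile, sector, peers, and market capitalization.",
    "equity.price.historical": "Historical OHLCV price data for a symbol.",
}

model = SentenceTransformer("all-MiniLM-L6-v2")  # stand-in embedding model
tool_names = list(tools)
tool_vectors = model.encode([f"{name}: {desc}" for name, desc in tools.items()])
```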

Let's say the user asks: what are Tesla's competitors, and which one has the highest market cap? We do a task decomposition step, where we divide it into simpler tasks and keep the dependencies. A simpler task in this case would be: what are Tesla's competitors? And another one would be: which one has the highest market cap?
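A hedged sketch of that decomposition step; the prompt wording, JSON schema, and model name are my own illustration rather than OpenBB's exact code:

```python
# Sketch: ask an LLM to break the question into ordered subtasks with keywords.
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

question = "What are Tesla competitors and which one has the highest market cap?"

resp = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model choice
    response_format={"type": "json_object"},
    messages=[
        {"role": "system", "content": (
            "Decompose the user's question into simpler subtasks, keeping dependencies. "
            'Return JSON: {"subtasks": [{"id": 1, "task": "...", "depends_on": [], "keywords": ["..."]}]}'
        )},
        {"role": "user", "content": question},
    ],
)
print(json.loads(resp.choices[0].message.content))
```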

We also try to get an associated list of keywords for each of the tasks. With those keywords, we go into tool retrieval. For instance, one keyword could be market cap; others could be competitors, or peers.

We match them against the embeddings in the vector store to find the tools with the highest similarity to what the user asked, and we return a list of two or three functions. That is important, because when we send these simpler tasks to the subtask agents, they'll have a small number of functions they can call. It won't be hundreds like we have here, only the ones with the highest match.
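Continuing the embedding sketch from above, the keyword matching could look roughly like this (cosine similarity is an assumption about the similarity metric):

```python
# Sketch: rank the embedded tools against a subtask's keywords, keep the top-k.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

def top_k_tools(keywords, tool_names, tool_vectors, k=3):
    query = model.encode([" ".join(keywords)])[0]
    sims = tool_vectors @ query / (
        np.linalg.norm(tool_vectors, axis=1) * np.linalg.norm(query) + 1e-9
    )
    return [tool_names[i] for i in np.argsort(sims)[::-1][:k]]

# e.g. top_k_tools(["market cap", "competitors", "peers"], tool_names, tool_vectors)
# using the tool_names / tool_vectors built in the earlier sketch.
```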

And so the agent then has the simpler question that was asked and the list of functions it can call. This reduces hallucination, because you are actually calling data in real time: it's not something the LLM has been trained on, you're calling an endpoint that provides real-time data that the agent can base its reply on.

So the subtask agents reply to the simpler tasks, and in the end you have the main query, the simpler questions and their answers, so you don't need the tools anymore, and then you get the final answer. We made this open source as well. In this case we're using OpenAI, but you can swap in another model; you can change the architecture, the way we create the embeddings, the vector store, everything. You can customize everything.
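A rough sketch of that last composition step, using the OpenAI client as in the talk (the answers below are placeholders, not real data; any chat model could be swapped in):

```python
# Sketch: once subtask agents have answered with real platform data, compose the
# final answer from the (question, answer) pairs alone; no tools are needed here.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

main_query = "What are Tesla competitors and which one has the highest market cap?"
subtask_answers = [
    ("What are Tesla's competitors?", "<answer from subtask agent 1>"),
    ("Which one has the highest market cap?", "<answer from subtask agent 2>"),
]

context = "\n".join(f"Q: {q}\nA: {a}" for q, a in subtask_answers)
final = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model choice
    messages=[
        {"role": "system", "content": "Answer the user's question using only the subtask answers provided."},
        {"role": "user", "content": f"{context}\n\nQuestion: {main_query}"},
    ],
)
print(final.choices[0].message.content)
```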

The OpenBB Enterprise Product

And so we've been building this AI financial terminal now for quite some time.

And this is our enterprise product. And basically, in a graphic, this is how it works.

We allow users to bring any type of data, whether it's in a database or a data warehouse like Snowflake, Elastic, or others, or whether it's a PDF: structured and unstructured.

And then on the other side, you have a financial terminal that is very customizable and modular. You can have multiple dashboards and folders to organize it, and you have an AI copilot on the side.

Now, we have our own copilot, with a lot of workflows that make it really efficient when it uses data. For instance, sometimes tables are very large, and we don't just pass them in as context; we do text-to-SQL to make sure the output is better.
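As an illustration of that text-to-SQL idea (SQLite, the schema, the table, and the model name are all assumptions for this sketch; the actual workflow will differ):

```python
# Sketch: generate SQL from the table schema instead of stuffing the table into context.
import sqlite3
from openai import OpenAI

client = OpenAI()
conn = sqlite3.connect("positions.db")  # illustrative local database

schema = "CREATE TABLE positions (ticker TEXT, quantity REAL, market_value REAL, as_of DATE)"
question = "What are my five largest positions by market value?"

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": f"Write one SQLite query for this schema and return only SQL:\n{schema}"},
        {"role": "user", "content": question},
    ],
)
sql = resp.choices[0].message.content.strip()
# In practice you'd strip code fences and validate the SQL before executing it.
rows = conn.execute(sql).fetchall()  # only these rows go back into the copilot's context
print(rows)
```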

There are other workflows too: it tells you when it's grabbing data from the dashboard, and if you add a PDF, it gives you a citation for where the data came from. So there are a lot of workflows.

But having said that, one of the things we allow users to do is bring their own copilot. Let's say you have your own large language model that has been fine-tuned or is doing RAG within your firm. You can integrate that into our product the same way you can bring your own data. So you can see this can be really powerful, because we provide this end-to-end AI, in a way.

And so this is what it looks like. The purpose of today's talk is basically to show that we built this agent workflow that is open source, and to show you how you can bring it into this AI financial terminal.

Customization and Predictions

And the use cases are many: generating forecasts for complex scenarios, uncovering potential trading signals, alerting on vulnerable positions, automating data gathering for regulatory compliance, and so on. So bring-your-own-copilot is actually the critical part here.

Between Christmas and New Year's Eve, one of my predictions was that custom large language models at firms were going to be a topic discussed a lot. It was interesting at the time, because it was a bet on what I thought the state of the world was going to be.

And today we have a lot of conversations with banks and big financial firms, and it is actually a must: they can't simply use a product that is built on top of OpenAI. A lot of them are building on top of open-source models, or making sure that their data doesn't leak out, because it's really important that their data is kept secure and doesn't go to a vendor they aren't allowed to share it with.

So that was one of the predictions we made, and that's why we built the architecture early on to support that use case. Today, if you go into the OpenBB Copilot window, you can add your own custom copilot. This is what it looks like: you basically provide an endpoint, and whenever you write a message, it hits that endpoint and you get the reply back.
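To make that concrete, here's a hedged sketch of what such an endpoint could look like; the route, payload shape, and streaming format are assumptions for illustration, so check the open-source custom-copilot example for the exact protocol:

```python
# Sketch: an HTTP endpoint the terminal calls on every user message, which streams
# a reply back. Built with FastAPI; schema and SSE details are illustrative.
from fastapi import FastAPI, Request
from fastapi.responses import StreamingResponse

app = FastAPI()

@app.post("/v1/query")  # assumed path; this URL is what you register in the Copilot window
async def query(request: Request) -> StreamingResponse:
    payload = await request.json()                      # conversation, widget data, etc.
    last_message = payload["messages"][-1]["content"]   # assumed payload shape

    async def stream():
        # Here you would call your own fine-tuned model or RAG pipeline.
        reply = f"You asked: {last_message}"
        for token in reply.split():
            yield f"data: {token} \n\n"                 # SSE-style chunks (format assumed)

    return StreamingResponse(stream(), media_type="text/event-stream")
```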

And we made that architecture open source as well. Basically, that means, and I think I have a better image for this, that the AI financial terminal is joined with a custom copilot. And here it is.

Recap and Conclusion

So let's recap where we are.

We have the platform here, which is fully open source, with tools for all the endpoints we have. We have the entire agent architecture, which we've made open source and which you can actually use; it's literally two or three lines of code, and you can just use it directly.
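For reference, a hedged sketch of those few lines, assuming the open-source agent package exposes an openbb_agent entry point as its README describes (check the repo for the current import path):

```python
# Sketch: using the open-source OpenBB agent directly.
# Assumes API keys (OpenAI, OpenBB PAT) are configured in the environment.
from openbb_agents.agent import openbb_agent  # import path per the project's README

answer = openbb_agent("What are Tesla's peers, and which one has the highest market cap?")
print(answer)
```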

You just need your OpenBB personal access token, which leverages a ton of different API keys from these data vendors. And we made this open source as well. So this is the data layer, the API layer, that integrates into our financial terminal.
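A quick sketch of that, with the token itself elided:

```python
# Sketch: logging in with an OpenBB Hub personal access token so the data-vendor
# API keys stored in your Hub account are used by the platform.
from openbb import obb

obb.account.login(pat="<YOUR_OPENBB_PAT>")  # token managed on the OpenBB Hub
```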

And so it enables users to bring something like this, or any kind of open-source model, or, if you want, Anthropic or OpenAI; you can bring anything, because we made the framework for integrating copilots into our product open source. And so what is the result? This is the result.

And so this kind of concludes the discussion. What you're seeing here is that on this side I have the financial terminal as is. I created a few tabs above, called overview, financials, and technical analysis. These are data widgets that are immediately available from OpenBB, but I could also have something connected to my Snowflake data warehouse or to a backend I had built.

And here, what you are seeing is the OpenBB agent copilot. This is the open-source agent that I built and presented at the Open Core Summit talk that I showed you earlier. What I can do here is ask it: check who Tesla's peers are; from those, check which one has the highest market cap; then, for the ticker with the highest market cap, get the most recent analyst price target estimate and tell me who made it and when the estimate was made. And you can see that the copilot is able to give that answer.

And that's not because the data is already here, unlike our most generic use case, but because it's hitting the platform, which is fully open source, and pulling data from the platform's data vendors. The conclusion is that we built all this technology, which is open source, and now we provide the framework for users to bring it into a very state-of-the-art financial terminal that you can fully integrate into your research workflow. And yeah, I think that's it. Thank you.

Invitation to Try OpenBB

And so if you want to try the financial terminal and the integration, you can go to pro.openbb.co and register for free; there's a 21-day trial. If you want more time, just email me at dda.lobsch at openbbfinance or message me on LinkedIn or Twitter, and I'm happy to provide you with more time or onboard you and your firm. You can also find more information on our website. Otherwise, I recommend checking out all the open-source code, contributing to it, or taking it apart and using it for your own use case. I'll be happy to help you get set up and started with it. Sounds good? Thank you.
