How to build a private chatbot - Sakshi Pandey

Introduction

So the real question is: how do financial institutions actually leverage and benefit from the capabilities of generative AI while making sure that their data stays secure? That is the solution I'm going to be talking about today: Private GPT.

This is a solution we developed at Anote, and it lets users upload their documents into our platform. After uploading a document, users can chat with it directly in the system: they can ask questions and receive answers drawn from the document itself.

Architecture and Privacy Mechanism

Now, how is it private? Let's dive deeper into the architecture and what actually makes it private. The privacy mechanism comes from the structure we have built here.

Secure Data Management

Once you upload your documents, we store them in a MySQL database, which is known for its security and retrieval capabilities. To analyze the data, we use large language models such as Llama 2 or GPT4All, which are also stored on your local machine. That ensures no data ever leaves the system and everything runs on your own machine, which is what makes it secure.

We're also using FAISS, which indexes the high-dimensional vector embeddings; it is used primarily for the speed and the accuracy of the results you get while chatting with the system.
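To make the architecture concrete, here is a minimal sketch of this kind of local pipeline, assuming sentence-transformers for the embeddings and the GPT4All Python bindings for the local model; the model names, chunking, and prompt are illustrative, not Anote's actual configuration.

```python
# Minimal local RAG sketch: embed document chunks locally, index them in
# FAISS, and answer questions without data leaving the machine.
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer
from gpt4all import GPT4All

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # runs locally
llm = GPT4All("orca-mini-3b-gguf2-q4_0.gguf")       # local GPT4All weights

chunks = ["...chunk 1 of the uploaded document...",
          "...chunk 2 of the uploaded document..."]

# Build a FAISS index over the chunk embeddings.
embeddings = embedder.encode(chunks, normalize_embeddings=True)
index = faiss.IndexFlatIP(embeddings.shape[1])  # inner product ~= cosine
index.add(np.asarray(embeddings, dtype="float32"))

def answer(question: str, k: int = 2) -> str:
    # Retrieve the k most similar chunks, then prompt the local model.
    q = embedder.encode([question], normalize_embeddings=True)
    _, ids = index.search(np.asarray(q, dtype="float32"), k)
    context = "\n".join(chunks[i] for i in ids[0])
    return llm.generate(f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")
```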

Task Types and Benefits

So let's discuss the task types: how is it beneficial, and how is it differentiated?

Key Features of Private GPT

With Private GPT, you primarily get three things.

First, you can simply upload your documents. Financial institutions can upload their internal documents into the system and ask it questions based on those documents. Financial analysts spend a lot of time retrieving the right information out of documents; with this, they no longer have to.

Second, using the SEC's EDGAR API, we've made it possible for financial analysts to get the documents they use on a real-time basis: all the company filings and tickers are retrieved in real time through this API connection, as sketched below.
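As a rough sketch of what pulling filings from EDGAR can look like: the submissions endpoint and the required User-Agent header below are part of the SEC's public API, while the helper function itself is illustrative.

```python
# Fetch a company's recent filings from the SEC's public EDGAR API.
import requests

HEADERS = {"User-Agent": "example@example.com"}  # EDGAR requires a contact UA

def recent_filings(cik: str, form_type: str = "10-K"):
    # CIKs are zero-padded to 10 digits in the submissions endpoint.
    url = f"https://data.sec.gov/submissions/CIK{int(cik):010d}.json"
    recent = requests.get(url, headers=HEADERS, timeout=30).json()["filings"]["recent"]
    return [
        (form, date, accession)
        for form, date, accession in zip(
            recent["form"], recent["filingDate"], recent["accessionNumber"]
        )
        if form == form_type
    ]

print(recent_filings("320193"))  # Apple Inc.
```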

The third thing I want to talk about is MySQL. Say a company keeps its financial documents in a MySQL database: it can integrate that database directly with Private GPT and retrieve the information simply by chatting with the system, which pulls the data out of MySQL. A minimal sketch follows.
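Here is a minimal sketch of that kind of MySQL integration, using the mysql-connector-python driver; the host, credentials, and the `financial_documents` table and its columns are hypothetical.

```python
# Pull document text straight from an existing MySQL table.
import mysql.connector

conn = mysql.connector.connect(
    host="localhost", user="analyst", password="...", database="filings"
)
cursor = conn.cursor()
cursor.execute("SELECT title, body FROM financial_documents")

# Each row becomes a document to chunk, embed, and add to the same
# local vector index used for uploaded files.
docs = [f"{title}\n{body}" for title, body in cursor.fetchall()]

cursor.close()
conn.close()
```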

Differentiators from Existing Solutions

Now let's discuss what makes Private GPT different from the existing solutions. The first and biggest thing is that we let users pick the models that suit them. It could be GPT4All, it could be Llama 2; we have both in our system, and the choice is customized to each user's requirements.

The second biggest factor is that we use fine-tuned models, meaning models catered specifically to the financial services industry. That is essential because, as we all know, the financial services industry is peculiar: security matters most to finance, especially because all the decisions it makes are primarily driven by data. That is why fine-tuned models are one of our biggest differentiators.

Third, chat histories are always retained. Users can go back to an existing chat, build on their earlier queries, and generate more insights. That is another differentiator.

Last, it displays citations. Say a user is chatting with a document but wants to make sure an answer is really true: they can look at the citation, go back to the source, and verify it. This makes the system transparent and trustworthy. Those are the four key differentiators.

Importance of Private GPT for the Financial Sector

Now, why is it important? As I mentioned at the start, financial analysts spend a lot of time on these documents to retrieve exactly the information they need, and they have to do it every single day. That is time they could be spending on other work.

So the biggest advantage is improving their efficiency.

The second biggest advantage is that an AI-powered chatbot specifically catered to the financial industry matters a great deal: as I mentioned, every decision the finance industry makes relies primarily on data, and it makes those decisions every single day. So this is one of the greatest advantages.

And third, because the underlying machine learning model is fine-tuned, there is a real possibility of reducing errors as well as hallucinations, as we call them in the AI landscape. This, I would say, is one of the most important factors differentiating us, and it makes a private GPT essential for the financial services industry.

Product Impact and Demo Invitation

We really do feel this is a revolutionary product, considering the concern the financial industry places on the security of its data. It wants to leverage generative AI capabilities, but privacy is its biggest concern, and that is the problem we believe this product solves.

Now, I'd like to invite our co-founder and CEO to describe the product in more detail with a real-time demo. Thanks so much for having me. I really appreciate it.

Company Background and Private GPT Overview

This is a spin-out from our company, Anote AI. Essentially, we help enterprises implement large language models.

I'll show you a quick demo of how this product actually works in practice. You just go to the Private GPT website, and you can ask questions on a variety of documents you upload.

In this example, we have a few documents; some are English and others are Japanese. You can ask questions on them and get answers, and you can see that answers from the English documents cite English sources, while answers from the Japanese documents cite Japanese sources. You can expand a source to see the exact chunk of text.

As for how it works: this is the public version, but we also have a private version, a desktop app you can download on your own macOS or Windows device. It's a DMG file, and all the models run locally. All uploaded data is stored locally, which means it is never exposed publicly.

Now, to think about how to make a product like this good: right now, a lot of these models, when you ask questions on these documents, to Sakshi's point, get a lot of the questions wrong. Our team has done a lot of work benchmarking Q&A models such as GPT-4, Claude, Llama 2, and GPT4All, as well as Mistral, on a bunch of financial datasets like FinanceBench from Patronus and the RAG Instruct benchmark tester from LLMWare. What you'll find is that the answers these models give are almost always wrong initially; they get 5% to 10% accuracy.

The text they retrieve is wrong. The chunk they retrieve is wrong. Sometimes, when you want to extract specific information, the model says it doesn't know, or it rambles on. So the real question is: how do you ensure that the answers to these questions, in a privacy-preserving way, are actually not awful? They won't be perfect, but they should be at least somewhat right.
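To give a sense of what that benchmarking looks like, here is a rough sketch of a Q&A benchmark loop, assuming the dataset is a list of labeled (question, context, answer) triples, as in FinanceBench or LLMWare's RAG Instruct tester; the fuzzy string match stands in for the real scoring.

```python
# Score each model's answers against labeled gold answers.
from difflib import SequenceMatcher

def fuzzy_match(predicted: str, gold: str, threshold: float = 0.8) -> bool:
    # Loose string-similarity check in place of exact matching.
    return SequenceMatcher(None, predicted.lower(), gold.lower()).ratio() >= threshold

def benchmark(model_fn, dataset) -> float:
    # model_fn(question, context) -> predicted answer string.
    correct = sum(
        fuzzy_match(model_fn(question, context), gold)
        for question, context, gold in dataset
    )
    return correct / len(dataset)
```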

Fine-Tuning and Model Optimization

So essentially, we've done a lot of work on fine-tuning, specifically parameter-efficient fine-tuning with LoRA and QLoRA, where rather than retraining an entire model, you fine-tune a large language model on your own company-specific data. And we've been able to do that in a way that isn't super slow. So with a few adjustments to the model's parameters, you can go from that 5% to 10% accuracy to 30% to 35% accuracy.
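As a rough illustration of parameter-efficient fine-tuning, here is a minimal LoRA setup with Hugging Face's PEFT library; the base model and hyperparameters are illustrative, not our exact configuration.

```python
# Attach small LoRA adapters to a base model instead of retraining it.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

# LoRA trains low-rank adapter matrices on a few attention projections.
config = LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # typically under 1% of the base weights
# ...then train with transformers.Trainer on company-specific Q&A pairs...
```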

But what we've actually found is that a lot of the time, even if you optimize performance on the answers, the answer the model gives is still often wrong because it pulls the answer from the wrong chunk at retrieval. So we've done a lot of work benchmarking and improving RAG with techniques like FLARE as well as HyDE, to find the right chunk of text during retrieval and to judge whether the text coming from a chunk is better. That way, if your citation comes from the right chunk, you can hopefully get a better answer as well.
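As one concrete example, here is a bare-bones sketch of HyDE (Hypothetical Document Embeddings), reusing the `llm`, `embedder`, `index`, and `chunks` objects from the earlier local-pipeline sketch: instead of embedding the raw question, you embed a model-drafted hypothetical answer and retrieve near that.

```python
# HyDE-style retrieval: embed a hypothetical answer, not the question.
import numpy as np

def hyde_retrieve(question: str, k: int = 3):
    # Draft a passage that *would* answer the question.
    hypothetical = llm.generate(
        f"Write a short passage that would answer: {question}"
    )
    # Retrieve the chunks closest to that hypothetical passage.
    q = embedder.encode([hypothetical], normalize_embeddings=True)
    _, ids = index.search(np.asarray(q, dtype="float32"), k)
    return [chunks[i] for i in ids[0]]
```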

And one thing that's important to note with all this is that right now, the state of the art in the industry is basically very subjective. The way people evaluate these models is mainly human eval, where someone goes through the outputs and says, hey, this one's right, this one's wrong.

Then they just make their prompts bigger or smaller, or change some of the prompt text.

What's nice about the datasets we benchmarked with, FinanceBench and RAG Instruct, is that they are labeled. So you can do things like cosine similarity, or ROUGE or BLEU scores, to get some sort of string match, as well as chunk matching, so your evaluation rests on labeled, standard data.
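For illustration, here is a small scoring helper along those lines, assuming the `rouge-score` package and the local `embedder` from the earlier sketch.

```python
# Score a predicted answer against a labeled gold answer.
from rouge_score import rouge_scorer
import numpy as np

scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)

def score(predicted: str, gold: str) -> dict:
    # Cosine similarity of normalized embeddings plus ROUGE-L F1.
    a, b = embedder.encode([predicted, gold], normalize_embeddings=True)
    return {
        "cosine": float(np.dot(a, b)),
        "rougeL": scorer.score(gold, predicted)["rougeL"].fmeasure,
    }
```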

But in the real world, the bulk of the data you're evaluating against is completely unstructured. So it's really difficult to even evaluate whether your answers or the chunks you're retrieving are correct, because short of using LLM-based evals, the only way to go through it is to have a human actually do so.

Software Development Kit and Future Directions

So we've been working on building a software development kit that lets you upload documents, chat, and evaluate with a single API call, abstracting a lot of the model evaluation behind the scenes. It is, I'd say, still a work in progress.
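As a purely hypothetical illustration of what calling such an SDK could look like: every name below, including the package, the client class, and its methods, is invented for this sketch and is not the actual SDK.

```python
# Hypothetical sketch only: names are invented to illustrate the
# upload / chat / evaluate flow, not the real SDK surface.
from privategpt_sdk import PrivateGPTClient  # hypothetical package

client = PrivateGPTClient(api_key="...")

doc = client.upload("q3_earnings_report.pdf")            # upload a document
reply = client.chat(doc.id, "What was Q3 operating margin?")
print(reply.answer, reply.citations)

# Evaluation against a labeled Q&A set, abstracted behind one call.
report = client.evaluate(doc.id, dataset="financebench_sample.jsonl")
print(report.accuracy)
```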

But the TL;DR of what's nice about this: rather than having to go through all your documents yourself, you can just chat with them. You can do it privately, you can get the answers as text, and the answers are hopefully a little better given the fine-tuning and RAG.

But thank you, Sakshi. Yeah.
