Working with data you shouldn't share on ChatGPT

Introduction

We work a lot with financial institutions, so we handle fairly sensitive data. That's why we sometimes recommend not sharing just anything on ChatGPT.

It's a bit like, how do you say that, the early Facebook era, when you shared whatever private information you had, and five years later, when you tried to get a job, you realized: yeah, maybe I shouldn't have shared that picture or that comment.

And I think ChatGPT is sometimes that on steroids, because it's not just private data, it's company data as well. Sometimes you need a safe environment, and you should think about what you share and what you don't.

All right, enough on that. I will show you a bit why it

Why APIs and Hosting Choices Matter

matters, or why it's relevant. So we work a lot with APIs. What does that mean compared with ChatGPT? ChatGPT is a platform: a lot is already built in for you.

It just works. You type something in, you do reverse prompting, and it works perfectly, right?

If you work with the models directly, via the APIs, it's a bit different. And why would you do that?

Creators vs. Hosts and Data Visibility

The reason is that sometimes it really matters who hosts the product. Why is that relevant? The creator builds the model, and the host is who makes it accessible. One has the brain, the other gives you access to the brain.

So if you use LLMs, you can always think of it like that. You have a creator, which can be a company like OpenAI, DeepSeek, Google, or Anthropic. They create the models: they train them, and mostly they host them themselves as well.

The consequence is that the host normally sees your input and your output data. To put that into practice: DeepSeek can be a great model to use,

Self-Hosting vs. Using Apps

and you can host it yourself, as a person or as a company, and then it's quite safe. But if you use the app, you will most definitely share your information with China, with the hosting provider there. So that's why you shouldn't share everything there, or you should be more aware of what

you share, exactly. Yeah, I don't want to get too technical, but this gets technical very fast, so if you don't understand something, just raise your hand. Was this

clear so far? Good. All right, perfect. So now we're in a situation where we

Handling Sensitive Data Safely

think: okay, we shouldn't share this. Maybe it's client data, maybe it's a client's revenue forecast, or it's a contract that's still confidential, how do you say that, one you can't share with the public yet, or that isn't disclosed yet. That's very

Consumer Chat vs. Enterprise Settings

sensitive, right? And how do these models usually work? If you use ChatGPT as a private person, it works like this: you type something in, and in the end OpenAI is a data company. It trains models and it gathers data. So if you don't use

an enterprise or paid subscription, you are usually in this data-gathering mode by default: they may use your data and retrain models on it. That doesn't mean your data is actually being used; it just means it's there, and it could be used.

When Not to Share

So for sensitive things, or for very personal things, I wouldn't share them there.

From ChatGPT to APIs: What Changes

All right, so that was a long introduction. Now let's talk about why it matters and what's different.

Working With the "Pure Brain" via API

So if you don't use these fancier tools, which are quite a black box, you go with the APIs. You work with the pure brain,

and via the API that is basically a GPT-5. We have a DeepSeek here as well.

So this is a platform we built to train, test, and learn. We work mostly with financial clients in Switzerland, so we build agents and environments for them to use

Data Residency and Model Choices

it in a safe way. You see a bit of red and a bit of green: red means data goes to third parties, green means it stays in Europe, and green-green means it stays in Switzerland. But as you see, we have quite a lot of different models.

Live Demo: Models, Freshness, and Retrieval

And what I want to show you a bit is what happens if you use APIs instead of ChatGPT, et cetera.

All right, so I have a GPT-5, I have a Qwen. Any other model of choice you want to try? We have Sonnet from Anthropic, for example, or a Gemini. Any ideas, any wishes?

The last one is yours. Gemini or Sonnet. You have to battle it out. Gemini, okay.

All right, so we have GPT; we have Qwen, which is from Alibaba, a Chinese open-source model; and we have Gemini, which is from Google.

And what I always find fascinating is letting them play against each other a bit. I guess we'll test it. Normally people have a model of choice, a model they like, and I tried to make it fun. It's not as fun as Suno, but it's as much fun as data privacy can be.

Simple Question, Different Answers

So I will ask, and I tested this today: who is the president of the United States?

And you can ask that simultaneously on ChatGPT if you want, but it shows you the tricky part as soon as you start working with APIs.

Search vs. Training Data

Because if you use ChatGPT, it does a direct search, right? It goes onto the internet and finds the source.

ChatGPT two years ago was not that great: it didn't do the search yet, so it went into its training data. And as soon as you work with the API, so as soon as you don't have the full platform, you usually work with outdated data. This is the information the model had when it was trained. And as you see, it's

gotten better: at least they now mention what their cutoff date is. So it's June.

Do you see it, or do I need to make it bigger? It's good. All right, as you see, they're quite so...

Qwen's cutoff was June, GPT-5 was updated in October 2024, and Gemini, I'm not entirely sure.

The Staleness Problem

All right, so that leaves you with a bit of a conundrum. You can take it or you can leave it, but how do we make that better now?

Feeding Context: System Prompts and Libraries

And you always make it better by feeding in more information. So we work a lot with data libraries and company data.

So it's not great if you just want to ask who is the president, but, yeah, we try to make it better.

And I prepared this a bit for you. I already prepared it, so we basically just go through it.

Building an Agent

So I created an agent, an assistant, a GPT,

and what I basically did was give it the information, right? The model has outdated information, so how do you make it smarter? This is a quite

simple, almost stupid example, but you could do the same for Swiss tax law or for certain contracts; just think about it. And here is how we do it. This is a GPT builder,

basically: you can give it a name and you can choose the intelligence, the brain behind it. Any ideas which brain you want to choose?

You could try a Qwen, for example. It depends a bit, right?

If you say, hey, this is super sensitive information, then I would choose the Swiss model; otherwise I would probably choose a European-hosted one.

I'll choose a Qwen. And this part I just copy-pasted from ChatGPT: I basically did the research there and asked for anything that happened since Trump got elected. And I put that in the system prompt.

The system prompt is basically nothing more than this: on ChatGPT you have your chat window, your prompt window, and that's how you instruct and communicate with the model. The system prompt is what goes before that.

So what basically happens now is that this information is sent to the model together with your input, everything is processed, and then the answer comes out.
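To make that "the system prompt goes before your input" point concrete, here is a minimal sketch of an OpenAI-style chat payload. The model name is a placeholder and the injected fact is just the Trump-election research mentioned above; none of this is the platform's actual configuration.

```python
# Sketch: a system prompt is sent to the model ahead of the user's input
# in an OpenAI-style chat-completions payload. Model name is a placeholder.

FRESH_FACTS = "Research note: Donald Trump won the US election and has since taken office."

def build_payload(user_input: str, system_prompt: str, model: str = "qwen-placeholder") -> dict:
    """Assemble the request body: the system prompt always precedes the user turn."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},  # instructions + injected knowledge
            {"role": "user", "content": user_input},       # what you type in the chat window
        ],
    }

payload = build_payload("Who is the president of the United States?", FRESH_FACTS)
print(payload["messages"][0]["role"])  # the system turn comes first
```

The model never "knows" the system prompt is special on your side; it simply arrives as the first message, before whatever you typed.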

Prompt vs. RAG Library

We work quite a lot with this, and you always have two possibilities: either you put it in here, in the system prompt, or you put it in a data library. I can show you a bit what the difference is.

Our learning is: if it's less than 100,000 tokens, I put it in here, because it's more accurate.

And if it gets bigger, if the libraries are 600, 1,000, 10,000 pages, you normally have to put it in a RAG library.

So you create a data library, and then the system works a bit differently.
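That rule of thumb can be sketched as a tiny decision helper. The 100,000-token threshold is the one from above; the four-characters-per-token estimate is a common rough approximation, not a real tokenizer.

```python
def rough_token_count(text: str) -> int:
    """Very rough estimate: ~4 characters per token for English text."""
    return max(1, len(text) // 4)

def choose_strategy(knowledge: str, threshold: int = 100_000) -> str:
    """Below the threshold, inline the knowledge in the system prompt;
    above it, index it in a RAG library instead."""
    return "system_prompt" if rough_token_count(knowledge) <= threshold else "rag_library"

print(choose_strategy("a short policy document"))  # small -> system_prompt
print(choose_strategy("x" * 1_000_000))           # huge  -> rag_library
```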

So for this one, I just threw everything in a system prompt.

I gave it a nice picture, we saved it, and, for the fun of it, I created another one with a RAG library.

It's basically the same, quite stupid; I can give it the same model, we use

a Qwen as well, but I put the data in a RAG library. So here it's not in the prompt itself; it's in a text document.

I can quickly show it to you maybe.

So you basically have the same information, but in a library, and you will see in a bit how the result changes.

Running the Test

All right, so we have our new agents, and we ask the same question again. I choose the president agent and the president RAG agent: who is the president of the United States? I need your input, then it goes a bit faster.

And it's quite interesting what you see now. The president agent was faster than the president RAG agent, basically. That's because it doesn't do the search.

How RAG Selects Knowledge Chunks

So what happens here is that it goes into the library and looks for the best knowledge chunk for this question.

A chunk, and I'm using a lot of jargon here, think of it like a librarian, but a librarian who can only carry three pages per book.

So what it does is search for ten times three pages; she can't carry more from any one book. And it tries to find the most relevant three pages in the whole library system.

So from all the books there are, from all the information you have, it always extracts just the three pages that match the question best.

And that is the relevance score you see here. Based on that, it answers the question. So you see, this one had quite a high relevance,

and the next ones not so much. Sometimes it's good, sometimes it's bad, but this is what makes the answer better or worse.
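The librarian picture can be sketched in a few lines: split every document into small chunks, score each chunk against the question, and keep only the top matches together with their relevance scores. Real RAG systems score with embeddings rather than the word overlap used here; this toy version only shows the mechanics.

```python
def chunk(text: str, size: int = 12) -> list[str]:
    """Split a document into chunks of roughly `size` words ('three pages per book')."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def relevance(question: str, passage: str) -> float:
    """Toy relevance score: fraction of question words appearing in the chunk."""
    q = set(question.lower().split())
    return len(q & set(passage.lower().split())) / len(q)

def retrieve(question: str, documents: list[str], k: int = 3) -> list[tuple[float, str]]:
    """Return the k chunks with the highest relevance score across the whole library."""
    chunks = [c for doc in documents for c in chunk(doc)]
    scored = sorted(((relevance(question, c), c) for c in chunks), reverse=True)
    return scored[:k]

library = [
    "The president of the United States is elected every four years.",
    "Swiss tax law differs from canton to canton and changes regularly.",
]
top = retrieve("who is the president of the united states", library, k=2)
score, passage = top[0]
print(round(score, 2), "->", passage)  # the president document scores highest
```

Only the top-scoring chunks, not the whole library, would then be pasted into the model's context, which is exactly why a noisy library produces noisy answers.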

And if you just throw it in the system prompt... All right, this one it didn't get, huh? All right, good.

Observations and Variance Across Models

So it's interesting that the other model was right. All right, so the only winner we have right now is the RAG system, which is interesting.

I haven't tested it on a Qwen. I can change that quickly; maybe it gets better then.

But this is how it is when you run these models: sometimes you get surprises. As mentioned, it's quite generative, and you don't always get the same answers when you test it.

Let's try this again with the other model. All right, I will just explain in the meantime.

All right, now it gets it right.

So depending on the model, it got worse when we switched. That's a bit sad, but usually it gets better.

But with the open-source OSS model, it still works.

Best Practices for Sensitive Use Cases

And long story short, what I'm trying to tell you is: if you work with sensitive data, the challenge is that you can't always use the best-performing models. So what you try to do is make them as good as

Clear Instructions and Curated Knowledge

possible for the case at hand. And the way you do that is through clear instructions. As we did before, we fed the information into the system prompt, or we give it the information in the knowledge library.

That is basically the best way to feed it with knowledge, because otherwise it will fall back on its training data, and that is not

always right, especially if you work with tax law, with lawyers, and so on.

Quality In, Quality Out

It really depends on how good your knowledge is. If that's not good, the result will just be shit in, shit out, unfortunately.

Q&A: Hosting, Privacy, and Libraries

Good. I think I showed most of it. Any questions on that one?

I just flooded you with terminology and technical details. Yes? What are the ..

MARTIN SPLITTMANN - Yes.

Who Hosts the Models

We work with Peak Privacy. It's a Swiss player.

And what they basically do is host most of the models. So if you think of AI:

you have probably ten leading providers, give or take, that you can choose from, and there are always different access

points. You can just log in via the API, so basically OpenAI hosts the model and you retrieve it from them, but the processing and the hosting happen at their place. Or you can try to host your own models, if you want to play around with that.

Ollama is a good tool for that: you can download smaller models onto your laptop and just play with them.

What we do is host it ourselves, in Switzerland or Finland, and then give our clients access in that sense.

Privacy Guarantees and Black-Box Processing

What are the privacy guarantees?

I guess one of the things is where it's hosted and processed. We don't see the data: the way we built it, what you

write to the model is a total black box for us. We provide the models, but we don't access your input, your output, or your saved data; we have no access to what happens in there. And I think it makes sense like

this. But, I mean, OpenAI to some extent: if you lose 13 billion a year, you have to get something in return, and if you don't pay for it, mostly that something is data. In the private mode, I think that's the case. But this is more on the

Curating High-Quality RAG Sources

sensitive side. If you want: how we usually do it is we build the library per case. If we have a client who says, hey, I need the canton of Zurich tax law, then we try to build the knowledge library for that specific case.

And if you want high performance, you have to be very critical about which data sources get pulled in. The issue is, and you saw it with the RAG, that it just pulls the best matches to your question, but it can't really judge.

The classic problem with SharePoint is that you always have a "final", a "final final", a "final version 3", and a "this is definitely the final version". But an LLM can't really decide which is the best document or which is the most accurate one.

So you have noise in the system, right? It might then choose an outdated tax law, when you just want the cleanest version possible.
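A minimal sketch of that curation step: keep only the newest revision of each document before anything gets indexed. The filenames and the `_v<number>` naming scheme here are invented for illustration, not a real client library.

```python
import re

def latest_versions(filenames: list[str]) -> list[str]:
    """Keep only the highest-numbered revision per base name, so the RAG
    index never sees 'final' sitting next to 'final_v3'. Naming scheme is hypothetical."""
    best: dict[str, tuple[int, str]] = {}
    for name in filenames:
        m = re.match(r"(.+?)(?:_v(\d+))?\.txt$", name)
        if not m:
            continue  # skip files that don't follow the scheme
        base, version = m.group(1), int(m.group(2) or 0)
        if base not in best or version > best[base][0]:
            best[base] = (version, name)
    return sorted(name for _, name in best.values())

files = ["tax_law_zh.txt", "tax_law_zh_v2.txt", "tax_law_zh_v3.txt", "contract.txt"]
print(latest_versions(files))  # ['contract.txt', 'tax_law_zh_v3.txt']
```

The point is that this filtering has to happen before indexing; the retriever itself will happily rank an outdated revision first if its wording matches the question better.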

So that's why, if the case makes sense, we usually build it from scratch: a very specific knowledge library for that case, to get the best performance out of RAG. Because otherwise you have a lot of input, but the LLM is not a wonder weapon yet: it can't decide which is the best source, it just sees the best match. So I think for companies,

Policy and Tooling Recommendations

I would always start with educating your employees, and maybe define some strategies and the tools you use.

And the paid versions mostly make sense from a privacy perspective, right? Because you can opt out of retraining and you can delete your data, so you can do a lot on that side.

Beware of Free Meeting Bots

I would be very careful there. One example I always think is hilarious was these meeting note-takers: everyone invited them, mostly a free version, hosted somewhere or other, and you basically sold your video, your voice, and your clients' video and voice data. It was hilarious.

And we had customers in very sensitive cases who just flooded us with those. So in that case, I would be a bit more serious about it and just be aware.

Think Before You Share

Would you want Google or Facebook to have this piece of data? If it's very sensitive to you and to your future, maybe be a bit more hesitant.

And it's quite difficult because these models are good, right?

Limits of Anonymization

So in Switzerland we have this famous Appenzeller cheese, whose recipe is supposed to be very secret, or think of the secret Coca-Cola recipe.

And then you can think: yeah, let's do anonymization and just delete "Appenzeller" and "cheese" from the recipe. But you're still providing the recipe.

And for anyone who knows you and knows where you work, it's easy to reconstruct.
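A naive redaction sketch makes the point; the "recipe" text and the keyword list are made up for illustration.

```python
def naive_redact(text: str, secrets: list[str]) -> str:
    """Blank out the named entities, but leave everything else untouched."""
    for word in secrets:
        text = text.replace(word, "[REDACTED]")
    return text

recipe = "Appenzeller cheese: age 4 months, wash the rind daily with the herbal brine."
safe = naive_redact(recipe, ["Appenzeller", "cheese"])
print(safe)                      # the names are gone...
print("herbal brine" in safe)    # ...but the actual process is still fully there
```

Stripping the obvious identifiers removes the label, not the secret: the process itself, combined with context about who sent it, is often enough to reconstruct what was hidden.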

So yeah, just be a bit aware of it. But yeah, they're too useful sometimes.

Conclusion

Perfect.

I think I'm in time. Thank you.
