Talkblue.eu: a privacy-friendly AI assistant

Introduction

Hello, everyone. My name is Ruud.

Setting Expectations

And before I start, a few disclaimers.

Up until 15 minutes ago, I wasn't aware that I was going to present something tonight. So please bear with me if I make some beginner's mistakes.

It's also the very first time I'm explaining what I've been working on over the last few weeks. And yeah, please bear with me if there are some silly mistakes in there.

Using AI in Everyday Life

Does the audio work? Yeah, it's all good, thank you.

Personal Interaction with AI

Then I would like to start by asking a few questions. Who here uses ChatGPT to analyze images, like sending a photo and asking, what is this? Okay, I see a few hands here.

Who here uses it to ask really silly things like, how does my dishwasher work? Or how does my washing machine work? Including a picture maybe. Yeah?

Then a slightly more sensitive question. You can ignore it if you want to, but who here uses it to ask health-related questions? I see a lot of hands.

I do as well. Every time I think something is wrong, I basically ask ChatGPT and it often reassures me. It's quite nice.

Then the next question, who uses it for work-related things? Like to draft a contract for school, for whatever. For programming. I'm a software engineer myself.

Privacy Concerns

Who here feels sometimes a little bit bad about how much ChatGPT or Claude or Gemini knows about you? Does anyone ever feel a little bit bad about this?

Recently, Surfshark, a company that is all about privacy on the internet, released a report saying that these AI assistants collect almost all of your data.

So everything that you ever put into it, it knows about you forever. It will store it on their systems. They can reuse it for whatever they want.

If it's a company in the US, the US government can request access to your conversations. All your data just freely flows from the EU to the US.

And this is something that I feel is maybe an underrated, or undervalued, privacy concern.

Addressing Privacy Issues

And I set out on a mission to kind of change that and to be able to have people use large language models without having to worry about privacy constraints so much.

Impact in Professional Settings

Imagine you work at, I don't know, the OECD.

When you're advising a government, you don't want to ask ChatGPT about a specific problem that Germany is facing, for example. You don't want to do that because you are literally sending your data to the US. It's probably not a very good thing.

Leveraging Open-Source Models

So I set out to change that and I tried to use open source models. There are some open source models available. And open source means that you can kind of freely use it to generate output.

Let's see how much it's... 4%?

I don't think this is a good idea to unplug it. So let's see if it works. Yeah, here we go.

So I'll go back here, just so I can point out a few things. And then I will give a quick demo, hopefully, if it works. It's a live demo and I didn't prepare it. So, you know, things always go wrong when you least expect it.

Building a Solution

Yeah, so I built a tool where you can use open source models: Qwen, DeepSeek, Mistral and Meta's Llama. And it's hosted completely locally here in France, actually in Paris. All the data that you send it, all the questions that you ask it, remain here in Paris.

And I went quite far with this. Even if you ask it questions like, what's the temperature in Paris? Or tell me something about Marie Curie.

It uses Qwant to search the internet. So it will not even use an American search engine. Everything is fully European in here.

And it doesn't send your data away. And yeah, you can use the open source models that are available.
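The idea that every model stays on European infrastructure can be illustrated with a small sketch. This is not how TalkBlue is actually built: the registry, the `ModelEndpoint` type, the `resolve` helper and the region labels are all hypothetical, and only the model families and the Paris location come from the talk.

```python
from dataclasses import dataclass

# Assumption: region labels are illustrative. The talk only says that
# the models and the data stay in Paris.
ALLOWED_REGIONS = frozenset({"paris"})


@dataclass(frozen=True)
class ModelEndpoint:
    name: str    # model family offered to the user
    region: str  # where the inference server runs (hypothetical label)


# Hypothetical registry of the model families named in the talk.
REGISTRY = {
    "mistral": ModelEndpoint("mistral", "paris"),
    "qwen": ModelEndpoint("qwen", "paris"),
    "deepseek-distill": ModelEndpoint("deepseek-distill", "paris"),
    "llama": ModelEndpoint("llama", "paris"),
}


def resolve(model, registry=REGISTRY, allowed=ALLOWED_REGIONS):
    """Return the endpoint for a model, refusing any that runs outside the allowed regions."""
    endpoint = registry.get(model)
    if endpoint is None:
        raise KeyError(f"unknown model: {model}")
    if endpoint.region not in allowed:
        raise ValueError(f"{model} is hosted in {endpoint.region}, which is not allowed")
    return endpoint
```

Checking the region at lookup time means a misconfigured endpoint fails loudly instead of quietly sending a conversation outside Europe.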

Current Challenges

Currently, there are a few challenges hidden here. The first is that European cloud infrastructure, especially for AI, is very new. It's new for everyone.

At my job, we use Google Gemini very frequently. And even they have weekend long outages very frequently. We've had like two in the last months. And for a company that demands 100% uptime from basically everyone else and from their own products, you can see how new AI is even for these huge companies that spend millions, billions on infrastructural costs.

The challenge here in Europe is that it's even newer here. Besides Mistral, I don't think there are any European LLMs. I think there's an initiative where some European researchers are trying to create some open source models, but I don't think it's quite at the level of the US yet.

Very recently, today, right now actually, the provider that I'm using, it's called Scaleway. It's here in Paris. They have an outage. So I don't know if everything will work when I do the quick demo.

So yeah, bear with me if there's like some small mistakes. I hope they will be able to fix their outage very soon.

Live Demonstration

Without further ado, let me just show you how it works.

As you can see, you can use Mistral. One of their models is open source, so you can ask it things.

You can use DeepSeek, but one of the distilled versions of DeepSeek, which means that they trained an open source model with their own open source model to make it cheaper and better, basically. And you can use Qwen.

And I can ask it, for example, who are you? Okay, so then it will give you a response about who TalkBlue is, like which model you're using, why it's important that everything is hosted in Europe, that data retention is absolutely minimal. I save absolutely nothing, no messages are saved.

One disclaimer, when it searches the internet, and it visits a lot of websites, I have to save which websites it visits. There's nothing I can do about it, otherwise people would abuse it and I might have to block certain websites and queries from happening.
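A minimal sketch of what logging only the visited websites could look like, assuming the log keeps nothing but the host name and whether the visit was allowed. The blocklist entry and the function names are hypothetical; the talk does not describe the actual implementation.

```python
from urllib.parse import urlparse

# Hypothetical blocklist; the talk does not say which sites are blocked.
BLOCKED_DOMAINS = frozenset({"blocked.example"})


def domain_of(url):
    """Return the lowercase host name of a URL, or an empty string if it has none."""
    return (urlparse(url).hostname or "").lower()


def is_blocked(url, blocked=BLOCKED_DOMAINS):
    """True if the URL's host is a blocked domain or a subdomain of one."""
    host = domain_of(url)
    return any(host == d or host.endswith("." + d) for d in blocked)


def record_visit(url, visit_log):
    """Log the host and the decision (never the user's message); return whether the fetch is allowed."""
    allowed = not is_blocked(url)
    visit_log.append((domain_of(url), allowed))
    return allowed
```

For example, `record_visit("https://en.wikipedia.org/wiki/Marie_Curie", log)` would append only `("en.wikipedia.org", True)`, so the log holds the site but not the page path or the question that led there.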

But yeah, you can ask it things. For example, I can start a new conversation and ask it, who was Marie Curie? And here is where it probably will start to break down, because this functionality broke down 30 minutes ago because of the provider, but it will hopefully start working again very soon.

Basically it searches the website and it would normally, and hopefully very soon later today, come up with a response. In its reasoning, you can also see what it's trying to do. You can see that it's trying to look for the information.

It already gave some information about Marie Curie, but it's trying to finish the search results, and sadly, that broke today, right before the presentation.

Future Prospects

For the future, I want to use better models. I want to use the full DeepSeek model, but it costs at least 3,000 euros per month to run. So it requires a bit of an investment up front, and I would prefer to have some users before I try to do that. But it's one of the main wishes.

User Feedback and Improvements

As you can see here, I ran a quick survey. It asked what you would actually want if you were to pay 20 euros per month, what you would expect of the model, basically. And almost everyone, including myself, said we want higher quality in the future.

So I think using the latest open source models will be one of the first next steps.

Vision for Europe

And yeah, I hope that at one point in Europe we can start competing. That's the vision for the future, that we can start competing with the big guys and not just on privacy and on security, but also when it comes to features and quality.

Conclusion

And yeah, I think that's the pitch in five to 10 minutes.