Generative AI for Fact-checking

Introduction

A spoofed news article and the ease of creating misinformation

...on the platform. In what's recorded now, I'm trying to show minimal technique and platform for how I've made this, but it's kind of inevitable. Basically, this is a fake article. This is made by Weird, which is a spoof of Wired, and it shows that dinosaurs were discovered in the Antarctic. It's got a UI that looks a lot like the Wired website, and there were options to tailor this to resemble different kinds of websites, with various nuances. They had terms and conditions that weren't the most robust, and basically this was spun up with one of the multiple app builders that exist on the market.

And if you read it, obviously I've given it a prompt that seems fairly ridiculous, fairly unbelievable. But some people might believe this, and that might be concerning.

You know, this is a made-up person. And this took literally like 10 seconds to make.

Initial fact-checking attempt

So I tried to use Replit Agent to spin up a little fact-checking tool. And I gave it the link to the source here. We also gave it an API key to see if that would help with the analysis.

And this was the initial result. Analyzing it a second time, I've given it the link to the website that I showed you just there.

And basically, the analysis isn't the most reliable.

Two approaches to verify the claim

So what we're going to end up doing is trying two different fact-checking techniques on this trial misinformation, so that you guys have some tools to equip you for this using generative AI.

Quick search and headline verification

So the first is just, obviously, to replicate this claim and plug it straight in. OK, there you go.

We discovered dinosaurs in the Antarctic.

The Newport Times is a website.

So it at least gives you that prank asset. It's helpful.

So go straight into Perplexity, on Search, and filter your search to academic and social. You don't need to use Deep Research for this. It's just to get an idea.

I should have asked to edit that. "Please fact check the following headline I've just seen."

Yeah, there we go. So it's not supported by anything.

Great, and hopefully we'll get that lots of the time.
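
The headline check described above can be sketched in a few lines. This is a minimal sketch only: the function name `build_fact_check_prompt` and the extra instruction about naming sources are assumptions for illustration, not part of the demo. It just builds the text you would paste into a search assistant.

```python
def build_fact_check_prompt(headline: str) -> str:
    """Build a fact-check request for a headline (hypothetical helper)."""
    # Collapse stray whitespace and line breaks copied from a web page.
    cleaned = " ".join(headline.split())
    if not cleaned:
        raise ValueError("headline must not be empty")
    return (
        "Please fact check the following headline I've just seen.\n"
        f'Headline: "{cleaned}"\n'
        # Assumption: asking for a verdict and sources makes the answer easier to check.
        "Say whether it is supported, contradicted, or unverified, "
        "and list the sources you relied on."
    )


prompt = build_fact_check_prompt("Dinosaurs  discovered in the Antarctic")
print(prompt)
```

Putting the instruction before the headline avoids the situation in the demo where only the bare claim was submitted.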

Using NotebookLM's Discovery feature

Another option is one of the features of NotebookLM we don't use huge amounts: the discovery feature. So up in here, before you even add a source, you can just type here and find sources from the web. So you're using NotebookLM as a search engine, along with many of its other uses.

So again, the prompt wasn't requesting fact checking. It was just the headline, but now I can import all of these sources, and some of them are debunking that. So that's something to work with as we try to navigate this terrain.

Q&A and discussion

Okay, I think we'll leave the demo section there and open up to questions. Does anybody have any questions?

Connecting tools and data sources

It's meant to be connected to other sources, but I've had trouble getting these applications to actually connect to other sources on the internet and verify things, with web scraping. So I was just basically asking Replit to build an app that would do that.

How NotebookLM fits the workflow

NotebookLM is a tool used as a kind of researcher's assistant. So I use it less for fact checking and more for research in general.

Live notebook example: Kinds of Intelligence

Actually, I could try showing one of the notebooks I'm working on at the moment.

I'm part of a reading group called the Kinds of Intelligence Reading Group.

Yeah, so you should be able to see the screen.

And just on LinkedIn over the last year and a half, I've been saving interesting papers in DMs by forwarding them to a friend. And I've just compiled them here.

Organizing and assessing papers

And one of the prompts I found helpful was: "Please order these papers by length in pages and note how formal/technical they are."

And these are just 11 different papers on LLMs and how you might measure the intelligence of LLMs against animals and how you might identify biases, how you might value them epistemically.
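
As a rough illustration of what that ordering prompt asks for, here is a minimal sketch in Python. Everything in it is assumed for illustration: the `Paper` fields, the 1 to 5 technicality scale, and the placeholder entries, which are not the real papers in the notebook.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    title: str
    pages: int
    technicality: int  # hypothetical scale: 1 (informal) to 5 (highly technical)


def order_papers(papers):
    """Longest first; ties broken by the more technical paper first."""
    return sorted(papers, key=lambda p: (p.pages, p.technicality), reverse=True)


# Placeholder entries only, standing in for a real reading list.
papers = [
    Paper("Placeholder paper A", pages=12, technicality=2),
    Paper("Placeholder paper B", pages=40, technicality=5),
    Paper("Placeholder paper C", pages=12, technicality=4),
]

ordered = order_papers(papers)
for p in ordered:
    print(f"{p.title}: {p.pages} pages, technicality {p.technicality}/5")
```

In the notebook itself the model estimates both the length and the technicality from the sources, so the numbers there are the model's judgment rather than fixed fields.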

Mind maps and other features

There's the mind map feature, which is really cool.

So it's going to take a sec to generate, but it's going to get there.

Those are the most lengthy and technical ones. I am going to hold off, though, because I do remember that I think Alpesh is going to present on aspects of NotebookLM as well.

So I'll just do the chat interface here. Maybe Alpesh can show you some of these other awesome features.

Daily workflow across tools

How do you best use NotebookLM, ChatGPT, and Claude in your daily workflow?

I like NotebookLM for analyzing content. I like ChatGPT with the web extension and Perplexity as search engines for finding content that you want to be reliable.

And Claude is great for building artifacts or visualizations of things. And then something like Replit is great for building apps.

Tools for hackathons and coding

I'm doing the NASA Space Apps Hackathon on a weekend for AI challenges. What are the best free LLMs our team can use on the hackathon?

I think most software developers' preference is Claude Code, but I think some people are enjoying Codex now. And as I say, you have some of these other AI assistant tools like Replit and Lovable. It's dealer's choice.

You really should play around with all of them and find which you like. One that some people are interested in is Gemini with... yes, the workflow again. Let me finish this question first.

One tool that some people are interested in is Gemini, because it has the largest context window. So you can get the equivalent of 5,000 pages' worth of text into Gemini in one go, which with the application Windsurf means you can put your whole library in there and basically look for bugs. It's less capable than Claude.

And actually now I think 4.5 has just topped the benchmarks. Yes, sorry. So best...

Sonnet 4.5 is supposed to be even better. Yes, sorry, I was just getting to that. I think I beat you to it by like one second. According to the latest benchmarks on SWE-bench, Claude Sonnet 4.5 is meant to be amazing. I think in another one of these sessions, maybe next month, it'd be really good to show some applications that are more impressive than the fact-checking application I've just demoed to you guys there.

On the workflow piece, I like Perplexity for searching for things. When you don't have a piece of information you already trust, Perplexity is useful for finding that piece of information.

Then once you've got it, NotebookLM is great for analyzing it, breaking it down, creating mind maps, creating queried questions. It's really good for fact checking because what you're querying is what's in the documents. It tells you where in the document the piece of information is that you need to fact check a thing.


We'll demo a nicer space apps on my terminal meeting. Amazing. That's absolutely amazing, Brett.

I'm excited to hear from you soon.

And that's right, Claude for building artifacts and apps is really, really good. And now 4.5, really, really good for coding.

Okay.

Reflections and cautions

With all that done, I think that was as messy a presentation as I've ever given, because I was trying to walk a difficult balance between not giving you tools for creating lots of fake news and showing the actual tool so that you all know it exists, and trying to give some impetus here.

Why not ChatGPT? Concerns about agentic features

I skipped ChatGPT. ChatGPT is not my tool of choice, partly because I don't really

There'll be people in MindStone who disagree with this, but I'm nervous about the whole agentic trend, which I think ChatGPT in particular is leading the charge on. They've got Pulse now, which is meant to work for you overnight, which if you've got a product in mind is great.

But now they've also got this integration with Stripe, where they're setting up commerce and market, where they're going to make purchases for you autonomously. I don't think anyone wants to be the guinea pig of people who are trusting ChatGPT to make purchasing decisions. I can sort of imagine the slow horror of watching it put your credit details in for something that you don't want.

So that's the kind of USP of ChatGPT in my mind right now: they've got these really cool integrations. For coding, for artifact building, for application building, I just prefer Claude. Claude's also been more forthcoming about research and safety papers. They've been more active in that space. So that's my preference, but you'll find lots of people here who disagree, and we can talk about it when networking,

Closing remarks and next steps

because that's what we're here for. We're here to talk about these things and we'll try to figure it out together.

Okay. Great questions. Thanks, guys.

Antonio and Jared, if you're ready to join the stage, I might have to...
