So my background is that I actually graduated in AI quite some time ago, which is kind of ridiculous, because back then an AI degree was a bunch of lectures, and at the end of it everyone went, "well, obviously all of that is theoretical." And now it's kind of a bunch of people going, "this is what we trained for. It's happening."
And it feels like more has happened in the last... year in the field of AI, particularly with large language models, than in the preceding 15 years. I mean, it's been staggering and really, really exciting as someone that studied it all that time ago to sit back and watch.
I'm just trying to think what else I can say by way of an introduction. I'm a geek.
I founded a company in the transport tech space, which I bootstrapped and exited last year. My passion has always been software development, and I'm particularly interested, as I say, in AI.
Last year I had to take a group of about 80 entrepreneurs on a retreat, and it involved a treasure hunt around a lake in Slovenia. So that was my challenge. Obviously a lot of people would just have printed out some question sheets, but I'm guessing a lot of people in this room would have taken it as an opportunity to over-engineer a full-stack solution, which is basically what I did. This was how I ended up getting started on what has become Treasure Tours.
Let me just move on.
That is me and that is what Treasure Tours currently looks like. It is live in beta at treasuretours.org.
What I am going to do tonight is give you a quick demo of where it's at, how I intended to use AI to enhance it, and then talk about some of the learnings that have come from that. So let's dive right in.
I've got it up here. Let me try and do this so you can still hear me as well. Hang on. Move this around slightly.
Okay. Right. Okay. Everyone can see that?
So here's the website. We'll go into a bit of techy stuff in a minute. I know we've got a mixed audience tonight, but it's essentially a marketplace for treasure hunts.
The idea is to allow anybody to create a treasure hunt, host the treasure hunt, and ultimately sell the treasure hunt. This is an example of one of those treasure hunts. So if you were using a mobile phone and you were in London's West End, this is what it would look like.
Essentially, it's a series of steps, as you'd expect, presented in the form of a bunch of text, like a sort of enhanced multimedia WhatsApp, really. You get asked, are you ready to go? You can choose a bunch of options. You can respond.
You get handed over to a virtual host for the treasure hunt. This is one that we did for Christmas. So it's Santa's elves. The challenge is to get around the West End in time because Santa's elves are missing and you need to be the replacement elf.
And then you get shown a series of maps to move around. It will show your location to help you find the next place. And then you get asked a series of multi-choice questions.
Are we back on? We're back on. So in this case, get to Leicester Square. Amongst the movie-mad statues in Leicester Square, which of these will you not find? Is it Harry Potter on a broomstick? Is it Indiana Jones with a whip? Is it Bugs Bunny with a carrot?
Or is it Beyonce with a hamburger? Did I mention this is going to be an interactive talk, by the way? Anyone care to give the right answer? It's Beyonce with a hamburger. That is absolutely right. Ten points to the very knowledgeable lady in the third row.
There we go. So that's what Treasure Hunt looks like. That's the web client, and we'll look at the tech behind that in a minute as well.
So behind the web client is an editor, something that allows you to actually create the treasure hunt. So if I go to Dashboard and click on Create New Tour... if you were a brand new user of the site, this is what you would do.
You would give the treasure hunt a name. Let's say we're going to do one around here, so Clerkenwell Hunt. You give it a location, so again, you can choose from a map. That'll do. That's exactly where we are, so we'll do that.
You can give it an image if you want. And off you go.
Now, when I first started the site, it became apparent that the process of setting these questions can be quite laborious. You've got to look around the area, you've got to find places for a question. So you would say, let's have a question down here, add a question here, and then you would go in and create one of these multi-choice questions. So let's say, for example (it's St John's Church we're in, I think): what is the name of the splendid church filled with AI enthusiasts every month? And then you would enter your choices here.
So is it St. John's? Is it St. Luke's? Is it St. David's? Or is it Happy Church? There we go. It's St. John's. You can add in various bits of multimedia. You can add in an image to illustrate that. And you move on to the next question.
That's the manual process. So after I'd created this solution, I thought there's got to be a way to harness the power of large language models to start automating what I would call the boring bits, or really the expensive bits if you're trying to roll something like this out commercially. So the four challenges I set myself were these. First of all, maybe GPT can help to theme these treasure hunts in some way. It's a Christmas hunt, we've done a whole bunch of stuff with elves, and that's going to go out of date fairly quickly. Wouldn't it be great if we could hit a button and re-theme that hunt to be about the Easter Bunny or something for families? Again, a very manual, time-consuming process, and you would think GPT would be very good at that kind of thing. Secondly, sourcing questions: can we get GPT to actually write the questions for us?
Thirdly, sourcing pictures: can we get it to illustrate those with less expensive, non-stock photography? And maybe also drafting out questions as well. So let's have a look. This is the architecture.
We've basically got a full-stack setup with a basic client-server architecture in the middle. The web client is React and TypeScript. The server is .NET Core. We're using MariaDB for storage.
And then I've flipped this upside down compared to the way you'd traditionally present it. So you've got a whole bunch of third-party services. We're using Stripe for payments, Cognito for auth. And right in the middle is the interesting bit.
We're using the OpenAI APIs for our AI generation, with GPT-4 and DALL·E 3 under the hood. Right, let's just figure out who wants the geeky stuff. Can we just have a quick show of hands?
Who would consider themselves technical, a coder, engineer, basically technical? All right, and flip that around. Can you put your hand up if you didn't have your hand up just now?
Okay, it's about 50-50, so we'll go for it, but we'll keep it to the point before your eyes glaze over. Okay, so I've integrated... Yeah, if you're not technical, just bear with us. We won't go deep on this, but I think it's probably quite interesting for the people that are.
Obviously, the back end is .NET Core. At the highest level, we've got an AI client. We're actually using a third-party NuGet library called OpenAI.chat, which I found was... Say again?
Yeah, of course I can. Hang on a minute. I think they're going to share the slides afterwards as well, so don't worry too much about being able to read every word.
But there you go. Yeah, so we're using OpenAI.chat as a third-party client for the OpenAI API.
I found that to be pretty usable, actually. It's in active development. It's getting updates pretty much every week, as far as I can see.
It implements all of the latest endpoints. They've just implemented the GPTs as well, as in the all-in-one ones that store the context for you. So that's actually a library I found that did a lot of the heavy lifting.
And on top of that, I built a basic client. And the way in which the client works... let's just show you here.
Essentially... let me find the right bit for you. There we go. You've got a method called AskOpenAI.
You pass it your prompt, and what it does is maintain a local collection of messages. Essentially, we're managing that context locally on the server side rather than at OpenAI. We're appending our latest user message to that, sending it off to the completion endpoint, and getting a message back from the model, which comes back with the assistant role.
And the roles are recorded. And then, of course, under the hood, every time you send a new user message, it's sending the whole lot of the context back to the completion API. OpenAI will now do that for you if you want to use their GPTs at cost.
So there are trade-offs there. But this is the way I've implemented it.
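For the technical folks, here's roughly the shape of that pattern. This isn't the actual Treasure Tours code (the real thing goes through the OpenAI.chat library); it's just a minimal sketch against the raw chat completions endpoint, and the class name, model name and system prompt are placeholders.

```csharp
// Minimal sketch of the pattern described above, not the production Treasure Tours code.
// We keep the conversation history ourselves and replay the whole thing on every call
// to the chat completions endpoint. Class, method and model names are placeholders.
using System.Collections.Generic;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

public class SimpleAiClient
{
    private readonly HttpClient _http = new HttpClient();
    private readonly List<object> _messages = new();   // the context, held locally on our server
    private readonly string _apiKey;

    public SimpleAiClient(string apiKey, string systemPrompt)
    {
        _apiKey = apiKey;
        _messages.Add(new { role = "system", content = systemPrompt });
    }

    public async Task<string> AskOpenAiAsync(string prompt)
    {
        // Append the latest user message, then send the full history each time.
        _messages.Add(new { role = "user", content = prompt });

        var request = new HttpRequestMessage(HttpMethod.Post,
            "https://api.openai.com/v1/chat/completions");
        request.Headers.Authorization = new AuthenticationHeaderValue("Bearer", _apiKey);
        request.Content = new StringContent(
            JsonSerializer.Serialize(new { model = "gpt-4", messages = _messages }),
            Encoding.UTF8, "application/json");

        var response = await _http.SendAsync(request);
        response.EnsureSuccessStatusCode();

        using var doc = JsonDocument.Parse(await response.Content.ReadAsStringAsync());
        var reply = doc.RootElement.GetProperty("choices")[0]
                                   .GetProperty("message")
                                   .GetProperty("content").GetString()!;

        // Record the model's reply (the assistant role) so it's part of the context next time.
        _messages.Add(new { role = "assistant", content = reply });
        return reply;
    }
}
```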
And let's just take it for a spin. So there we go. AI client test.
We're bleeding over the left-hand side of that screen there, aren't we? There we go.
Right, so if you want to get GPT to write a question for you, you have to flip around what you would traditionally do. So you'd say, write me a question to which the answer is Buckingham Palace. Still never ceases to amaze me.
I don't know about you guys, but every time I develop against large language models, it is tremendous. Now, I've got it set to ultra-geek mode. So it's come back with a question that is almost indecipherable. In which esteemed residence would one find the monarch of the United Kingdom's principal place of abode, etc., etc., etc.?
So what we can do is simplify that. So above my OpenAI method, I've added in a bunch of style definitions. And the way in which these work... again, I don't know whether you can see, but essentially what we do is just append to the prompt.
We just append a whole bunch of additional predefined prompts. So based on the audience, the tone, even the sense of humour, we inject a whole bunch of additional prompts after our original one. So here's an example of the one for audience. If you say your audience... is adults, then it says use language appropriate for adults. And you can crank that up. Children, educated people, or what I call intellectual masochists, which is the level that we're on just now.
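To give you an idea, the style definitions amount to something like this: a set of predefined snippets keyed by audience (tone and humour work the same way) that simply get appended to whatever prompt you pass in. The enum values and the wording of each snippet here are illustrative, not the production definitions.

```csharp
// Illustrative sketch of the style-definition idea: predefined snippets keyed by audience
// (tone and humour work the same way) are appended to whatever prompt the caller supplies.
// The enum values and the wording of each snippet are examples, not the production text.
using System.Collections.Generic;

public enum Audience { Children, Adults, EducatedPeople, IntellectualMasochists }

public static class PromptStyles
{
    private static readonly Dictionary<Audience, string> AudienceStyles = new()
    {
        [Audience.Children] = "Use simple, friendly language suitable for children.",
        [Audience.Adults] = "Use language appropriate for adults.",
        [Audience.EducatedPeople] = "Use precise, well-read language.",
        [Audience.IntellectualMasochists] = "Use the most ornate, verbose prose imaginable."
    };

    // Same base prompt, different style: just append the predefined snippet.
    public static string Apply(string prompt, Audience audience) =>
        $"{prompt}\n\n{AudienceStyles[audience]}";
}

// Usage: var styled = PromptStyles.Apply(
//     "Write me a question to which the answer is Buckingham Palace.", Audience.Children);
// var question = await aiClient.AskOpenAiAsync(styled);
```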
So if I want to dumb down my question to OpenAI, then what I would do is just go into my client and change my audience type. Let's rewrite the Buckingham Palace question for children. So: write me a question to which the answer is Buckingham Palace. Again, it's sending off exactly the same prompt, but it's appending a different set of style and tone guidelines. So in theory, we should get a more accessible, family-friendly quiz question. Okay: what is the name of the grand and illustrious residence in London? Blah, blah, blah. It's a bit long.
So one of my other key learnings was: it waffles. So how do we stop it waffling? Well, the approach I took was essentially to write, in code, a form of while loop that just tells it to keep making the answer shorter until it gets to the length we want. So you can pass a maximum number of characters, and if it waffles, it will transparently send another message to GPT saying: write this again, but make it shorter.
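In sketch form, reusing the client from earlier, the loop looks roughly like this; the retry cap and the exact wording of the follow-up message are my own assumptions rather than the production values.

```csharp
// Rough sketch of the anti-waffle loop, reusing the sketch client from earlier.
// If the reply is longer than maxLength characters, we transparently ask the model
// to shorten it and try again. The retry cap and wording are assumptions.
using System.Threading.Tasks;

public static class Brevity
{
    public static async Task<string> AskWithMaxLengthAsync(
        SimpleAiClient client, string prompt, int maxLength, int maxRetries = 3)
    {
        var reply = await client.AskOpenAiAsync(prompt);

        while (reply.Length > maxLength && maxRetries-- > 0)
        {
            // The caller never sees this follow-up message, only the final answer.
            reply = await client.AskOpenAiAsync(
                $"Write that again, but in no more than {maxLength} characters.");
        }

        return reply;
    }
}
```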
So let's put that to a final test. Maximum length is set to 500 here; let's just say no more than 80 characters. Now let me see if I can actually remember to copy-paste the question this time. There we go. Right, last try. Write me a Buckingham Palace question. And it will probably write that question twice: it will come back with a long response, and then we'll tell it to keep it short. You can also set a default maximum length, and that would be an interesting future enhancement to the project. There we go. What is the Queen's official London home? So you can tweak these settings, you can build a whole bunch of UX on top of this, and what you end up with is a very powerful foundation that allows us to interface with GPT and specify all of these different things that we want. Right. At a very, very granular level, that's how it works.
Let's make ourselves a little treasure hunt if we've got time. It might be quite fun to do that.
So hang on a minute. What we will do is we will do it locally on my machine. If the lovely folks at Mindstone will please tell me if we're running out of time. We'll go as far as we can with writing our own treasure hunt, and then I'll talk about a couple of learnings.
Okay. Right, we're going to run a local version of Treasure Tours so we can do our worst on it. Give me a second for it to spin up. And we're going to do a treasure hunt around Camberwell.
And what we'll do is we'll take a look at what it's actually asking GPT behind the scenes. So if I go to create a new tour. So remember previously, this is a very cumbersome process. You've got to write a description. You've got to pick an image. You've got to come up with a theme.
So here's a little switch: Create AI Tour. Let's leverage all of that AI goodness under the hood. And I've built a whole bunch of UX on top; you'll actually recognise these dropdowns for humour, tone, and audience, because they're exactly the same as the ones that we're using in code.
So you log into the site, you say, I want to create a new tour, and we're going to come up with a theme. Now, if we had a bit more time, I was going to ask you guys to come up with a theme, but time is getting away from us, unfortunately, so we'll have to go with whatever I've got on here. I've got chefs, and a chef in trouble: the head chef is seriously concerned, the new restaurant has its grand opening tonight, and we don't have any ingredients to cook the signature dish. That's going to be our theme. That gets passed to GPT. We're going to ask it to create a thumbnail image, so it's going to illustrate the tour for us as well, and we're going to ask it to do a few little bits of writing halfway through the tour and at the end of the tour to update players on how they're getting on. I'd also like it to make a bunch of images as well.
Right, so we'll put that to work. You don't need to give it a title. We will put a little place marker in there as well so that we can come back. There we go.
Right. I'm going to set that off, and it's now talking to GPT. It's getting GPT to write all of that theme, and it's also getting GPT to pass prompts through to DALL·E 3 to illustrate the images. And that was another thing that I found developing against this: GPT is much better at writing prompts for DALL·E 3 than I am. Way better. I mean, I can't be bothered half the time; I'll just be like, go on, draw me a picture of a bus, stick some things around it. Whereas if you get GPT to do it, it will take the time, it will know the context of the treasure hunt, and it will really go to town on it. And mysteriously, I don't know whether they've trained it on what good generation prompts are, or whether this is just some facet of the way in which the weightings work, but it is really, really good at those.
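The chain itself is simple enough to sketch: ask GPT for an image prompt, then hand that prompt straight to the DALL·E 3 images endpoint. The endpoint and fields below match the public OpenAI API; the method name, prompt wording and image size are just illustrative.

```csharp
// Sketch of the two-step chain: GPT writes the image prompt, DALL·E 3 renders it.
// The images endpoint and its fields match the public OpenAI API; the method name,
// prompt wording and image size are illustrative, not the production code.
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

public static class Illustrator
{
    public static async Task<string> IllustrateAsync(
        SimpleAiClient client, HttpClient http, string apiKey, string tourTheme)
    {
        // Step 1: GPT already knows the treasure hunt's context, so it writes a far
        // richer image-generation prompt than a human would bother to.
        var imagePrompt = await client.AskOpenAiAsync(
            "Write a prompt to instruct an AI image generation tool to create a " +
            $"thumbnail illustration for this treasure hunt: {tourTheme}");

        // Step 2: hand that prompt to DALL·E 3.
        var request = new HttpRequestMessage(HttpMethod.Post,
            "https://api.openai.com/v1/images/generations");
        request.Headers.Authorization = new AuthenticationHeaderValue("Bearer", apiKey);
        request.Content = new StringContent(
            JsonSerializer.Serialize(new { model = "dall-e-3", prompt = imagePrompt, n = 1, size = "1024x1024" }),
            Encoding.UTF8, "application/json");

        var response = await http.SendAsync(request);
        response.EnsureSuccessStatusCode();

        using var doc = JsonDocument.Parse(await response.Content.ReadAsStringAsync());
        return doc.RootElement.GetProperty("data")[0].GetProperty("url").GetString()!; // URL of the generated image
    }
}
```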
So let's have a look. While it's writing (it takes about a minute and a half) I can actually look at the logs as we go. So let's take a look under the hood at what is actually happening here. Let me just find the beginning; we've got some stuff in the log from earlier. Okay, I'm not going to read this out loud, but hopefully you can see, or at least skim-read, some of this. This is the dialogue that's going on with GPT behind the scenes. We're asking it to write a treasure hunt. We're telling it who the person speaking is. Each time, we're passing these style definitions, so we're always going to get consistency of tone, sense of humour, and audience.
If we want to run it again and we want to make one for kids, or one for adults, we just repeat the process and change the UX. Now we're asking it to write a prompt to instruct an AI image generation tool to create a picture to accompany it. This is the prompt it came up with for DALL·E 3. Very, very good. Create an illustration for a thrilling treasure hunt. I didn't say it was thrilling, so I'm delighted that it's come up with that. A bustling cobbled street. Chef outfits. Holding a map. Characters carrying oversized kitchen utensils, like a spatula or a whisk. Again, you know, the invention is spectacular. And when it works well and it's translated into images, it really is a very powerful way to instruct an automated image generation tool.
Write an exciting 30-word briefing. Write a prompt; again, we give another picture here. An update, and so on and so forth. Write a one-page paragraph to advertise the treasure hunt, so it goes on to ask it to create the descriptions. You'll see here it's triggered our little while loop: it wasn't happy with this one, it was too long, so again, it's automatically asked it to make it shorter. And finally, write a very short title for the treasure hunt, between two and five words: London's Gastronomic Chase.
Right, let's see whether it's done. The moment of truth. Don't judge me if it's awful. There we go. That is our themed treasure hunt, London's Gastronomic Chase. We'll just publish it locally so we can see what it's come up with. That's not a bad starting point, is it? We've got a magnifying glass, we've got silhouettes of chefs, we've got some London landmarks. It's much better at very well-known cities like London; when I started doing these in Birmingham, it was throwing in some really weird stuff. Join the culinary quest in London.
It's put the marketing descriptions in there. Let's play it, okay? There we go. All right, team, listen up. We're in a proper pickle. There's no ingredients. London's hiding what we need. I'll share these afterwards, because I don't have time to read them all out, but it's always great seeing what it comes up with. And there appears to be some sort of group of chefs walking down a fictitious road in London, armed with spatulas. Marvelous.
Right. Now, if I've got time, I want to just show you how AI helps to actually write a question as well. But I'm just looking around: how are we? Do we need to wrap up? All right. Well, I think we are really, really low on time. So, another time, another place, I will share the other bits of AI that we use to write more of the hunt, including sourcing the questions, which we get from Google Places (different street names, place points) and pass all of those to GPT. It's very good at writing multi-choice questions. Right. Let me skip to a wrap-up then.
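Just to give you a rough flavour of that question-sourcing pipeline, though, it's essentially: pull nearby places from the Google Places Nearby Search endpoint, then ask GPT to write a multi-choice question about one of them. Apart from that public endpoint, everything below (the method name, radius, prompt wording) is an illustrative sketch rather than the production code.

```csharp
// Hedged sketch of the question-sourcing pipeline: Google Places Nearby Search gives us
// a place near a point on the route, and GPT turns it into a multi-choice question.
// Only the Nearby Search URL is the real public endpoint; names, radius and prompt are illustrative.
using System.Net.Http;
using System.Text.Json;
using System.Threading.Tasks;

public static class QuestionSourcing
{
    public static async Task<string> DraftQuestionAsync(
        SimpleAiClient client, HttpClient http, string mapsApiKey, double lat, double lng)
    {
        var url = "https://maps.googleapis.com/maps/api/place/nearbysearch/json" +
                  $"?location={lat},{lng}&radius=300&key={mapsApiKey}";

        using var doc = JsonDocument.Parse(await http.GetStringAsync(url));
        var placeName = doc.RootElement.GetProperty("results")[0].GetProperty("name").GetString();

        return await client.AskOpenAiAsync(
            "Write a multiple-choice question with four options, exactly one of them correct, " +
            $"about this place on the treasure hunt route: {placeName}.");
    }
}
```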
Here are the key challenges. I've already spoken about a few of these.
It waffles. Watch the waffle.
It's non-deterministic, but we know that. So it's very good for question setting, perhaps not so good for marking. So if you're thinking about AI applications in ed tech, for example, there's some way to go for fully automated marking.
I was having this conversation with my wife the other night, who's a teacher. I'm very excited about how you could apply some of this to ed tech.
It's not integrated with geography, with geodata. If you've ever asked GPT to take you on a tour of somewhere, it can do some really weird stuff, because it's just not interfaced correctly with that. I'm sure those issues can be overcome, but in its current version, I wouldn't wish that upon anybody.
There's the issue of where to store the context that we've talked about. Keeping it engine agnostic. Again, do you really want to build all your tech against one paid platform? Probably not. So I think there's opportunities for an intermediary layer there. And keeping it affordable.
Next steps for me: this is out there in beta, so go and check it out at treasuretours.org. I'd like to document the API; I think we could do some really interesting stuff once we make that automated. Treasure Tours in VR, for example, could plug into the API. And I'd like to get the project generally enhanced with multimedia.
And that is that. My final slide says hire me, bribe me, or kidnap me. I'm available if you want to talk AI. I love talking about software architecture, how we can integrate and help businesses using GPT. My details are up on the screen.
Next time we chat, I'll show you the rest of the treasure tours. And I hope that was interesting and useful. Thanks very much. Cheers.