Humanising AI in Digital Skills Training: Adapting to Real-World User Needs

Introduction

It might have made sense for me to be up here already, but I wanted to do that little jog. It's a bit like The Generation Game or something. Get a little bit of applause.

Can I see a show of hands? Has anyone here ever heard of Novella? No, that's good. I see a couple of awkward laughs, as if I'm going to be offended.

You're not meant to have heard of us. So I'm assuming no one has used our product yet. No, no budding marketers in the room.

Overview of Novella

So Novella. What I'm going to do tonight, in a very succinct 15 minutes, is show you a very quick demo of the product so you get a sense of what I'm talking about. And then I'm going to show, if I haven't already, my deep, profound intellectual limitations, because I'm going to talk to you about things that surprised me: things that I learned in the process of building this product, things that are probably common sense to you, but they hit me right in the head when this product went live.

I went in with a lot of assumptions. We launched a lot of AI products as part of this tool suite, and seeing users use it in the wild was pretty interesting.

We actually have two academic studies running at the moment to try and bear out what I'm going to say to you. For the moment, though, it will be anecdotal, but I'll try and jazz it up.

Origins and Purpose

Novella, the idea with this tool is that it is a simulation, and it is first and foremost for digital marketing. It comes from two pain points that I had as two different user types.

I started digital marketing 15 years ago, and I was terrified. I'm terrified of most things. That's another limitation of mine, but in digital marketing, I was thrown in at the deep end.

I was spending American Express's money on day one. I didn't have a clue what I was doing. I did learn. I learned by being very timid and trying not to make mistakes, because they're expensive. And I always wished there was more of a playground I could play in, where I could just spend money, see how everything behaved, and really get to understand the technology behind all of this.

What is Google trying to do? I never really grasped that.

My second pain point: I am a business school professor, so I'm now trying to teach marketing. And I'm teaching it, normally, by showing students slides of Google Ads from 15 years ago and saying, yeah, this is what it's going to look like out in the wild. Those students aren't going to get jobs in digital marketing.

It's a practical field. It really is. It's a very, very practical field, but we're trying to teach them through passive methods.

Now, the grander vision is, I think, video-based learning needs to quietly die and it should all be a lot more interactive.

Product Launch and User Interaction

But anyway, I'll show you the tool so you know what I'm talking about as I dig into, first of all, what we did when we launched the product; second, what happened when users interacted with it, which really showed me how idiotic I was being; and third, what does it look like now? So what's the change in the product?

So I used to work for Burberry. I've based the first couple of simulations around that experience.

So first of all, they have to sell trench coats. They come in, there's a challenge video.

It's called Novella, right? It's about short stories. It's about novel problem solving. And we have a little character called Ella.

So she's one of the AI co-pilots that I'll be talking about this evening. We also have a brand new, very jazzy Meta Ads simulation. And we now have a tiny bit of money, so I could get a designer to make it look nicer than the ones I did myself, where I've shown my skill, as always.

Now, if I go in here, I'm going to highlight where the AI components of this are. And I built these separately.

I was just really intrigued by what was possible. And this was before I hired a CTO to work with me as well. So she has obviously made it better and made it work properly.

The first thing we had was a Google Ads style interface. So the user comes in and they select 10 keywords that they think are going to hit this challenge, right?

But I also added in these personas. So we do a lot of persona-based thinking in marketing.

I had a thought of, well, what if we took the databases that we have for users from Google Analytics, and then, if we used that along with, say, just the OpenAI API, could we get the chatbot to act like different people, so that this fictional brand could have realistic customers that students could interact with? They could understand them, figure out their customer journey, all of those different things.

So I started off with a small database of just demographic information, some basic psychographic information, but not much more. We actually put it off to the side because we thought we wouldn't use it that much. But each of these characters responds in a very different way and gives the students different insights.
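To make the persona idea concrete, here is a minimal sketch of how a set-up like this might look, assuming the standard OpenAI Python client. The persona fields, prompt wording, helper name, and model name are illustrative placeholders, not Novella's actual implementation.

```python
# Minimal sketch of a persona chat: demographic/psychographic fields are
# folded into a system prompt so the same model answers "in character".
# Persona data, prompt wording, and model name are illustrative only.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

PERSONAS = {
    "priya": {
        "age": 34, "location": "London", "income_band": "high",
        "traits": "time-poor, brand-loyal, shops on mobile during commutes",
    },
    "tom": {
        "age": 22, "location": "Manchester", "income_band": "low",
        "traits": "price-sensitive, discovers products via social media",
    },
}

def ask_persona(persona_key: str, question: str) -> str:
    p = PERSONAS[persona_key]
    system = (
        "You are a customer of a fictional premium outerwear brand. "
        f"You are {p['age']}, based in {p['location']}, income band {p['income_band']}. "
        f"Traits: {p['traits']}. Answer in the first person, as this customer, "
        "and never break character or mention that you are an AI."
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "system", "content": system},
                  {"role": "user", "content": question}],
    )
    return resp.choices[0].message.content

# Example: the kind of question students actually asked the personas.
print(ask_persona("priya", "If you were searching for a trench coat, what might you search?"))
```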

We also have, and I'll talk about this a little bit more, our Ask Ella chatbot. So this is where I learned an awful lot about how people use these co-pilots these days. It was a little bit surprising, actually, in the end.

I'll just come through so that we can see the third component I want to talk about this evening. So here they have to set bids a little bit like you would in Google Ads, but I don't want to go into the technicalities of digital marketing at the moment.

And this third one is the ad analyzer. So this will look at their ad, compare it to Google best practices and to examples that we have fed into it, and then give feedback. So it will let you know: is this a good ad? Is this going to work for the company you're working for? Does your customer like it? And so on.
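Just to picture it, a first pass at an analyzer like that could look roughly like the sketch below: best-practice notes and a stored example ad go into the prompt alongside the learner's ad, and the model is asked for open-ended feedback. The helper name, the best-practice text, the example ad, and the model are assumptions for illustration, not the real system.

```python
# Rough sketch: critique a learner's ad against best-practice notes and
# stored examples. All prompt text and names here are illustrative.
from openai import OpenAI

client = OpenAI()

BEST_PRACTICES = "Headlines under 30 characters; include the keyword; clear call to action."
EXAMPLE_ADS = [
    "Classic Trench Coats | Rainproof British Style | Shop the New Season",
]

def analyse_ad(ad_text: str, brand: str, audience: str) -> str:
    prompt = (
        f"Google Ads best practices: {BEST_PRACTICES}\n"
        f"Strong example ads: {EXAMPLE_ADS}\n"
        f"Brand: {brand}. Target customer: {audience}.\n"
        f"Learner's ad: {ad_text}\n"
        "Give feedback: is this a good ad, will it work for this brand, "
        "and would this customer respond to it?"
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content
```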

Now, it's a demo-based talk, so I won't spend too long looking at these things, but I do have a couple of schematics I want to show. Oh, how did that get in there? Yeah, sneak that in.

Yeah, so we started in January this year. We're working with quite a lot of universities worldwide, et cetera, et cetera.

Key Learnings and Surprises

Starting assumptions: this is where I got it way wrong. I did the classic thing of projecting onto the market what I would have liked when I started digital marketing. I thought they were going to have such deep, boundless curiosity for digital marketing that this co-pilot would be like opening Pandora's box. They would not be able to stop asking questions about everything.

So how do we set it up? Well, I'll just walk through some of the basics of this, and then I'll go in and show you how this worked. So the query, my assumption was, they would go to the chatbot and ask something like, how does Google Ads work?

Not what they asked. I won't go into some of the stuff they did ask. They pushed the boundaries of the guardrails at times just to see where they could go, but anyway.

So the instructor document database, this is 15 years of my content on digital marketing. I fed in everything that I had from the basics to the most advanced strategies. Thought that would be a great thing for a co-pilot.

Our retrieval, well, it's fetching the relevant information to answer the query that the user has put in. Then it creates a prompt. And, let's say we've switched it up over time depending on what works, but the prompt is sent to OpenAI for a response. So: pretty basic, naive retrieval-augmented generation.

Nothing too crazy. Quite generic, but I thought that's what they would want.
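For the curious, a naive RAG loop of the kind described here might look something like this sketch, assuming the OpenAI Python client and a toy in-memory index. The chunk texts, storage, and model names are simplified placeholders, not the production pipeline.

```python
# Minimal sketch of the "naive RAG" flow described above: index documents,
# retrieve the most similar chunks, stuff them into a prompt, generate.
import numpy as np
from openai import OpenAI

client = OpenAI()

def embed(texts: list[str]) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

# 1. Index the instructor documents once (years of course content, chunked).
chunks = [
    "Google Ads runs an auction on every search query...",
    "Quality Score combines expected CTR, ad relevance and landing page experience...",
]
chunk_vecs = embed(chunks)

def answer(query: str) -> str:
    # 2. Retrieval: find the chunks most similar to the user's query.
    q = embed([query])[0]
    sims = chunk_vecs @ q / (np.linalg.norm(chunk_vecs, axis=1) * np.linalg.norm(q))
    context = "\n".join(chunks[i] for i in sims.argsort()[-3:][::-1])
    # 3. Prompt creation: wrap the retrieved context around the question.
    prompt = f"Answer using this course material:\n{context}\n\nQuestion: {query}"
    # 4. Generation: send the prompt to OpenAI for a response.
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content
```

With generic lecture content in the index, a loop like this can only ever give generic answers, which is roughly what happened next.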

So what happened? They asked the bot questions like, Is this a good keyword? Why am I not hitting my goals? What's so good about Hotestuff Coats?

And I didn't really give you this information, but that's the fictional brand. I like puns, and that seemed funny to me last summer when I started building this on my own, because I didn't think I'd be on a stage talking about the stupid name I gave the fictional company. But life throws those things at you, doesn't it?

The chatbot, well, probably was thinking a bit like this. It didn't really know what to do. What do you mean, is this a good keyword? What does this refer to?

I can talk to you about Google Ads and how it works. I can't help you with that.

And if I pop back in here, I have, just in a staging environment, the old version of this. And I asked a few of these questions earlier. So: how can I select good keywords? I'll just highlight some of these points.

It refers to things that are outside of the simulation, which I thought was great. We need to teach them. We're creating the minds of the future. I had all these pretentious thoughts about Spinoza. This is how it should be.

They didn't care. They really did not care.

Go to SEMrush. Where? Where's the button for that? I can't see a SEMrush button in your platform. And they would start complaining.

There's something wrong with this chatbot. It's telling me to do things I can't do. Well, you've got the internet, don't you? You could do this.

So that was the old version. Very generic. They did not enjoy using it, let's just say.

I was very utopian. Often I am.

User Behavior and Interaction

So what happened with these personas? They used them like a crutch. I have to say, I think because they're going into these simulations and it's all new, they use the personas as a crutch: a face that they can link something to, someone they can have a chat with.

But they were asking each of the personas, and there are four personas, 11 questions on average. And they were asking them for hints on how to navigate the simulation. So they weren't stupid questions. They were actually very smart.

If you were searching for a trench coat, what might you search? That's relevant. If I know that's high importance, I can increase my bid, and so on and so forth. So it actually affects your strategy, but we weren't prepared for that. What happened next stunned us to the point that we thought something was obviously broken.

They're required to play three times, and then they get a little certificate. That's kind of what Columbia wanted, but the average is 34 plays per simulation. We've had students play over 200 times. They still play late on Saturday nights.

They need to get a life if you ask me to be honest. I've spoken to one of them and suggested, Mate, you're 21. You've moved to London. Get outside. It's an okay simulation. I obviously think it's great, but come on. Get a life.

AI Co-Pilots and Trust

So what else has happened? The ad analyzer. I saw this happen in a classroom and it was a bit of a eureka moment for me in terms of how people use these AI co-pilots.

There was a student in the classroom who, I could see in the back end, had not submitted her score. She hadn't even played it once. Everyone else had played 10, 20 times. What was the problem? She was sat at the ad analyzer and said, Ella, the chatbot, didn't tell me to move forward.

Yeah, she's not real. Why do you need her to tell you? I just thought she was the expert. I thought, my God.

So people are putting a huge amount of trust instantly in this. And because we had made it nuanced so that it would always find something, it would always come back to you and say, have you thought about this? Have you thought about that? She kept doing what it said and coming back looking for a thumbs up.

So we had to train it completely differently. We had to feed in loads of examples and give them a score, and then say to the chatbot: now, when they input a query or an ad, if it is, based on our schema, a seven or more out of 10, you need to tell them to proceed. You need to tell them it's good enough to go into the auction. So we had to be a lot blunter about how we dealt with it, but they just weren't moving forward. It was an absolute nightmare.
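A blunter, score-gated version of the analyzer could be sketched along these lines. The rubric text, JSON shape, graded examples, and helper name are illustrative; only the seven-out-of-ten threshold comes from the talk.

```python
# Sketch of the blunter behaviour: score the ad against a rubric built from
# graded examples and, at seven or more out of ten, explicitly tell the
# learner to proceed to the auction. Details are illustrative placeholders.
import json
from openai import OpenAI

client = OpenAI()

RUBRIC = (
    "Score the ad from 0 to 10 against these graded examples: "
    "9 = 'Classic Trench Coats | Rainproof British Style | Shop Now'; "
    "3 = 'Buy coats here'. Consider headline specificity, keyword use, and call to action."
)

def review_ad(ad_text: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder
        messages=[{"role": "user", "content":
                   f"{RUBRIC}\nAd: {ad_text}\n"
                   'Reply as JSON: {"score": <0-10>, "feedback": "<one sentence>"}'}],
        response_format={"type": "json_object"},
    )
    result = json.loads(resp.choices[0].message.content)
    if result["score"] >= 7:
        return f"{result['feedback']} This is good enough: submit it to the auction."
    return f"{result['feedback']} Revise the ad before moving on."
```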

So how have we set this up afterwards? Obviously, yes, it gets a little bit more complex afterwards.

System Improvements and Feedback

So we have our query on the left. They come in and ask something.

What did we do? I'll summarize some of these key points.

Pre-retrieval, well, we just did a little bit to clean up the query that was going in, because they're quite messy, right? They could ask just about anything under the sun. So we're cleaning that up a little bit.

But the crucial bit is that I got five people I've worked with for the last 15 years to play the simulation and to write down all of the choices that they made and why they made those choices. We then labeled those as expert strategies.
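A rough sketch of those two changes, the pre-retrieval clean-up and the labeled expert strategies, might look like the following. The field names, rewrite prompt, and model are assumptions for illustration, not the production code.

```python
# Sketch of the two changes: clean up the learner's messy query before
# retrieval, and retrieve over labeled "expert strategies" rather than
# generic lecture content. Names and prompt wording are illustrative.
from openai import OpenAI

client = OpenAI()

# Records written by experienced practitioners playing the simulation.
EXPERT_STRATEGIES = [
    {"step": "keyword selection",
     "choice": "picked 'womens trench coat' over 'coat'",
     "why": "higher intent, lower competition, closer to the brief"},
    {"step": "bidding",
     "choice": "bid slightly above average CPC on the two priority keywords",
     "why": "protects impression share where conversion likelihood is highest"},
]

def clean_query(raw_query: str, sim_state: dict) -> str:
    # Pre-retrieval step: rewrite an in-simulation question into a clear,
    # self-contained query, grounding references like "this keyword" in the
    # learner's current simulation state.
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder
        messages=[{"role": "user", "content":
                   f"Simulation state: {sim_state}\n"
                   f"Rewrite this learner question as a clear, self-contained query: {raw_query}"}],
    )
    return resp.choices[0].message.content
```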

For my last little minute or two, I'll show you a bit of this live. I was warned about the wifi maybe dropping out on me, so I've done some in advance, but I can ask this a few more questions anyway.

So here, if I say, how can I select good keywords? It's able to say, well, choose keywords with high search volume, low competition, high relevance, and a reasonable cost per click. These are all of the metrics in here.

It cannot mention anything that goes outside of that. And we have to be pretty strict about those things too.

So if I say, how can I interpret cost per click? It will be very succinct about it. If I ask it, the cost per click for trench coat is 299, what should I bid? It will say, okay, consider bidding slightly higher than the average, and give me some contextual feedback.

If I come into this one, because I think I have this set up, hopefully this will give me the new version of the ad analyzer. This should give me very direct feedback and should tell me to move forward if my ad is good enough.

My ad is, on purpose, not good enough. So it's very specific: your headline is not specific enough, I recommend changing it to... And it has to tell them what it should be changed to.

It explains why, which is of course useful, but being told exactly what to change seems to be what the students were craving. So all of this change has resulted in an experience that is more succinct, much more relevant, much more contextual. And we also give feedback in the results to the user now, which we didn't do before, because I didn't think they would be that interested, to be completely honest.

Conclusion and Future Directions

So to summarize, I'm bang on time, unbelievably.

Yeah, Gen AI can be way too generic. We all know that. I actually thought that was a strength.

I thought I'd just feed in a whole load of documents, and then it'd broadly know about digital marketing and say relevant things. I kind of imagined not how I would have used it, but how the ideal me would have used it 15 years ago, which isn't a great place to start.

Users put a lot of faith in AI co-pilots. They really, really believe these co-pilots. They depend on them.

The study that we're running with one university in the US is to find out how this kind of learning facilitates lots of different styles of progress. So I mentioned that they play it 34 times on average, but if you look at the scores, you can see a lot about risk-taking behavior within them.

Some progress really slowly over time. Some are really up and down all over the place. So just giving them access to the AI on their own terms seems to be working out really well.

And they crave contextual, in-the-moment feedback. In fact, something else that we've learned is that they want it in their everyday work. They don't just want it in a simulation.

So I would say watch this space. The next product from Novella will be enterprise-ready, shall we say.

That is just about it for me. Thank you very much.
