Voice, Brainstorm and Superpowers - From 0 to boardroom ready in 15 minutes

A few hours ago, you should all have received an email asking whether you wanted to opt into an introduction at this event. I mean, all of you should have received it. Maybe some of them went to spam.

That app was built entirely with Replit and Cursor. Not a single line of code was written by hand.

So I did that on a Friday morning. It took me about four hours to build.

And then I just uploaded the entire list of everyone that is coming to the event. And now it's making introductions for everyone that is here.

So it gives you a live example of something that is actually useful today.

Now, Greg did use Super Whisper, which is not fair, because that was the main thing I was going to go through today.

This has been a recent unlock for me, so I want to, let me just get my notes here. Obviously, voice-to-text has been around for a very long time, but what recently dawned on me was just how much more useful it has become.

Voice Models and Language Models

The combination of voice models with what we now have in these language models makes voice dramatically more useful and more usable. The reason is that there are now apps out there, including Super Whisper, which I'll demo today, that take your voice input and transcribe it to text. And as anyone who has used ChatGPT knows, if you make spelling mistakes or use the wrong words, ChatGPT still understands it.

So a large language model is able to take garbled text, which was always the problem with speech-to-text, and actually make a fully functioning paragraph out of it. Which then means when you start to combine it with normal workflows, you actually get to a result that is fully usable.
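To make that two-step mechanic concrete, here is a minimal sketch of the pipeline, assuming the OpenAI Python SDK. Super Whisper does this on-device, so the model names and the file name below are purely illustrative, not what the app actually uses.

```python
# Minimal sketch of the speech-to-text + LLM cleanup pipeline described above.
# Assumes the OpenAI Python SDK; model names and file path are illustrative.
from openai import OpenAI

client = OpenAI()

def voice_to_clean_text(audio_path: str) -> str:
    # Step 1: transcribe the raw audio. Raw transcripts are often garbled:
    # filler words, false starts, wrong homophones.
    with open(audio_path, "rb") as audio_file:
        raw = client.audio.transcriptions.create(
            model="whisper-1",
            file=audio_file,
        )

    # Step 2: let a language model repair the transcript into usable prose.
    cleaned = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": (
                "Rewrite this raw voice transcript into clean, well-punctuated "
                "prose. Preserve the speaker's intent; drop filler words and "
                "false starts.")},
            {"role": "user", "content": raw.text},
        ],
    )
    return cleaned.choices[0].message.content

print(voice_to_clean_text("dictation.m4a"))
```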

And so I'm gonna try and run through an example live that is actually useful here.

So I know some of you have been here before. Who has used Gamma before? Okay, who has seen me do a Gamma presentation before? Not many, good. Okay, so even that is going to be new then.

I'm going to say, I want to build a presentation about the future of human machine interaction and voice. Now, I'm going to start with that very simply.

So it takes this, I mean, let me see if I can zoom in. You can see here, it takes it pretty well. The punctuation is there, everything kind of works.

I want to make this interactive, like I try to do every time I do one of these talks. Now, what do you all want to know about the future of voice and future of work? Any particular questions that you would like to see included as we try to build a presentation about this topic?

Sorry, that was the wrong one. I'm actually still zoomed in. Why is it not working? Ah, I was using the wrong keyboard shortcut. OK.

So I want to include something about voice cloning live. Anything else? Yep. Very good question. Let me try that. I'm going to try that in the next iteration.

Yes, at the back. Co-working spaces. Let's include something about how this changes co-working spaces.

Oh, no, wait. Actually, I'm changing my mind. Let's talk more about translations. Yes. See, this is exactly what's happening.

Real-time Application

I'm talking to it, but it's getting sent to a language model, which interprets it and then comes back with structured text. So when I changed my mind and switched to translations, that is what it comes up with.

Anything else? Well, actually, I do want to also include something about co-working spaces. Anything else?

I'd also want to include something on how AI can pick up on tone of voice and emotion, everything in relation to voice usage. Now I'm going to say, let's see. Now ask me, actually sorry, there we go. Okay, now one more thing. Now build a presentation plan for me.

Okay, so one of the other things, I don't know if you've noticed, but with ChatGPT, when you use it in the evening it's slower than in the morning, because the US is using it as well. This is something you only really get to experience when you do as many live demos as I do.

Okay, so we've got something coming through. Whilst it's doing this, I'm going to make this slightly bigger.

Okay, so now imagine you are an experienced researcher and you get to brief a junior researcher on one piece of research that they can do to support this particular presentation. I want to focus specifically on how this affects the future of work, and I need a single paragraph that I can simply copy-paste, that the junior researcher can take without any additional context on this particular project, and just go ahead and execute the research.

Walk me through your thinking step by step, and then give me that one paragraph.

So obviously this takes much longer. Had I written this, that would have been the end of the presentation because it would have taken 10 minutes. But let's figure out what it comes back with here.

It still takes a few seconds, but here we go. It's actually still writing the presentation plan.

For what it's worth, this is also how I use Cursor. I am a software engineer by background; I don't write code anymore, but I do have that background.

Literally, I'm just talking to it all the time and just telling it, ah, you've messed up again. Something went wrong. Please fix it for me.

This presentation outline is a little bit longer than I wish it were, so I'm not going to stop it because otherwise it's going to start from scratch, but... How many of you have used, sorry, yes, there's a question. An idea for the presentation, yes. Yep, you can do that indeed, so getting it to use a different persona to go and write the thing.

Okay, now it's done here. So I just wanted to, I'm gonna launch the second one so we get the paragraph. Now, how many of you have used Gemini Deep Research yet? Not that many.

Sorry, and how many of you have used Perplexity Deep Research? So Perplexity Deep Research as well? Okay, interesting.

That was more than I thought. I know that people use Perplexity; I didn't know that many would already be using Deep Research.

Deep Research was released on Perplexity three or four days ago. It is basically Perplexity on steroids; it's Perplexity trying to do what OpenAI and Google's Gemini did. And we will be using it here.

So I'm just waiting. Maybe I should have done this with Claude. It would have gone a little bit faster. Let's figure this one out. OK, now it's coming.

So the reason I said "outline your thinking first, then give me the paragraph" is a technique called chain-of-thought reasoning. For those of you that are not using it that way yet, you absolutely always should. Always ask the model to outline its reasoning before giving you an answer. It gives you dramatically better answers.
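As a rough sketch of that pattern, here is a reusable prompt wrapper. The wording is an approximation, not the exact prompt dictated in the demo.

```python
# Illustrative chain-of-thought-style prompt wrapper: ask for the reasoning
# first, then the final answer. Wording is approximate, not the demo's prompt.
def with_reasoning_first(task: str) -> str:
    return (
        f"{task}\n\n"
        "Before you answer, walk me through your thinking step by step. "
        "Only after you have outlined your reasoning, give me the final answer."
    )

prompt = with_reasoning_first(
    "Write a one-paragraph research brief for a junior researcher on how "
    "AI voice technology is changing the future of work."
)
print(prompt)
```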

So here I am going to copy-paste this into Perplexity and, well, it's already selected here, but I'm selecting Deep Research. When I click go, it takes this research brief, and what it's going to do now is work step by step to produce the report that we want.

So the report that we asked for here, I didn't even read what the output of ChatGPT was. So: research how AI-driven voice technology, including voice assistants, live voice cloning, real-time translation, and tone and emotion detection, is transforming the workplace; focus on its impact on productivity, global collaboration, workflows, dynamics, and so on. So what you can see here,

is it's actually researching, going for the first step, then reading what it comes back with, and then it's going to add another step. You might have seen that it started with 20 sources, and it then added another 20. So it's now at 40, and it's adding more.

It's doing another reasoning step. It's going further. Look at all the steps it goes through. Now, whilst it's doing this in the background, I'm actually going to go back here, because one of my favorite techniques with ChatGPT is this: now ask me one question at a time, waiting for my answer in between, to help me think through how to make this presentation as impactful as possible.

OK. So here we go. First question: what is the single most important message you want the audience to take away from this presentation?

You can build entire presentations and do a whole set of research without ever writing a single line of text. Okay. There we go, that's a powerful and provocative message.

Yes, ChatGPT. So, who is your audience? About 200-odd AI-curious people in London: a combination of business leaders, researchers, tech enthusiasts, and the simply AI-curious.
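For what that "one question at a time" loop could look like programmatically, here is a rough sketch, again assuming the OpenAI Python SDK and gpt-4o as stand-ins. In the demo this all happens by voice inside the ChatGPT app.

```python
# Sketch of the "one question at a time" pattern: each answer is appended to
# the same message list, so the model keeps the full context of the exchange.
from openai import OpenAI

client = OpenAI()

messages = [
    {"role": "user", "content": (
        "Ask me one question at a time, waiting for my answer in between, "
        "to help me make this presentation as impactful as possible.")},
]

for _ in range(3):  # a few rounds, just for illustration
    reply = client.chat.completions.create(model="gpt-4o", messages=messages)
    question = reply.choices[0].message.content
    messages.append({"role": "assistant", "content": question})
    answer = input(f"{question}\n> ")  # the speaker's dictated answer
    messages.append({"role": "user", "content": answer})
```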

Okay, so I'm going to go back to Deep Research and see what we have now. So Perplexity Deep Research has now finished. As you can see, Deep Research is much lengthier in its answer.

It also, in this case, crawled 40 different sources. You have them all on the right-hand side. It is coming back with an answer that references all of those sources.

If you're familiar with OpenAI Deep Research or Gemini Deep Research, it actually goes even further. I've had Gemini Deep Research go all the way to about 850 different websites crawled and then spit out a report about 40 pages long that goes into real depth. If you get the prompt right, it really does the research.

And weirdly enough, when you start using voice, you start taking more liberty and going further. I have found that the more I use voice, the more explicit I get and the deeper I go into what I actually want, where with text I would simply have stopped a bit earlier. Anyway, I'm going to go and take this report now.

I'm going to go back to ChatGPT. I'm going to say, whoops, actually, two things.

Let's stop here. I just got the research back from the junior researcher. Please include this in the plan. Okay, I'm going to go back here, copy this, paste it there.

One more thing I'm going to add. No, that's not what I wanted. There's a lot of text in here now. I wanted it to keep the outline really brief so that we don't end up with another long wait. So it's now going to give me a better outline, I hope. Let me just refresh, because I stopped it in the middle. Not sure why it's not executing, so I'm just going to do that again.

Finalizing the Presentation

Okay. So, the future of human-machine interaction and voice. I must say, it's crazy that we're already at the point where, when it only outputs about two words a second, I feel that it's going slow.

I mean, it used to take me hours to build a presentation, but now when it takes two minutes, it's too long. This is how quickly we adapt to this technology, right? What it's doing now is taking all the data we got from the Perplexity search and reincorporating it back into the presentation.

Transformative impact of AI-driven voice technology. 30% reduction in operational costs; 32% lower employee turnover at companies using AI sentiment analysis. Interesting. Why does it bring in AI sentiment?

Not sure why that came. I guess that must have been a study on voice. 2.3 times faster resolution of customer service inquiries with AI voice assistance. That's a big one.

Okay, it's going to take another moment, so I'm going to take one question whilst we're waiting here. Yes, sorry. Good question: what is the benefit of feeding Perplexity's results back into ChatGPT? ChatGPT has a better reasoning model, and it has the rest of the context of the conversation that we started.

So it's like talking to two different people who don't have each other's context; you kind of have to keep the thing in one place. Still, GPT-4o tends to be the better reasoning model of the two. I think on the Pro plan with Perplexity you default to Sonnet. Yeah, that would be the thing.
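A sketch of what "keeping the thing in one place" amounts to: the research report is appended to the same running conversation rather than pasted into a fresh chat. The file name and earlier messages below are placeholders, not the demo's actual content, and this again assumes the OpenAI Python SDK.

```python
# Illustrative only: the deep-research report is folded into the same
# conversation, so the model keeps the brainstorm and audience context.
from openai import OpenAI

client = OpenAI()

conversation = [
    {"role": "user", "content": (
        "We are building a presentation on the future of human-machine "
        "interaction and voice.")},
    # ... the earlier brainstorming and outline turns would sit here ...
]

report = open("perplexity_report.txt").read()  # text copied out of Perplexity

conversation.append({
    "role": "user",
    "content": (
        "I just got this research back from the junior researcher. Fold it "
        "into the presentation plan, keeping the outline brief:\n\n" + report),
})

updated_plan = client.chat.completions.create(model="gpt-4o", messages=conversation)
print(updated_plan.choices[0].message.content)
```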

Context and the reasoning model itself. I think we had 12 slides, maybe 13, so we're almost at the end. One more question? Yes? .

Yes, so the question was: can you set parameters on what you deem to be an acceptable source or not? If you tell Perplexity not to use specific sources, it will avoid them. OK, so I think we're now here.

It's doing the very last bit. Sorry for the little delays in the demo. As I said, this is just going a bit slower. This is the last bit that we're using ChatGPT for, so it will go faster right after.

I am going to go and take all of this. And I'm going to copy it. I'm going to go back over to Gamma.

So for those of you that don't know Gamma, actually, I forgot to mention that at the very start. Gamma are also a massive partner of the community, by the way.

Not that I wouldn't demo the products without them. I've actually been demoing it for years.

But it is an AI presentation builder. So I'm just going to hit Create, paste in text, pasting it here. All I'm doing is hitting Create.

I'm going to get it to take inspiration from the text. I want my text to stay brief. And in this case, I am looking at nine cards.

Customization and Exporting

Why is it doing nine? I'll just keep it at that for the moment. It's got nine cards there, yep.

And then I just take a random theme and click Generate. The future of human-machine interaction and voice. AI voice reshaping work, productivity, collaboration, the future of work, key technology shaping the workplace, AI assistants enhancing workflows, employee well-being, global collaboration.

It even does the diagrams for you. And here you go. We now have a full-blown presentation.

And I can, let me have a look at this. Now, I don't really agree with this one. So can we make this slide more visual?

I'm still keeping true to my promise. I haven't typed a single character whilst we're here, by the way.

So let's see here as it goes through. Okay. Now it's making this slide more visual. There we go.

So... It's actually not too bad. It is about employee well-being.

I'm just not... Wait, I'm not sure what happened with this person over here. Something weird with their legs here. It's an AI-generated image, obviously.

You can change this as well, by the way, so you can actually go back here and say, like, change the image and stuff like that. These are all drafts, so you can literally do with this whatever you want.

You can actually export this to PowerPoint as well. And then you have a presentation that you can take forward in PowerPoint.

Either way, I am here now at the end.

We have gone through brainstorming. We're literally in a big room, and somehow it has understood everything we went through, and I didn't touch a single character on my keyboard to get to this point. That's what I wanted to demo.

I only really started using this about two weeks ago. I use these tools all the time, but it has already fundamentally transformed how I interact with almost every single application. And it's really the combination of what these large language models are able to do, really understanding what I'm trying to get to, with our ability to do speech-to-text. It has hit a tipping point that now makes this all useful to me.

So hopefully this was a decent demo. And I've got time for one question now, but I'll also be here at the end. So yes, go ahead.

Q&A Session

Talktastic. Yep. So the question here was, what happens when you add a pause?

Basically, it just doesn't add anything. So I've had parts where I was literally thinking for like, I don't know, the whole piece of audio would have been maybe three, four minutes, and there would have been like a minute in between where I was gathering my thoughts. Sometimes it adds like two, three dots, but that is it.

So in the interest of making sure that I don't delay pizza too much, I am going to hand over to Alex now, who's going to go over the future of music and AI. But hopefully this was useful. And if you have more questions, come and talk to me afterwards. Thank you very much.
