Why is an AI talking to me?

Introduction

All right, so my name is Ollie Christensen.

I am, yes, it's my birthday,

and I am presenting "Why is an AI talking to me?"

So a bit about me is I'm currently developing

voice AI agents with my co-founder,

and I want to start to talk about

how voice AI has developed

and how you can actually start taking advantage of this

and implementing it into your everyday workflows.

From Reactive Assistants to Agentic Voice AI

So we all know who this is.

That's Siri.

So Siri, Alexa, and Gemini, they're all what we call kind of reactive agents.

So you say something, they react, and that's it.

But now we're starting to develop more autonomous and more agentic agents.

And with that come three crucial elements.

So we've got context, latency, and emotion.

So these three rules are how we determine what makes a good voice AI agent.

The Three Ingredients of a Great Voice Agent

Context

So we'll start with context.

So what is the business context?

Do we have a voice AI agent

that acts as a customer service agent or maybe an outbound sales agent?

We also need to think about

what the contextual awareness of the agent is.

Can it remember my name?

Can it remember maybe

the complaint I had last week?
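One way to picture contextual awareness is a small per-caller memory that the agent consults before it speaks. The sketch below is a hypothetical illustration only, not how any particular voice platform stores context; `CallerProfile`, `CallerMemory`, `remember`, and `context_for` are made-up names, and the phone number and complaint are placeholder data.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class CallerProfile:
    """What the agent knows about one caller (hypothetical shape)."""
    name: str
    past_issues: list[str] = field(default_factory=list)


class CallerMemory:
    """Toy in-memory store keyed by phone number."""

    def __init__(self) -> None:
        self._profiles: dict[str, CallerProfile] = {}

    def remember(self, phone: str, name: str, issue: str | None = None) -> None:
        # Create the profile on first contact, then keep adding to it.
        profile = self._profiles.setdefault(phone, CallerProfile(name=name))
        profile.name = name
        if issue:
            profile.past_issues.append(issue)

    def context_for(self, phone: str) -> str:
        # Returns a short summary that could be placed in the agent's prompt.
        profile = self._profiles.get(phone)
        if profile is None:
            return "New caller. No history available."
        if not profile.past_issues:
            return f"Caller is {profile.name}. No previous issues on record."
        issues = "; ".join(profile.past_issues)
        return f"Caller is {profile.name}. Previous issues: {issues}."
```

With something like this in place, the agent can greet a returning caller by name and refer back to last week's complaint instead of starting from zero.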

Latency

Then we need to start understanding the importance of latency.

So in a human conversation, the gap between turns is around 300 milliseconds.

So that is me saying

something and then you responding back.

So for a good voice AI project,

we need to aim for 300 milliseconds.

So a demo will sit at around 800 milliseconds, a good project will hit

500, and a really good project will hit 300.
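Those three numbers can be turned into a quick check for a measured response time. This is a minimal sketch: the function name `rate_latency` is hypothetical, and treating 300 and 500 milliseconds as hard cutoffs is an assumption made for illustration, since the talk only gives them as rough targets.

```python
def rate_latency(response_ms: float) -> str:
    """Rate a measured response gap against the rough targets from the talk."""
    if response_ms < 0:
        raise ValueError("response_ms cannot be negative")
    if response_ms <= 300:
        return "really good"  # on par with a human turn gap
    if response_ms <= 500:
        return "good"
    return "demo-grade"  # roughly the 800 ms range, or slower
```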

Emotion

And then we have emotion.

So what emotion

does the agent have?

Does it raise its tone of voice at the end of sentences

when there's a question?

But we also need to understand, can it detect emotion?

So

if I'm screaming down the phone at a voice AI agent, annoyed about my customer

service, can it detect that emotion and can it look to mitigate my stress, reduce

it and get rid of it?

A Practical Build: Your First Voice AI Agent (No Code)

So what I'm going to do now is run through a practical

demonstration on how you can build your very first voice AI agent.

I've kept it as simple as possible.

So there is no coding.

It is just prompting and one API key.

So the platform I will be using is Retell AI.

So Retell AI is a lovely platform.

It allows you to basically just prompt out

what the agent is going to do, and then

start to implement some of the rules we've talked about.

Defining Jordan’s Job: Calendar Management

So in this scenario, I've developed a voice AI agent

called Jordan, and Jordan's going to manage my calendar

because I'm a busy man and I can never manage my calendar.

And Jordan is going to help me do all of that.

Prompting, Tooling, and Guardrails

So we can see here we have it prompted out.

Does everyone kind of understand how prompting works?

Yes, a few nods.

Okay, so we've put in Markdown what I want Jordan to do.

So we've got the role of Jordan, the toolkit, the preferences I have,

so time zones, the guidelines to stop it hallucinating,

and the operational workflows.
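The five sections just listed can be sketched as a small prompt builder. The section names come from the talk, but the wording inside each section is invented for illustration and is not the actual prompt shown on screen; `build_prompt` and `JORDAN_SECTIONS` are hypothetical names.

```python
def build_prompt(sections: dict[str, list[str]]) -> str:
    """Render sections as Markdown headings with one bullet per instruction."""
    blocks = []
    for heading, lines in sections.items():
        bullets = "\n".join(f"- {line}" for line in lines)
        blocks.append(f"## {heading}\n{bullets}")
    return "\n\n".join(blocks)


# Illustrative content only; the real prompt was not shown in the transcript.
JORDAN_SECTIONS = {
    "Role": ["You are Jordan, a voice assistant that manages Ollie's calendar."],
    "Toolkit": ["Check availability and book meetings through the scheduling tool."],
    "Preferences": ["Default to the London time zone unless the caller says otherwise."],
    "Guidelines": [
        "Never invent availability, emails, or booking confirmations.",
        "If a tool call fails, say so and ask for the missing detail.",
    ],
    "Operational workflow": [
        "Collect the date, time, and attendees.",
        "Confirm the details back to the caller.",
        "Book only after the caller confirms.",
    ],
}

if __name__ == "__main__":
    print(build_prompt(JORDAN_SECTIONS))
```

Keeping the sections as data makes it easy to change one rule, such as the default time zone, without rewriting the whole prompt.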

Tuning Performance: Latency and Emotion Settings

So if we remember those three crucial rules,

we've got the context now,

we can start to look at the latency here.

So you can see it's running around about 820 milliseconds.

If we want to reduce that, we can then select a different LLM.

So usually, the more expensive the model is per minute,

the better the latency.

Now I'm doing this off of a browser,

so there will be a little bit more latency,

but if we were to host it ourselves,

we could have a reduction in latency.
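To see where a number like 820 milliseconds might come from, it can help to add up the stages of a voice pipeline and find the slowest one. The sketch below is hypothetical: the stage names and the per-stage values in the example are made up so that they sum to 820, and they are not measurements from the demo or from any platform.

```python
def summarize_budget(stages_ms: dict[str, float], target_ms: float = 300.0) -> dict:
    """Total up per-stage latency and flag the biggest contributor."""
    if not stages_ms:
        raise ValueError("at least one stage is required")
    total = sum(stages_ms.values())
    slowest = max(stages_ms, key=stages_ms.get)
    return {
        "total_ms": total,
        "slowest_stage": slowest,
        "over_target_ms": max(0.0, total - target_ms),
    }


# Made-up split of an 820 ms response, for illustration only.
EXAMPLE_STAGES = {
    "speech_to_text": 150,
    "llm": 450,
    "text_to_speech": 120,
    "network": 100,
}

if __name__ == "__main__":
    print(summarize_budget(EXAMPLE_STAGES))
```

In a split like this one, swapping the LLM is the obvious first lever, and self-hosting would mainly shrink the network share.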

And finally, we have emotion.

So we come here to the agent handbook.

We can start to play around with some of the emotion

and make it sound more natural.

So we've got a personality,

so we can start to add in maybe a professional personality.

We've got our natural filler words,

so the ums and the ahs we have in natural conversation.

And we have what I mentioned before, high empathy,

so detecting when there is stress or potential anger.

So what I've done here is I've given Jordan the functions via the API key for cal.com, which is the scheduling platform I'm using.

I've simply just connected it there and that allows Jordan to now start using some of the tools.

So if we go to the speech settings, I can then start to adjust the responsiveness of Jordan, so how fast Jordan starts talking after I've finished.

And I can also look at the interruption sensitivity.

So how fast does Jordan stop talking when I interrupt it?
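Those two dials can be pictured as a small settings object. This is a sketch under assumptions: the field names and the 0 to 1 scale are hypothetical and are not taken from the platform's actual configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeechSettings:
    """Hypothetical speech settings, each on an assumed 0 to 1 scale."""
    responsiveness: float = 0.5            # higher: starts replying sooner
    interruption_sensitivity: float = 0.5  # higher: stops sooner when interrupted

    def __post_init__(self) -> None:
        # Reject values outside the assumed range early.
        for name in ("responsiveness", "interruption_sensitivity"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be between 0 and 1, got {value}")
```

A very high responsiveness risks the agent talking over a caller who has only paused, so the two values are usually tuned together.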

So without further ado,

we will test it, see what Jordan can do, and let's go.

Live Demo: Booking a Meeting with Jordan

Hey, Jordan, how are you doing?

Hey,

Jordan, how are you?

Hello, this is Jordan.

I'm here to help with your scheduling needs.

How can I assist you today?

Wonderful, Jordan.

So in one sentence, please tell me what do you do?

I manage Ollie's calendar through cal.com to make scheduling meetings smooth and efficient

for everyone.

How can I help you today?

Okay, Jordan, I want you to book.

We'll do a meeting

with me and my friend Matt for next Tuesday.

Let's do next Tuesday.

So the 28th, please.

Got it.

Could you please tell me the preferred time for the meeting on Tuesday, the 28th?

Also, may I know your time zone or Matt's time zone?

Sure thing.

Let's do 10 to 10:30, London time zone, please.

Thanks for that.

Just to confirm, the meeting is on Tuesday the 28th, from 10 to 10:30 a.m.

Wait, wait, wait, Jordan, Jordan, Jordan, Jordan.

Sorry, sorry, sorry.

Jordan, can we please actually include an agenda in this?

I need an agenda with Matt.

Let's do, okay, number one, we'll do catch up.

Number two, we'll talk about Mindstone.

Number three, we'll talk about pizza.

And number four, we'll then just go close out.

Can you also add in a Google Meet link as well, please?

Thanks for the details.

I'll include the agenda with those four points and add a Google Meet link.

Can you please provide your name for the booking?

Just put it under Ollie and Matt.

And for Matt's contact details, just use a placeholder email, please.

Thanks, Ollie.

I'll use a placeholder email for Matt.

Let me check availability for Tuesday the 28th

from 10 to 10:30am London time now.

The slot on Tuesday April 28th from 10 to 10:30am London time

is available.

Should I go ahead and book the meeting with Ollie and Matt including the agenda

and Google Meet link?

Perfect.

Do that Jordan for me please.

Yes.

Your meeting with Matt on April

28th at 10am London time is booked with the agenda and Google Meet link.

I ran into an issue because

Matt needs a valid email or phone number for the booking.

Could you please provide a real email?

All right, Jordan, just use a placeholder.

We'll go Matt's at placeholder.

Thanks for that.

I'll use Matt's at placeholder.com as Matt's email and book

the meeting now.

It seems the booking system still needs a valid email for Ollie as well.

All right, Jordan, don't worry about it.

We won't do a booking.

That's fine.

No problem at all.

If you need help with scheduling,

What Went Wrong (and Why That’s Useful)

But we can see that in the real world, that may happen.

But what I've got here is an example that I had before.

So what we did is we had the placeholder email,

which for some reason Jordan didn't like this time,

but we had the Google Meet link,

and we also had the agenda as well,

along with the ability to reschedule and whatnot.

So we can see how with this being a very, very simple agent

and just prompting, we run into errors.

So we need to start making it a bit more complex and increasing the technical

ability.
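One small step up in technical ability is checking the booking details before calling the scheduling tool, so the agent can ask for everything missing in one go instead of failing twice. The sketch below is hypothetical: `missing_booking_details` is a made-up helper, and the email pattern is a loose shape check, not the validation rule the booking system actually applies.

```python
import re
from typing import Optional

# Loose shape check: something@something.tld (an assumption, not cal.com's rule).
EMAIL_SHAPE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def missing_booking_details(
    organizer_email: Optional[str],
    attendee_emails: dict[str, Optional[str]],
) -> list[str]:
    """Return a list of details the agent still needs before booking."""
    problems = []
    if not organizer_email or not EMAIL_SHAPE.match(organizer_email):
        problems.append("organizer email")
    for name, email in attendee_emails.items():
        if not email or not EMAIL_SHAPE.match(email):
            problems.append(f"email for {name}")
    return problems
```

In the demo, a check like this would have surfaced both missing emails up front, before Jordan announced that the meeting was booked.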

Beyond Scheduling: Real-World Use Cases

So, we've talked, and I've shown you a little bit about the use cases.

So we could have Jordan start to do more content creation.

So we could take that voice, use

text-to-speech and start narrating maybe some social media posts.

We can also do

the same for dubbing as well.

We could also start to give it more of an employee feel.

So use Jordan as a customer assistant,

maybe handling inbound and outbound calling,

and take that further, with an example being Jack and Jill.

So Jack and Jill is a startup where

you speak to Jack, the voice AI, and Jack assesses your CV

and your expertise, and will then pass you on to Jill

if Jack likes you, and Jill will then connect you with potentially recruiters or interviewers.

Looking Ahead: Multimodal Agents and Invisible UI

Then we start looking into the future, so what is multimodal?

This is the ears

and the eyes, so can we have an agent that sits there on my computer, can see what I'm

doing, but then can actually go, hey Ollie, maybe you've got this line of code wrong,

let me fix it for you, and I can communicate back.

Then looking more into the future,

we've got six, so invisible UI.

So can I say to Jordan, please create me a voice AI agent

that's just like you, but change it

for the context of my sales team and make

it have a different accent?

So that's what we can start doing.

Now, we've understood the context, the latency, and emotion.

So I showed you a very simplistic AI agent

with just prompting and an API key.

We ran into some issues, and that happens.

Making Agents Reliable with MCP Servers

But we need to start using MCP servers and actually increasing the technical aspects, but also the complexity.

So for those that don't know, MCP stands for Model Context Protocol.

So imagine a human body.

We've got our brain.

We've got our hands.

So in order for us to use that, we've got a nervous system that connects the two.

So the LLM is our brain, and the tools are potentially cal.com, Salesforce, HubSpot, and so on and so forth.

So we use the MCP server to connect that from the LLM to the tools.

So in the case of Jordan, Jordan got that wrong and couldn't do it.

But if we actually connected Jordan via an MCP to potentially Salesforce, our CRM, we could actually take in the correct contact details.

We could actually maybe raise a ticket as well and say, OK, well, I need to remember to do

that.

We could even connect that to Gmail, say, and send an email with documents we may need for

me and Matt.

Design Principle: Automate Admin, Not Human Connection

So the last two kind of summarize it together perfectly.

So we've got opportunities

here, but it's important that we have that human connection.

And that human connection

is really, really critical.

And then we'll go and talk a little bit more about this

as the night goes on, but we need to automate admin, not the human connection.

So can we

add in escalation routes to ensure when mistakes happen like that, a human takes over and can

deal with it?
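An escalation route can start as a very simple rule. The sketch below is a hypothetical example: `CallState`, `should_escalate`, and the default of two failed tool calls are assumptions chosen to mirror the demo, where the booking failed twice before the call was abandoned.

```python
from dataclasses import dataclass


@dataclass
class CallState:
    """Minimal running state for one call (hypothetical fields)."""
    failed_tool_calls: int = 0
    caller_sounds_upset: bool = False


def should_escalate(state: CallState, max_failures: int = 2) -> bool:
    """Hand over to a human on repeated failures or an upset caller."""
    return state.caller_sounds_upset or state.failed_tool_calls >= max_failures
```

The agent keeps the admin work, and the moment the conversation needs a person, it gets one.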

How to Get Started (Guide, Integration Path, and Next Steps)

So what can you do?

You're probably thinking, well, how do I actually use this?

I've put in a free guide here in the first QR code so that you can start to actually

understand how to use it, implement this from simple prompting, and then take this all

the way to actually integrating MCP servers and increasing

that technical aspect and making it more reliable.

And by adding in MCP servers, we actually

increase the context awareness.

So now Jordan can actually remember what we're doing.

She can take maybe the contact details from Salesforce.

We increase the emotional awareness,

and we reduce the latency.

Conclusion and Contact

So right there is my LinkedIn.

I'm happy to connect with everyone and talk about what everyone's doing and how everyone's

growing in this community.

Thank you very much.

I've been Ollie Christensen.
