Hey everybody.
So as Drummond mentioned, I'm the new CTO of Mindstone and I'll be showing you a little bit about Rebel,
talking a lot about the principles underneath it, a little bit about our vision and sense of the future of work,
what it is going to look like to be effective, and how agentic AI is going to feed into that:
what we've learned through training tens of thousands of people, including a whole bunch of Fortune 100s.
And I'll probably tell you about some of the mistakes as well,
especially if you ask me gently.
And so I will invite you to ask questions, to interrupt.
If you're gonna throw things, I'd prefer it be money.
And in general,
I just welcome your interactivity.
So the first thing, I won't introduce myself too much other than to say
I trained as a computational neuroscientist, put people in brain
scanners, tried to use machine learning to read their minds, and
then to understand why we forget things.
They say that psychologists study their own deficiencies, so make whatever
inferences from that you will. I then went on to co-found Memrise, which
reached about 75 million users; it's one of the largest language learning apps in
the world. Then I was chief data scientist at Channel 4, and I've been just
thoroughly enjoying the work at Mindstone.
It's been the most intense, the
most fast-paced, the most exhilarating and exciting thing I've, I think, ever been a part
of.
There is a gap, an overhang, between the capability of the best AI, which is busily
solving Erdős problems, and the reality, the day-to-day, on the ground: what's it like
to actually use it?
I think there's something really deeply interesting about why we
have such great intelligences, you know, restricted, imperfect,
but still astonishing.
And yet they're not very useful.
So I'm
going to suggest there are a few things that we need in order for
AI to be useful.
And amongst them, we need... well, we're
getting a little bit overexcited here with our animations.
I've
been experimenting with them, as you can see.
It's just
enormously good fun.
So the first thing is it needs tool use
and these connectors: it needs to be able to reach into your Slack,
your email; it needs to be able to intervene, to send; it needs to be able to
pull from your internal systems.
Without that, it's like you hire a brand new
person, a smart person, and on their first day you're like, "No, you cannot have an email
address."
Right, they're obviously of no value in the workplace, or indeed in any
part of your digital life.
So you need to be able to connect things, pull, and
indeed interact with the world.
It needs a shared memory.
Let's just start with
memory in general.
It needs to grow and learn and accumulate context and a sense of who
you are and what matters to you and what your goals are.
And if you're part of a company,
it needs to be aware of the other stuff going on in the company.
So it needs to be in your meetings and aware of what else is going on.
And all of that
should influence it.
It needs to be extensible.
So you need to be able to sort of customize
it and teach it new things and have it run code.
I think this is all pretty straightforward.
However, you may have heard of the lethal trifecta.
Basically, if you give an agent access to all of
this good stuff, there is every risk that it might be hacked, prompt injected, or generally
leak your valuable information to whichever company is providing it.
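The term "lethal trifecta" is usually used for an agent that combines three capabilities at once: access to private data, exposure to untrusted content, and a way to communicate externally. As a rough sketch of that idea (the class and field names are hypothetical and not taken from Rebel), a configuration check might look like this:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentCapabilities:
    """Hypothetical capability flags for an agent session (illustrative only)."""
    reads_private_data: bool          # e.g. email, Slack, internal documents
    sees_untrusted_content: bool      # e.g. web pages, inbound messages
    can_communicate_externally: bool  # e.g. sending email, calling outside APIs


def has_lethal_trifecta(caps: AgentCapabilities) -> bool:
    """Return True only when all three risky capabilities are present together."""
    return (
        caps.reads_private_data
        and caps.sees_untrusted_content
        and caps.can_communicate_externally
    )


full_access = AgentCapabilities(True, True, True)
read_only = AgentCapabilities(True, True, False)
print(has_lethal_trifecta(full_access))  # True
print(has_lethal_trifecta(read_only))    # False
```

Removing any one of the three flags breaks the combination, which is why mitigations tend to focus on gating the external-communication step.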
And so we've done an
enormous amount of work to think about this question of safety and privacy.
And I invite
you to probe it, because I don't think we've nailed it.
I think we're best in class, but
in some ways that's not saying very much.
It should be able to do complex tasks.
It
should be able to work asynchronously.
So when you hire someone smart, you don't want
to sit next to them and be like, okay, now do this.
Yeah, yeah, okay.
And then the next
thing, okay, now stop.
And that's what it feels like with most of these interactive
tools.
Whereas what you want to do is have a long conversation up front about the needs
and the goals and the context and the nuance,
and be like, okay, have a crack at it,
stop and ask me if you need help,
otherwise just let me know when you're done.
That's what I want.
And so Rebel, I would say, is not the fastest.
If you want a really quick answer,
you can get a quick crappy answer from ChatGPT instantly.
Rebel takes longer, but that's because
it's checking all of your memories,
double checking against skills that it's learned,
it's pulling from the web and doing research,
and as a result, often what I'll do is I'll be like,
"Okay, great, I'm preparing a demo,
blah, blah, blah, let's talk about it, off you go."
And then I'll come back five minutes later or even an hour later, and in the meantime
it's spawned a hundred sub-agents and God knows what else. That's the way I really want to work with AI, and I'll often
have a whole bunch of conversations going on at the same time.
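The "off you go, let me know when you're done" pattern can be illustrated with a small sketch. This is not how Rebel is implemented; `run_subagent` and `delegate` are hypothetical names, and the sub-agent is a stub where a real system would call a model or a tool.

```python
import asyncio


async def run_subagent(task: str) -> str:
    """Stand-in for a sub-agent; a real one would call a model or a tool."""
    await asyncio.sleep(0)  # yield control, as real I/O would
    return f"done: {task}"


async def delegate(tasks: list[str]) -> list[str]:
    """Start every sub-task at once and wait until all have finished."""
    return await asyncio.gather(*(run_subagent(t) for t in tasks))


results = asyncio.run(delegate(["research demos", "check slides", "draft outline"]))
print(results)
```

`asyncio.gather` returns the results in the order the tasks were passed in, so the caller can hand off a batch of work and collect everything in one place afterwards.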
So I talked about coding, and there's a whole bunch of obvious productivity tools.
Rebel doesn't yet have an email address, but that's obviously where it needs to go.
It has an inbox so that you can sort of
run through things and be like, "Okay, great, deal with that.
Yeah, tell them
yes, I'll come to
that.
Tell them no, but be polite.
Yeah, can you let so-and-so know?
Add a Linear ticket
for that.
Oh, no, we're not going to deal with that today.
Come back next week."
That's
how I want to work with AI.
I want to be able to set automations.
Okay, every morning, check
all the calendar events I have today, and then do some meeting prep so that when I walk
into the meeting, I know who I'm about to talk to.
What are the five questions that
we've agreed as a company that we should make sure we cover with them?
And often, I'll go
into a meeting with somebody I've never met, and I'll ask all the right questions because
there's a shared memory of all the other people in the company that have already met them,
all the transcripts, and Rebel's basically like, okay, I think what we identified at
the end of the last meeting is we need to push on these two things, and they'll probably
ask about this, that, and the other.
I'm like, okey-doke, I feel ready for this.
And it's
as if I've had a briefing from Josh, the CEO, before every meeting without actually
having had to waste his time just with information dumps, so the information just kind of diffuses.
And then yeah, some other stuff that we can talk about more.
Okay.
So maybe demo, like
let's actually see this in action rather than just talk.
And there's an old saying in show
business that you should never work live with children or animals.
I would
suggest that you add AI to the list of things never to do live.
It's only slightly less biddable than my cat.
So why don't we start with, can everybody see this?
Yeah, okay, lovely, lovely.
And by the way, this is Rebel, in case you were wondering.
Rebel's voice first.
Hi there, I'm doing a live demo of Rebel at the Oxford Mindstone community event.
I wonder if you could help me.
Now, so in this case, I did prepare a skill earlier.
And if you want to see, I can show you the conversation I had where I said, okay, I want to do a live demo.
What are the kinds of, do some research on what makes for a good live demo.
And go through and find out what other demos have other people done.
And what kinds of features are we keen to showcase?
Oh, and check my presentation so that you can see what I'm about to talk about.
And let's try and incorporate it.
And let's make sure there's something fun at the end and make it personalized.
And I was just like, okay, do all that.
Let me know when you're done.
And we're about to run this for the first time, so we'll see how it goes.
Okay, so who's the audience?
How long?
Okay, fine.
So the audience, they seem
pretty clued in.
It seems like everybody has used all the major models, so they
seem like very AI-curious but thoughtful people.
I'd say let's keep it to 10 to 15
minutes because I've also got some slides. And, yeah, okay, maybe I will
also make it interactive and just ask: is there anything that would impress you? Is
there anything that I could show you that would make you think, "Oh yeah, this
would actually be useful to me in my work"? Anybody want to shout out: what do
you think AI is missing right now, such that if you had a tool like this
that could actually do it, it would be useful to you?
That's a great answer.
Okay, good.
Anybody want to throw anything else into the mix?
Okay, good.
I think my short-term memory
is going to struggle with this.
Let's see.
And I also asked the audience, what's one
thing they would love to see from an AI tool like Rebel?
And we got two or three good answers.
One was that it should remember state, so it should know what the current thinking is;
it should be influenced by new input, so that it knows what's gone on recently and is
up to date.
And then the second thing was about knowing who else in the organization
might have already done a good job of this, maybe that we should keep in contact.
What
else is going on that we should take into account?
I might have missed, you might have
said one more thing that I missed, but, yeah, okay.
All right, great.
And so it's going
to cogitate for a while.
So let me just talk you through what it is that we're seeing here.
So the main interface I think will be familiar to all.
It's a sort of straight conversational
interface, but there's a few little bits and bobs.
So for instance, one of the things it's doing,
it's trying to keep track of what value it thinks it's adding.
So after every conversation,
it does an estimate of like, okay, how much time have we gained by doing this?
So that over time,
we can try and actually estimate ROI.
It also tries to link it to my goals, so it knows what
I care about, what my quarterly aims are, and it's trying to figure out, okay, am I
spending my time on high-impact stuff? I'm rather chuffed to notice that the little orange
icon suggests that I am spending lots of time on high-impact stuff.
It's good.
It's not
always orange.
I should say that Mindstone is a training company.
So, as I mentioned,
we've trained, I think, five of the Fortune 100 C-suites.
Tens of thousands of people
have gone through our AI competence program, which is designed to go from sort of zero
to at least fairly heroic with AI prompt engineering.
And what we're trying to do with Mindstone's
own Rebel is actually incorporate that, because the boundary between learning and doing is
blurring, right?
I want to be able to learn as I do and do as I learn.
And so we're starting
to just play a little bit around that, and we're incorporating a lot of the best
stuff from our training into Rebel.
And so often I'll get coaching tips like, "Well, you've
done the same thing seven times in a row.
Have you thought about turning this into
a reusable skill?"
I'm like, oh, that's a good idea.
I do do this every day.
Yeah.
Okay,
great, let's do that.
And it's as simple as saying to Rebel, by the way, let's create a skill for
demos.
And then it'll be like, okay, well, what should go into your skill?
And you just have a
chat for a while.
And it says, okay, I've created a skill for you.
And I've stored it in the shared
Mindstone space, so everybody else can access it too.
And we're like, great, well, I'd like a
customized version for me, where I want to make sure that every demo starts with you congratulating
me on my haircut.
And so my customized demo skill will look ever so slightly different from the
base one.
So we get this notion of shared and yet individualized.
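One way to picture "shared and yet individualized" is as a base skill with a personal layer on top. The sketch below is only an illustration under that assumption; the dictionary layout and the `resolve_skill` helper are hypothetical, not Rebel's actual skill format.

```python
BASE_DEMO_SKILL = {
    "name": "demo",
    "steps": ["ask about the audience", "propose a demo flow", "end with something fun"],
}

# Personal overrides are stored separately so the shared skill stays untouched.
MY_OVERRIDES = {"opening": "congratulate the presenter on their haircut"}


def resolve_skill(base: dict, overrides: dict) -> dict:
    """Return the shared skill with one person's customizations layered on top."""
    return {**base, **overrides}


my_skill = resolve_skill(BASE_DEMO_SKILL, MY_OVERRIDES)
print(my_skill["opening"])
print(BASE_DEMO_SKILL.get("opening"))  # None: the shared version is unchanged
```

Keeping the overrides separate means an improvement to the shared skill reaches everyone, while each person's tweaks still apply on top.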
Make sense?
Yes, please.
Great question.
And I bet we can do better.
But the short answer is after every single
conversation, we do a little Haiku run, which has a careful prompt that says, okay, let's
try and estimate how long this would have taken if you were to do it manually.
But there are
so many tricks to this, because it's easy to end up with something hyperbolic. Like, the first
version would be like, "I saved you nine hours."
I'm like, yeah, but that's because I was debugging
a thing that I wouldn't have had to do if I wasn't
using AI.
So, like, I'm not sure you actually saved me any time there.
So, we went through
many iterations to get to the point where actually our customers were like, I think
it saved much more time than it says it saved.
You know, I'm like, great, let's have it be
that way around.
So, it's a prompt that we're evolving.
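The talk describes this estimate as coming from a prompt, not a formula. Purely to make the "nine hours" pitfall concrete, here is a sketch of conservative accounting; the function and its parameters are hypothetical and are not Rebel's method.

```python
def net_minutes_saved(manual_estimate: float, minutes_with_ai: float,
                      ai_only_overhead: float = 0.0) -> float:
    """Conservative estimate of time gained in one conversation.

    manual_estimate: minutes the work appears to be worth if done by hand.
    minutes_with_ai: minutes actually spent in the conversation.
    ai_only_overhead: minutes of work that only existed because AI was used
        (for example, debugging the tool itself); it is removed from the
        manual estimate because it was never real manual work.
    """
    genuine_manual_work = max(manual_estimate - ai_only_overhead, 0.0)
    return max(genuine_manual_work - minutes_with_ai, 0.0)


print(net_minutes_saved(120, 30))                        # 90.0
print(net_minutes_saved(540, 60, ai_only_overhead=540))  # 0.0
```

Flooring at zero errs on the side of under-claiming, which matches the stated preference for customers feeling that the tool saved more time than it reports.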
And as I say, for the impact estimates,
when you onboard with Rebel, it'll have a chat with you about: what are your
goals?
What are your values?
What are you trying to achieve?
And all of that feeds into it,
and of course at a company level as well.
Okay.
So what have we got?
Okay.
Rebel thinks
you're a sharp and savvy audience.
I guess I'll be the judge of that.
And so we talked
about okay, great.
So it's proposing a demo flow to me.
All right.
So let's start with
a question.
So this speaks to your question about state.
So what
have you been up to with me?
What have you been
helping me with over the last day?
Have you been, you know, what have we got up
to?
What have our priorities been?
And has anything changed over the course of the
day that we might have changed our views about?
And I think one of the things that
you may notice about voice, perhaps you already know this, is I think we ramble a
bit more. Like, if I was to type, I would like to imagine I would come
up with something a little bit more crystalline and clear than that, but with voice we provide a lot
more context.
It's just somehow lower effort, faster. And because the key thing for effective AI
use is almost always your inputs, the context,
using voice just means that you end up providing more background, and the LLMs are really good
at making sense of human kind of meandering chat.
And so I suspect this will take a couple of minutes, because you can see what it's
doing: it's looking through all of my previous conversations. Interesting, it didn't
like one of those. It's probably checking through a whole bunch of files. So we're
using a semantic index, so that all of the memories, shared across my
personal life, my work life, my consulting company before, my previously co-founded
companies, Rebel knows about all of them, and it's quickly doing a semantic index
and scan over all of those.
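To give a feel for what a scan over a semantic index involves, here is a toy sketch. It assumes a bag-of-words vector as a stand-in for a learned embedding, so it matches on shared words rather than on meaning; a real semantic index would use an embedding model. All names and the sample memories are made up for illustration.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def search(query: str, memories: dict[str, str], top_k: int = 1) -> list[str]:
    """Return the ids of the memories whose content is closest to the query."""
    q = embed(query)
    ranked = sorted(memories, key=lambda mid: cosine(q, embed(memories[mid])), reverse=True)
    return ranked[:top_k]


memories = {
    "meeting-notes": "pricing discussion with the sales team",
    "family": "plan the summer holiday with the kids",
    "consulting": "draft proposal for the consulting client",
}
print(search("what did we decide about pricing", memories))  # ['meeting-notes']
```

The ranking step is the same whichever vectors are used: score every stored memory against the query and keep the closest few.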
So I expect this will take a while.
It's probably using a bunch of sub-agents,
so we'll come back to it in a moment.
It's a great question.
Okay, so maybe I'll just talk a little bit more
and then we can come back to our demo when it's finished,
just to keep things moving.
All right.
I guess there's one thing that I will say
that has been, in some sense,
the most just kind of astonishing part of working on this.
Hands up, anybody ever written a line of code?
It is an effortful business, I think you will agree.
Incredibly satisfying.
A source of
great joy.
The craft means a lot to me.
But, you know, it's hard work.
If I tell you that
the Rebel code base is currently 400,000 lines, anybody want to estimate how many person-years
that would normally have taken a human?
I did the numbers.
Call it 75 lines of code
a day for a professional developer on a code base this size.
We're looking at 20 person-years
of work.
So, for a team of about four or five people, this has been literally three
months to the day.
So, instead of four years for a team of our size, we've done it in three
months.
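The back-of-the-envelope numbers can be reproduced as follows. The line count and daily rate are the figures quoted in the talk; the 260 working days per year and the team size of five are assumptions added here to make the arithmetic explicit.

```python
LINES_OF_CODE = 400_000        # figure quoted in the talk
LINES_PER_DEV_DAY = 75         # figure quoted in the talk
WORKING_DAYS_PER_YEAR = 260    # assumption: 52 weeks x 5 days, no holidays
TEAM_SIZE = 5                  # assumption, from "about four or five people"

person_days = LINES_OF_CODE / LINES_PER_DEV_DAY
person_years = person_days / WORKING_DAYS_PER_YEAR
calendar_years = person_years / TEAM_SIZE

print(f"{person_years:.1f} person-years, {calendar_years:.1f} calendar years for the team")
```

Under these assumptions the result is roughly 20 person-years, or about four calendar years for a team of five, which is consistent with the figures given.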
And that's not to say it's been easy or that it's been plain sailing.
Rebel has
bugs.
It's still in some sense an early stage.
But, like, no joke, 400,000 lines of code.
That is a meaty code base.
And it's probably the most complex software I've ever worked on.
And at this point, there's a whole bunch of stuff going into this, so I'll happily talk about how we're doing this.
Amongst the tricks, I'll suggest two or three.
One, as programmers, we do not read the code.
We certainly don't write it.
We don't review it either.
Instead, we have a kind of council of a half dozen different AI models, each one of which has its own lacunae, blind spots, peccadilloes, biases, right?
And they're each quite good at finding different problems. By the time it's been through a six-
way review, often multiple rounds until they're all satisfied, I'm not saying it's bulletproof,
but it's often pretty good. And so at that point a human review is adding very little.
So where we put a lot of effort in is the upfront design conversation, the requirements
gathering, the thinking, the intention. And then we have this enormous workflow that goes all the way
through planning and review and research and investigating the code base and
documenting and testing, and by the time it's been through all that, we just hit
deploy.
Yeah, great question. So we attempt to be model agnostic. The AI is mostly
Anthropic, but you could swap in local models, you can swap in OpenAI, you can
swap in a whole bunch of things. In terms of the voice, again, we default to OpenAI
because they have the best models. They support more languages than I can count,
and they do very, very well even with non-native speakers, so I invite you to
come over and try it later. You can put on your most silly accent and it'll work
brilliantly. And if you don't like that, you can use ElevenLabs, or indeed we have a
local model as well, which is super quick but just not quite as good. And then on top
of that, as you use Rebel, it automatically updates a kind of
vocabulary of terms and names and concepts and weird words that you
often use, so that it gets those right and spells them correctly.
So how secure is it, do companies admin?
Yeah, great question.
So at the moment the answer is no.
We're trying to strike a balance though.
So we need to provide some kind of reporting to companies,
probably at the level of usage, maybe time saved and impact.
But probably not in terms of the actual conversations.
And indeed, we'll talk a bit about memory in a moment.
But I think that's a really good question.
I'm sorry, lady at the front first, if you don't.
Yeah.
Did we do it in Rebel?
No, and there's a good reason.
I don't think Rebel is the state of the art for programming.
In fact, we haven't built it for that.
The system prompt is aimed at knowledge work.
And so we use Droid, which is a bit like Claude Code, and
that's what we use for development.
However, if you think about the job of a software engineer, well,
number one, it's changing,
because we no longer actually have to write the semicolons ourselves.
So the job of a software engineer is moving up the value chain.
It's becoming much more the job of being a kind of product or research engineer, and you spend a lot
more time coordinating, planning, prioritizing, thinking about product, communicating with
other people, pulling from Slack, you know, and thinking about the wider stuff. And so Rebel's
great for that, but the actual business of, you know, curly braces and semicolons, we leave that
to a different piece of software.
It's a great question. I think the short answer is we're very
early in our journey, at least for Rebel. In terms of programming, I think we've picked the best
models around, and the key is Opus as the orchestrator, because Opus is wise. But,
you know, GPT-5.2, for example, it's like one of those... I don't know
if you've ever hired a decorator or a plumber, and they walk in and they have
a look and they're like, "This was put in by cowboys. I can see a half dozen problems
that you're gonna have. This is gonna be expensive to fix." So basically every time
you ask GPT-5.2 for a review, that's what you get back, right?
Whereas Gemini's like,
"Oh, this is pretty good, yeah, it's great."
And so Opus is getting all of this different feedback
and is busily synthesizing, noticing patterns, double-checking.
It's like, "Well, they pointed
this out, but actually I don't think it's a problem.
And a bunch of them suggested this,
and they're probably right."
And so at the moment, we're kind of just relying on Opus to be a wise
adjudicator with lots and lots of careful prompts.
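As a very rough illustration of combining several reviewers with different temperaments, here is a sketch. In the talk the adjudication is done by a model using judgment and careful prompts; the sketch swaps that for a simple quorum vote, and the reviewers are plain stub functions, so every name here is hypothetical.

```python
from collections import Counter
from typing import Callable

Reviewer = Callable[[str], set[str]]  # takes code, returns the issues it found


def strict_reviewer(code: str) -> set[str]:
    """Hypothetical reviewer that flags a lot, including style nitpicks."""
    issues = {"style: long function"}
    if "eval(" in code:
        issues.add("security: eval on user input")
    return issues


def lenient_reviewer(code: str) -> set[str]:
    """Hypothetical reviewer that only flags serious problems."""
    return {"security: eval on user input"} if "eval(" in code else set()


def adjudicate(code: str, reviewers: list[Reviewer], quorum: int = 2) -> list[str]:
    """Keep only the issues raised independently by at least `quorum` reviewers."""
    votes = Counter(issue for review in reviewers for issue in review(code))
    return sorted(issue for issue, count in votes.items() if count >= quorum)


accepted = adjudicate("result = eval(user_input)", [strict_reviewer, lenient_reviewer])
print(accepted)  # ['security: eval on user input']
```

A vote like this discards the lone nitpick and keeps the issue both reviewers agree on; a model acting as adjudicator could also overrule a majority or accept a single well-argued finding, which a fixed quorum cannot.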
Have you done the test?
Not yet.
Yeah.
Okay.
It's a really great question.
And presumably if you were to add 100 reviewers and
a bunch of them are actually adding noise, then you'd be better off with fewer.
So there's a balance to strike, and I don't yet know where it is.
I suspect it's probably more than one or two, and less than ten.
And at the moment, I would say GPT-5.2, Opus, Gemini, and
maybe one of the open source ones, I think that's probably the sweet spot right now.
Opus, yeah, Claude Opus, yeah.
Opus is, I think, the best at orchestrating, the best at tool use,
the best at, yeah, it's just wise.
But I think it's not quite as fastidious and sharp as GPT-5.
So every so often they'll do something where you're like, oh, come on, hang on, there is a much simpler way
here.
But there's a trade-off.
So if a human was heavily involved at every point, you probably
would get better judgment, I think, especially if it was a wise...
But at this point, we're
going so fast that none of us can keep up with the code base.
It's very, very hard to
build a mental model of the code base.
And indeed, I think our job right now is to optimize
the factory rather than the output.
So we're building a factory for building software,
and that's where I put my energy.
It's into the system prompt, it's into the
workflows, and I'm kind of hands-off when it comes to the code, which is an
anxiety-provoking state of being, to be the CTO and not really have your arms
around the code base.
But like I say, I think we're all ascending up the
sort of chain of being, and there's no room to be hands-on.
So we're actually
trying to pull humans out of the mix and only use them where their judgment is most valuable.
Because, I mean, it might be slightly better code, but if we'd
only be three times faster, is it worth it?
Okay.
Okay, I'm gonna keep moving. Have I got a clock? I wanna make sure I don't go too long.
Oh, okay.
So, I will very briefly address one of the things that you're probably thinking about,
which is, okay, how does Rebel compare to Claude Cowork and OpenClaw?
And they both have strengths, and they're both interesting.
I guess in terms of OpenClaw, I mean, it's glorious, it's fun.
It's amazingly inventive, and
I love how it's catalyzed so much discussion.
And it innovates in a bunch of ways.
But right now, I certainly wouldn't use it in a company setting.
And I haven't been willing really to let it loose on my own personal data either.
I'd want to kind of really sandbox it.
And Claude Cowork, it's glorious, it's simple, it's minimalist, it's got a nicer
onboarding than Rebel, it's much easier to understand, but it feels a bit like a sterile
empty room.
And I can give you some illustrations in a minute.
Try it for yourself and see if
you disagree.
So I would suggest some of the things that a really, really effective agent
needs.
It needs a shared memory that coordinates across the company.
It needs to be able to
work across your entire digital life.
So, yes, I work for Mindstone, but I also have
a personal life, I might have a family life, I might have other consulting companies, I
might have voluntary work, each one of which has its own email addresses, its own connectors,
its own Notion workspaces, and somehow an effective agent needs to be able to work across
them, because these aren't really separate silos; work leads into personal life in all kinds of ways.
So that's built in right from the get-go into Rebel, and it substantially complicates what
the agent has to manage and required us to do an enormous amount of work on the safety
side, and I'll talk about that in a second.
Finally, Rebel is accumulating all these memories, and it really feels like it knows me, right?
It feels like it's got all these sort of mementos and information about me,
so it feels like a kind of companion that's known me for a while.
I will briefly touch on some of the inspirations from neuroscience.
In the brain, there are many memory systems, but let's talk about two.
One is the hippocampus, which basically is like taking snapshots, high-fidelity
moments captured in detail.
And then the posterior cortex is kind of averaging over,
abstracting over lots and lots of examples and seeing the patterns.
And so one of the
first things that we built into Rebel was this idea of: let's have specific sources, every single
meeting, and then the topics that reference those as evidence.
Does that make sense?
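The two-level idea, detailed sources plus topics that cite them as evidence, can be sketched as a pair of record types. This is an assumed shape for illustration only; the class names, fields, and sample data are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Source:
    """A single high-fidelity record, such as one meeting transcript."""
    source_id: str
    text: str


@dataclass
class Topic:
    """An abstraction over many sources, which it cites as evidence."""
    name: str
    summary: str
    evidence: list[str] = field(default_factory=list)  # ids of supporting sources


sources = {
    "mtg-1": Source("mtg-1", "Client asked about onboarding time."),
    "mtg-2": Source("mtg-2", "Client asked again how long onboarding takes."),
}
topic = Topic("onboarding", "Client repeatedly raises onboarding time.", ["mtg-1", "mtg-2"])

# Follow a topic back to the specific records that support it.
for source_id in topic.evidence:
    print(sources[source_id].text)
```

Storing ids instead of copies keeps each topic summary short while leaving the original detail one lookup away.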
And then semantic
search so that it can find stuff by what it's about rather than by, you know, a particular
file name.
Rebel's in every meeting in one form or another.
We have our own
note-taker, or you can just use Fireflies or Fathom.
Let's talk about safety for a
moment.
So if in a moment of madness you ask Rebel to email the nuclear codes
to 4chan, at some point Rebel will stop and be like, "Hang on, are you absolutely sure
you want me to send that email to your boss?
Let's just
check that that really is what you want to do."
Oh no, wait a second, okay, let's
not.
I think the other thing is in terms of the memory. Because we have this
option of sharing memories, the first thing it does is it writes to a private
space, and it says, "Okay, I actually think this should be shared with the company;
it doesn't involve salary or HR data or anything like that. I think this is safe
to share. Is it okay if I move it into the company?"
And you can then be like,
"Actually, I think I want to keep this one private," or, "Yep, move it over."
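The private-first flow described here can be sketched in a few lines. The keyword check is a deliberately crude placeholder and the function names are hypothetical; this is only meant to show the order of decisions, not how Rebel classifies memories.

```python
SENSITIVE_TERMS = ("salary", "hr", "disciplinary")  # illustrative list only


def looks_sensitive(memory: str) -> bool:
    """Crude keyword check; a real system would need something far more careful."""
    words = memory.lower().split()
    return any(term in words for term in SENSITIVE_TERMS)


def propose_sharing(memory: str, user_approves: bool) -> str:
    """Decide where a memory ends up: it starts private and moves only with consent."""
    if looks_sensitive(memory):
        return "private"  # never even proposed for sharing
    return "shared" if user_approves else "private"


print(propose_sharing("Client prefers Tuesday meetings", user_approves=True))       # shared
print(propose_sharing("Client prefers Tuesday meetings", user_approves=False))      # private
print(propose_sharing("Discussed salary review for the team", user_approves=True))  # private
```

The default in every branch is "private"; sharing requires both a clean check and an explicit yes from the user.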
We talked
about the multi-model council a little bit. We talked about how it's designed
for companies, so there's this notion of automations that run at scheduled times:
every morning, prepare for my meetings today, capture recent transcripts, think
about blah blah blah. So I have a whole bunch of these automations that run
every day. And then we talked a little bit about how it's measuring time gained.
I think the final thought comes back to your point about state. This is actually
a photograph of the same conversation I had with Claude Cowork. I said,
"What have we talked about recently?"
It's like, well, it's the start of the conversation.
Yeah, okay, it's the start of the conversation.
But what about our other recent conversations over the last day?
It's like, oh, every day is a new day for me.
I'm like, well, okay, that must be nice for you.
But whereas if you ask Rebel, okay, what have I been up to over the last day?
It's like, well, we've had all these conversations.
There was this and there were those meetings.
And then you created this skill and then you had those events.
I'm like, yeah, now we're talking.
Because then I can say, ah, there was something that came up yesterday
about the blah, blah, blah.
Can you remember?
So Rebel can self-diagnose.
It has a sort of notion of introspection.
We talked about sub-agents.
We talked about how Rebel is a coach that tries to train you and help you kind of ascend on your journey to becoming an expert.
And then we have a whole bunch of built-in skills.
So these are ones for investors, for example.
And you can just ask it, hey, do you have any skills that will be useful to salespeople or useful to a researcher?
And it'll happily tell you about them. I'm happy to talk more about this,
but I'll just suggest that these are some of the theses that
underlie the way that we've been building. That software is not enough: you
have to be thinking about transformation, especially if you work in a group, where
you have to be thinking about training, you have to be thinking about people's
emotional resistance, for all kinds of good reasons, you have to be thinking
about the frictions and making sure that Rebel has access to the tools it needs,
or whatever software you use. So there's a whole bunch around
transformation requiring enablement and training. That with AI, we are no longer
individual contributors, because that's not where our value is: we are managers
of AI colleagues, and all of the skills that good managers have, we have to be
learning them. We have to be learning how to communicate clearly and think about
what's reusable and break tasks down into sub-pieces, and all of the work that
good managers do. That using AI effectively requires a change in mindset.
Instead of like
trying to work with just one conversation at a time, back and forth interactively, I'm
trying to bite off big tasks, work asynchronously and often in parallel and just say, hey, go
ahead and do this.
You know how.
Let me know when you're done.
And that in truth, I suspect
that being willing, being effective, multiplying your personal productivity requires
spending some money.
So, we probably spend somewhere between $10 and $30 per day per
person.
So, $30 is the upper end.
And that mounts up.
So, what is 30 times 200?
Okay,
pretty quickly that's a few thousand dollars.
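Spelling out that multiplication, with the 200 taken as the number of working days (the per-day range and the multiplier are the figures quoted; treating 200 as days per year is an assumption):

```python
WORKING_DAYS = 200                  # the multiplier used in the talk
LOW_PER_DAY, HIGH_PER_DAY = 10, 30  # dollars per person per day, as quoted

low_annual = LOW_PER_DAY * WORKING_DAYS
high_annual = HIGH_PER_DAY * WORKING_DAYS
print(f"${low_annual:,} to ${high_annual:,} per person per year")
```

That gives a range of $2,000 to $6,000 per person, so "a few thousand dollars" holds across the whole range.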
But if you have multiplied, like, multiple
times your effectiveness, I mean, that's almost certainly an easy value proposition.
And I
expect that we're going to increase the amount that we spend.
And
that's the difference between someone who's like, yeah, I'm 5% more effective because
I have a slightly better Google, and someone who is able to just work at a whole other
level because they have a team helping them.
I'm conscious of time, so I'm not going to rabbit on.
I would happily go back to the demo at some point in the future and show you where I was
going to take it, send a personalized email to everyone in the room, do research on a
random volunteer, and create a little personalized presentation directly for them.
These are the kinds of things that we're routinely doing, that our sales team are doing, our marketing team, our product team, and so forth.
So for now, I'll just say I would love to talk to you about this.
I would love to hear about experiments that you've tried that you feel actually could help us and inform us and to hear about anything that doesn't sound right and have a healthy debate.
So for now, I just want to say thanks ever so much for your attention.