This is how I build with AI

Introduction

Hello, everyone.

My name is David Beatty.

I'm a computer science graduate from Oxford.

Speaker background and context

Yeah, I'm going to run through my development workflow.

Yeah, and as Drummond mentioned earlier, I was lead developer at Shutl, which was a startup in London that was acquired by eBay.

And then Fraser here and I created a new SaaS product called Nickelled, which was bought in 2021 by Omniplex Learning.

The AI inflection point in software development

And then obviously the world changed in 2022 when ChatGPT came out, and that was a mind-blowing change to software and how computers are used.

So I briefly took a course at Oxford on how LLMs are made.

And since then, there's just been loads of changes, at least on the development side, on like how we go about coding.

And basically for 20 years, coding didn't really change.

I mean, technologies came and went, frameworks went in and out of fashion, but it was basically hand coding in IDEs, integrated development environments, which are the software that developers use.

You hit some problems, you do some Googling, you might go to Stack Overflow, which is a forum for engineers where people can post questions and other people post answers. And this was kind of how you'd go about debugging problems.

You'd probably go to the help docs of different sites. Maybe you're trying to do a Stripe integration and you're looking through these hundred documents trying to work out why your integration is not working. That was kind of my life for 20 years, but then it's been rapidly changing.

Evolution of AI-assisted coding

From autocomplete to copilots

Maybe a few years ago, we started getting smart autocomplete.

So you'd start writing out a function and it would prompt you with what the implementation of that function could be, and then you just hit tab and it's implemented, kind of like autocomplete on your phone when you're WhatsApping or sending an email.

Then there was a co-pilot period where it was like pair programming.

I don't know if you guys are familiar with that term, but it's where you work with another developer on the feature.

You had a genius pair programmer where you could ask it questions.

You could say, how do you think we could refactor this function?

What do you think this function is doing?

Is there a better way we could do X, Y, and Z?

And that kind of lasted like a year or two.

Entering the agentic era

And now we're kind of moving more into the agentic era, which Glenn was mentioning earlier, where you can literally just give it one prompt and it will go away and implement it.

Well, kind of: if you don't guide it, it implements it how it sees fit.

So that's what's been going on, why we're talking about this, and how things are changing really rapidly, like Glenn has said.

My current AI tool stack for coding

So this is kind of a breakdown of my tool usage from an AI point of view when I'm coding, and we're gonna go into each of these tools.

Claude Code in the terminal

So there's Claude Code, and that's the main coding platform that I use.

It's in the terminal, and we'll have a look at that.

That's probably like 50% of my time.

And the benefits of Claude Code are that it's really quick and reliable, with a high-quality, strong model that has great taste.

So some of the design stuff that Glenn was showing you in his application looks really nice, but he probably didn't even have to say, make it look nice because it has great taste built in.

And it's good generally at problem solving.

Cursor: AI inside the IDE

Then there's Cursor, which is built into, well, it's a fork of VS Code.

And then they've added loads of AI integration on top of that.

And that has benefits: it's integrated into the development environment that developers are familiar with, and you get good visibility of the whole code base, so you don't lose that overview of the structure of your code.

And you can choose the model as well, which you can only do to a limited extent in Claude Code because you only have the Anthropic models.

OpenAI Codex: asynchronous VM-backed tasks

Then I'm also going to kick off a task in Codex, which is by OpenAI, and the great thing about that is it runs in a VM, a virtual machine, on the server.

So it's something you can kick off in the background.

So I always find I have more ideas than time in the day.

So before bed, I could kick off some tasks.

Or before I walk home from work, I'll just kick off a task to run in the background.

And then by the time I'm at home or awake, I can check out the changes that it's proposing.

There's some other benefits as well, which we'll look at when we go to Codex.

Vercel v0: UI generation with built-in taste

And then probably the one that I use least is v0, which is by Vercel.

But I think it's quite a useful product, because the Vercel guys built Next.js, which is a popular JavaScript framework that hopefully some of you are familiar with, and they have built in excellent taste.

Behind the scenes, I think they're using Sonnet 4.5, but they're injecting extra context. You might ask for a login form, for example, and as part of that, I think they inject good examples of what a login form would look like so that the output is even better than you'd get from just your prompt.

Live walkthroughs and best practices

So we'll just quickly run through some examples of that, and also some of the best practices and problems that I've seen with people using AI coding. Obviously this is a little bit technical, and we're going to be looking at the terminal and an IDE, so if there are any questions or you're getting lost, then just shout out and I'll do my best to explain what's going on.

Using Cursor: ask, plan, and agent modes

So this is Cursor, which is probably the most familiar for developers.

And you can see if you click this button in the top right, you can get the AI Coder, Copilot app.

And if you click new chat, you'll get a few options.

This is agent mode where it's going to go away and attempt to program something on your behalf.

There's ask where it's not going to make any changes.

And there's a new area called plan where you can plan out the changes that it's going to implement.

So you could say something like... We're running a product called leads.new, and we know pay-per-click traffic is working.

So let's say we wanted to expand our SEO presence, now that we know search traffic is working.

So you could say, "Review the app marketing site and come up with ideas for landing pages to attract traffic."

And here's where you can choose the different models.

So these are the Anthropic ones.

These are the OpenAI ones.

Here's the Google one.

And here's the new one, which I haven't tried yet, which is the Cursor one that Glenn mentioned earlier that literally came out in the last day or so.

Why planning improves outcomes

Yes, and then it's going to come up with a plan.

One of the things I've heard from people who are really struggling to adopt AI-based development workflows is that the AI came up with a bad result, or they're not happy with it, or they only got rubbish output.

But I think if you take the time to do the planning in advance, you can get alignment between yourself and the LLM.

You can review that plan, and then when you're happy with the actual plan of the implementation, then you can say, okay, let's hit it.

Let's actually code it.

For example, I've got a plan up here.

So we can just see it, and I've told it to implement it, but we'll just have it do that as well.

And because the planning is such a critical step, they're now building planning into these solutions.

Literally two or three months ago, this wasn't here.

And you'd have to like use your own custom planning software and approach.

So there.

Yeah.

Executing a planned change

Let's just say, "Can you implement the plan @marketing?"

If I can spell

We'll just put this for the moment and see if it goes ahead and does that.

That's basically just adding this plan that we developed and we got alignment on to the context as well as can you actually implement it.

It'll go away and do that.

You can also add a best practices document or whatever you want.

It's so fast.

I haven't used this one before.

Let's not celebrate.

Let's see what comes out the other end.

Yeah.

Grok Code is also super fast.

Did you try Grok Code?

So that's kind of Cursor, and now we'll move on. Like I said before, it's got the structure of the code base over here, so you can still jump around the files and find what you're looking for, and it's also better at fuzzy search and pulling in relevant context and things like that.

Using Claude Code from the terminal

Then we'll go into Claude, because this is a bit of a bigger step away from what developers are used to.

So it's in the terminal: you just type claude.

And then this is Claude running.

And then you can basically... there's some cool fans, I know.

And then you can say to Claude, well, there's different things you can do.

Reasoning, auto-accept, and planning modes

So you can turn on reasoning by just pressing tab.

And hopefully, most of you are familiar with reasoning models.

But basically, there'll be some reasoning tokens where it thinks through the problem before it starts on the output tokens.

So you're more likely to get a better solution or higher quality output. As you can see, I'm approaching my weekly limit, so I'm not going to turn on the thinking at the moment. You can also press Shift+Tab to auto-accept edits, which basically means we're not even going to review the output, we're just going to hit it.

Or you can go into planning mode, which is what I was talking about before. It's really important to do the planning so that you have the alignment, especially if you're working on big and complex features. That's when you really want to get the alignment done in advance; otherwise it's just going to have a crack at it.

And because they're stochastic, like Ben was saying, let's say I ask it to do a login form.

It might do an email and password like 70% of the time, but maybe 20% of the time it chooses OTP, where it sends a code, and 10% of the time it will auth with Facebook or something like that.

So this is why you need the alignment in advance.
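
To make that concrete, here's a minimal sketch that treats an unguided prompt as a random draw from a few possible designs. The 70/20/10 weights are just the rough illustration from the talk, not measured behaviour of any model, and `sample_login_design` is a hypothetical stand-in for "ask the agent to build a login form".

```python
import random
from collections import Counter
from typing import Optional

# Illustrative weights only: the rough 70/20/10 split mentioned in the talk,
# not measured behaviour of any real model.
LOGIN_DESIGNS = {
    "email_password": 0.70,
    "one_time_code": 0.20,
    "facebook_auth": 0.10,
}


def sample_login_design(rng: random.Random, plan: Optional[str] = None) -> str:
    """Hypothetical stand-in for asking an agent to build a login form.

    With no plan, the design is drawn at random from the weighted options.
    With an agreed plan, the choice is pinned down before any code is written.
    """
    if plan is not None:
        return plan
    designs = list(LOGIN_DESIGNS)
    weights = list(LOGIN_DESIGNS.values())
    return rng.choices(designs, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so the run is repeatable
    unguided = Counter(sample_login_design(rng) for _ in range(1000))
    planned = Counter(
        sample_login_design(rng, plan="email_password") for _ in range(1000)
    )
    print("unguided:", dict(unguided))
    print("planned: ", dict(planned))
```

Run it a few times with different seeds and the unguided counts move around, while the planned run always lands on the design you agreed up front.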

And let's just have it go ahead: "Can you implement the..." planning, marketing, Claude landing pages.

So I kind of separated out the landing pages.

So this is now running in the background.

I'll just tell it to auto accept edits.

Letting the agent work across your codebase

And Claude Code can do all the things a developer can.

So it can search in files, it can find files, it will read the files, and then it will start making the changes.

And we can just leave it doing this stuff.

And you can see that's what it's doing now.

It's learning about my code base, understanding the task that it's been assigned.

And then we can go into Codex.

Using Codex with GitHub repos

So hands up if you've got a ChatGPT subscription, like the $20 one.

Great, because basically with ChatGPT you get Codex. It's just here.

And then with Codex, you can connect your GitHub repo, which is where you put your code.

And once it's connected up, you can give it a task.

So, "Please can you come up with three landing pages for leads.new." And then I might have some preferences around these landing pages, so I might say use the Jobs to Be Done framework, or what's that brand one you like?

Brand story framework.

StoryBrand, StoryBrand, whatever.

So as long as you specify it, you're probably gonna get a better output.

Parallel variants and branching

So we'll just kick that off. Oh, you can also branch it from a different branch if you want to, which is just a version of your code. But what's also really magical about Codex is you can get it to work on multiple versions in parallel, which is really powerful if you're working on marketing pages. Let's say it's your important landing page: you can say create three versions, and then you'll get to see the code for each in advance. So we'll just kick that off as well.

So now we have three agents working, although the Composer one was so fast it's already finished. This is the new one that Cursor released the other day. It was so quick. Let's have a look at the Cursor one.

Reviewing generated landing pages

So this is just running locally, and yeah, this is our application. For context, it's a lead magnet generator. I don't know if you're familiar with the term lead magnet, but basically, let's say you're launching a product: you probably want to create free mini apps to give away as part of your product and capture leads for your business.

So our software helps you do that.

And now it's created a few landing pages based on this plan.

Let's get the plan up as well.

And then we can go and check what it's actually produced.

So it's suggested we do it for real estate agents.

So we didn't have any of these pages before the start.

And you can see,

It's kind of created a nice looking page.

Really quickly and easily, it decided to create this.

I'd probably then go ahead and make some changes and brainstorm, I don't know, maybe different calls to action and things like that.

But you can see this is really quick to generate lots of relevant content for your site or business.

And that's what it's been building.

But normally I'd be building more complex features, maybe like, I don't know, like estimating the calories in the chicken or whatever, something like that.

I don't know why it's asking me about this stuff.

But Claude Code is still struggling and reading stuff.

It's looking at the marketing page structure.

It should start implementing it soon.

Let's go to...

Let's go to Codex and see how that's going on.

Codex tends to be a bit slower, just generally.

I think their model is slower than the Anthropic ones, but it's really good at problem solving.

So one of the benefits of Cursor is you can choose. Sometimes Anthropic might be under heavy load, the service goes down, or there's a particular problem the model is struggling with and just not getting.

So don't be afraid to go to Codex or go to Cursor and try a different model to try and solve your problem.

So you could then go to GPT-5 or GPT-5 Codex or whatever.

But it does tend to be slower.

But the great thing about Codex is, one, it's already in your ChatGPT subscription.

Nearly everyone has one.

And it's effectively free.

They're not charging you extra, so you might as well build loads of stuff in there right now.

Trying Vercel v0 for redesigns

Great, and the other site that I wanted to show you, while this is still working, is v0. It used to be v0.dev, but they've changed it to v0.app, because it used to be more of a component builder.

And if I just go to one of our sites, there's a threat analysis, whatever.

So this is just going to generate something. I just need a UI, so just ignore the site I'm on at the moment, it's just building. What I'm going to do is take a picture of this current UI, then put it into v0, because it has better taste, and ask it what it thinks I should build instead. I find that's a really great way of brainstorming ideas for UI and UX, because a lot of the time, if you're just rushing and building and building and building, the UX and UI can be a bit of an afterthought, especially if you're using a vibe coding product.

Whoops, don't want to go into that.

List of domains.

And then I'll go to v0.

I'll upload the image we just took of that page, which I'm not happy with.

And I'll say to it, "This is a lead magnet on our marketing site that lets people check if their domain is under attack.

Please can you redesign the page as a lead magnet?

Make it look a lot nicer and capture user details."

And it'll go away now and look at that image, similar to what Glenn was talking about earlier. It's going to build a whole new app, but if it didn't, it would create a React component (React is a JavaScript library).

And then we can import that component straight into our code base.

Or we could take another picture and tell Claude Code or Sonnet 4.5, "Look, this is our current design, but we want it to look like this much better version" that v0 should output.

So we'll just leave that running as well.

Let's just check how Claude Code is going.

Yeah.

When to use ‘vibe coding’ platforms vs local tools

And I guess some of the benefits of doing it this way: the vibe coding stuff, which is AI Studio, Replit, Bolt.new, Lovable, is really good for getting started.

So if you're new to software development and you haven't built web applications before, that kind of vibe coding platform is really accessible.

You can just go to that website, describe what you want, and it's gonna create a nice looking marketing site.

The problems I've found with them, I haven't used them in a few months, but they tend to become less effective as your application grows.

So they'll get to a certain point where they start grinding to a halt and they start really struggling to understand what you want it to do.

At that point, I would pull it down.

I'd put it into GitHub, the code repository, and then pull it down into one of these tools.

So I'd pull it down, then start using Cursor and Claude Code and so on once it started slowing down.

But I'd probably use something like that maybe to bootstrap the app, just to get started really quickly.

And then I'd bring it down when it started to struggle.

It's going to take a little while to build.

It doesn't look as nice as I was hoping.

I haven't seen it do better than this.

It's still amazing that they built all of that just based on that one prompt and screenshot.

You can get all of the code that's just output and import it into your project.

It's a shame it looks rubbish, but you could make it look a lot nicer.

Use Tailwind.

Add more color.

Please.

This is embarrassing for me.

Then it would go away and do that.

You can just chat with these systems and you can easily revert as well.

If you have built something and it's not doing what you want, don't be afraid to just trash it because it's so quick at building this stuff now.

You should be able to quickly iterate. If it's not heading in the right direction, just kill it, start again, tweak your prompt, come up with the plan in advance, and then you should be good.

Caveats: security and unexpected changes

Other issues that I've seen with the vibe coding stuff is if you're not careful, like here, for example, let's say I made a call to OpenAI to analyze whether this domain is a threat.

I've seen instances where the API key has been leaked on the front end.

So you really need to, before you go live, you need to check what's been actually produced.
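
As one example of that kind of pre-launch check, here's a minimal sketch that scans a built front-end folder for strings that look like secrets. The patterns and the file suffixes are assumptions for illustration (real key formats vary by provider), and `find_secrets` and `scan_bundle` are hypothetical helper names, not part of any tool mentioned here.

```python
import re
from pathlib import Path
from typing import Dict, List

# Assumed patterns for illustration; real key formats vary by provider,
# so extend this for the services you actually use.
SECRET_PATTERNS = {
    "openai_style_key": re.compile(r"sk-[A-Za-z0-9_-]{20,}"),
    "generic_assignment": re.compile(
        r"(?i)(api[_-]?key|secret)\s*[:=]\s*['\"][^'\"]{16,}['\"]"
    ),
}


def find_secrets(text: str) -> List[str]:
    """Return the names of any patterns that match the given text."""
    return [name for name, pattern in SECRET_PATTERNS.items() if pattern.search(text)]


def scan_bundle(build_dir: str, suffixes=(".js", ".html", ".map")) -> Dict[str, List[str]]:
    """Scan built front-end files and report the ones that look like they hold secrets."""
    findings: Dict[str, List[str]] = {}
    for path in Path(build_dir).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            hits = find_secrets(path.read_text(errors="ignore"))
            if hits:
                findings[str(path)] = hits
    return findings


if __name__ == "__main__":
    import sys

    # Usage: python scan_bundle.py <path-to-built-front-end>
    for file_name, hits in scan_bundle(sys.argv[1]).items():
        print(f"{file_name}: possible secret ({', '.join(hits)})")
```

A clean scan doesn't prove the build is safe, it only catches the obvious cases. The real fix is to keep the key on the server and have the front end call your own endpoint.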

Yeah, and they will also do occasional crazy things where, let's say, there's a problem in your code base.

So it might decide to, I mean, they're way better than they used to be, but it might decide to just delete that file or just turn off type checking, which you don't want to turn off really unless you know what you're doing.

So you still need developers to some extent to at least check things and understand what's going on.
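
A small part of that checking can be automated. Here's a minimal sketch that reads a unified git diff and flags deleted files and added lines that look like they switch off type checking. The list of markers is an assumption aimed at a TypeScript and Next.js project, and `review_flags` is a hypothetical helper, so treat it as a prompt for human review rather than a gate.

```python
import re
from typing import List

# Assumed markers of "type checking was switched off"; adjust for your stack.
TYPE_CHECK_BYPASSES = (
    "@ts-ignore",
    "@ts-nocheck",
    "ignoreBuildErrors: true",
    '"strict": false',
)


def review_flags(diff_text: str) -> List[str]:
    """Return human-readable warnings for a unified git diff."""
    warnings: List[str] = []
    current_file = "?"
    for line in diff_text.splitlines():
        header = re.match(r"diff --git a/(\S+) b/(\S+)", line)
        if header:
            # Remember which file the following lines belong to.
            current_file = header.group(2)
        elif line.startswith("deleted file mode"):
            warnings.append(f"{current_file}: file deleted")
        elif line.startswith("+") and not line.startswith("+++"):
            # Only added lines matter; "+++" is the new-file header, not content.
            for marker in TYPE_CHECK_BYPASSES:
                if marker in line:
                    warnings.append(f"{current_file}: adds {marker}")
    return warnings


if __name__ == "__main__":
    import sys

    # Usage: git diff | python review_flags.py
    for warning in review_flags(sys.stdin.read()):
        print("REVIEW:", warning)
```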

Cost realism and subscriptions

And the other thing that Glenn mentioned was a lot of companies think that this is going to cost their developers like $20 a month when really there's multiple subscriptions.

You're going to quickly hit limits if you only pay $20 a month.

And the business cases that people are putting forward aren't realistic.

They need to anticipate much higher costs.

And they can use Glenn's app for that.
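
The arithmetic behind that is simple enough to sketch. Everything here is a placeholder: the tool names, the prices other than the $20 figure mentioned above, and the overage number are made up for illustration, and `monthly_ai_cost` is a hypothetical helper, not Glenn's app.

```python
from typing import Dict


def monthly_ai_cost(
    subscriptions: Dict[str, float],
    developers: int,
    overage_per_dev: float = 0.0,
) -> float:
    """Total monthly spend, assuming every developer holds every listed
    subscription and also pays some usage overage on top."""
    per_dev = sum(subscriptions.values()) + overage_per_dev
    return per_dev * developers


if __name__ == "__main__":
    # The business case people tend to put forward: one $20 plan each.
    naive = monthly_ai_cost({"single_plan": 20.0}, developers=5)

    # Placeholder prices, NOT real list prices: several tools plus overage.
    stacked = monthly_ai_cost(
        {"tool_a": 20.0, "tool_b": 20.0, "tool_c": 100.0},
        developers=5,
        overage_per_dev=50.0,
    )
    print(f"naive:   ${naive:.2f} per month")
    print(f"stacked: ${stacked:.2f} per month")
```

Swap in your own team size and the plans you actually pay for to see how far the real figure is from the $20 assumption.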

Unfortunately, yeah, Claude Code is still working away, but it would have similar output to this. It's because the build has restarted. One second, just going into the terminal. I'm just going to stop it because it tried to do a build.

Recommendations and takeaways

Yeah, so I'd generally recommend Claude Code.

Then if you want to see the whole of your code base and choose models, Cursor; Codex for asynchronous tasks, so when you're going to sleep or walking home; and then v0 for design, despite its poor performance tonight.

This is what I'd recommend.

Speed gains and quality control with AI

And then I think if you do that...

You can move way faster. So I would say you could build a production-quality app in about six weeks instead of six months. And even the output can be way better.

I was running a development team of my own, and if someone's worked on something for a month and it's not quite what you want, it's really hard to put aside human feelings and say, "You're like 80% of the way there, this would make it way better." Whereas the AI doesn't care. You can just tell it, "This needs to be better, this is poor." And you can rapidly generate products and landing pages, and you can see how it can accelerate your development.

Resources and Q&A

Communities, podcasts, and newsletters

For more information, I recommend joining AI-focused WhatsApp groups.

Glenn, Graham and I are on the AI Prompts WhatsApp group. Are you on there, Graham?

I've got an AI Bros group.

Listen to the Latent Space podcast, which is for AI developers.

So you can listen to that on Spotify or Apple Podcasts, wherever you go.

And sign up for AI newsletters like AI Valley.

I'd probably just sign up for one or two because there's a lot of overlap there.

Yeah, that was it.

I think we've got time for one quick question.

Okay, Bogdan.

Oh, you remember my name, oh my God.

Q&A: What runs on the backend?

Yes, so the Cursor app, you ran it on your localhost.

What's in the back end?

What's running in the back end?

Next.js.

Okay.

I don't know if you've used Next.js.

Have you used it much?

Okay.

Yeah.

Stack discussion: Next.js benefits

So, yeah, for Shutl the stack was Ruby on Rails, and prior to that it was PHP and stuff.

But I think Next.js is a massive breath of fresh air because it has the front end and back end together. A lot of apps ended up splitting, so you ended up with a back-end code base, probably Python, and a front-end code base, like React, Vue.js, Angular.js.

With Next.js, everything is all packaged together, and it's very front-end component-focused, and then it just exposes APIs.

When you deploy it to Vercel, it will just sort out your back end.

You don't even have to worry about the serverless functions that it calls.

There are some challenges around how long a serverless function will run for before it's killed, but generally it's amazing for productivity. You don't have multiple services or a separate back end and front end, and anyone can code the back end and the front end in a component design. So I'd highly recommend that.
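
One common way to live with that time limit is to do the work in small pieces and stop before the platform cuts you off, handing back whatever is left so a follow-up call can resume. Here's a minimal, language-agnostic sketch of that idea in Python. The budget value is an assumption you'd set from your own platform's limit, and `run_within_budget` is a hypothetical helper, not a Next.js or Vercel API.

```python
import time
from typing import Callable, Iterable, List, Tuple, TypeVar

T = TypeVar("T")


def run_within_budget(
    items: Iterable[T],
    handle: Callable[[T], None],
    budget_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[int, List[T]]:
    """Process items until the time budget runs out.

    Returns (number processed, items left over) so that a follow-up
    invocation can pick up where this one stopped.
    """
    deadline = clock() + budget_seconds
    pending = list(items)
    done = 0
    # The clock is checked before each item, so one slow item can still
    # overrun; leave headroom between the budget and the real limit.
    while pending and clock() < deadline:
        handle(pending.pop(0))
        done += 1
    return done, pending


if __name__ == "__main__":
    processed: List[str] = []
    # Assumed budget; pick something comfortably under your platform's limit.
    count, remaining = run_within_budget(
        ["a.com", "b.com", "c.com"], processed.append, budget_seconds=8.0
    )
    print(f"processed {count}, {len(remaining)} left for the next call")
```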