Great.
Thanks very much for having me out again.
I've usually talked about technical topics like
reinforcement learning, philosophy of AI, stuff like that.
I've spent the last several months
playing with these coding agents to see what they're really capable of.
And so I just wanted
to share my experiences doing that today.
I've been in machine learning, whatever you want to
call it, for 35 years.
When I was doing my undergrad and grad at U of T, Geoffrey Hinton
was a professor there.
I'm an aerospace engineer by education, but Geoffrey Hinton would hold
seminars on neural networks and machine learning in the 80s.
So I attended all of those because
I was fascinated by the subject.
And then after I graduated from U of T in aerospace, I didn't
want to build weapons of death and destruction,
and I didn't want to move to the United States either.
And so I found this accidental career of bringing more math and science to business.
I was shocked at how little math and science the tall towers downtown on Bay Street used back then, and how little math and science they actually use today to run their businesses.
So it's been this accidental career of using technology to automate business process and improve decision making.
So today I'm going to talk about my practical experience with coding agents.
I've done like a whole pile of projects and I use it for my work.
I do consulting.
I had my own company for 20 years, where we were automating retail planning, fraud detection, and medical diagnosis; like, no-human-in-the-loop reinforcement learning is what we did.
The company was sold a couple of years ago.
Since then I've been doing consulting: helping companies implement AI, building roadmaps, and doing some fractional executive work.
I'm going to give some description of the projects I've worked on, what you can expect from
these coding agents from my perspective, the process I've used with these agents,
some examples of good context, and then just some tips and tricks,
and my general conclusions overall.
So here's just some
examples of things I've built in the last couple months.
So the first one was kind of a web app that scrapes emails. So this is like: your Starbucks, your oven broke, you send a service request email to your service provider, and that email has to get entered into an ERP system. The ERP system generates a work order, the work order goes to the technician, the technician runs out the door and has to show up within four hours to fix the oven. So that's the cycle. Today there are people standing there receiving the email, reading the email, typing it into an ERP system manually, hitting enter, making lots of mistakes, all kinds of stuff like that. So I've used AI, and I hate the term AI, I used technology, to scrape the email and extract all the information required by the ERP system, connect to the ERP system using APIs, and enter all the data into the system. There's a human being who has a chance to review the scraping to make sure that it happened correctly, and then they just hit submit and the data gets put into the ERP system. The cycle time is reduced from, you know, ten minutes down to like one minute by using the technology to help automate. That's one thing I did. I built a voice interface on that as well, so if somebody from Starbucks phoned in a service request, an intelligent IVR talks back and forth to get the same information that you would put in the email, and then it populates the ERP system. All of that was done in Django, Python, MySQL, open-source tools, and I tell people I could have done this 30 years ago. Like, the technologies have existed forever; this is not new. I find it very funny that the work I'm selling now is stuff that I sold 30 years ago; now there's just hype around it, so people need to do it. Like, the first thing you need to do is organize your data and automate processes, and all of this stuff is stuff we sold when I ran IBM's data warehousing and machine learning practice 30 years ago. So I find it kind of an interesting full circle; I feel like my 30-years-ago self today with all the work that I'm doing.

I tried something more complex: CFD grid generation. That's Computational Fluid Dynamics.
I tried to take my master's thesis and see if I could build part of that
with AI, doing it in Python: super math-oriented type code.
I'll show you examples.
One of the things we did at Daisy was automate data warehouse management. So I tried to build a metadata-driven ETL framework: all the ETL required to build a data warehouse, where all the code is stored in the database and parameterized, and then it's assembled on the fly as you receive data. That way, if you're managing 20 data warehouses, you don't have to have 20 copies of the code base, and you don't have to go in and manually edit code files. That was something built in Python, shell scripting, MySQL, and Hive.
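To make the idea concrete, here's a minimal sketch of what "code stored in the database and assembled on the fly" means. Everything here (the step names, templates, and parameters) is invented for illustration, not the real framework's schema:

```python
# Sketch of a metadata-driven ETL assembler (all names invented for illustration).
# Step templates are stored as data, parameterized per warehouse, and the
# runnable SQL is assembled when a feed arrives -- so 20 warehouses share one
# template set instead of 20 hand-edited copies of the code base.

STEP_TEMPLATES = {  # in the real framework, rows like these live in MySQL
    "load_stage": "LOAD DATA INFILE '{infile}' INTO TABLE {stage_table};",
    "merge_fact": ("INSERT INTO {fact_table} SELECT {columns} "
                   "FROM {stage_table} WHERE load_dt = '{load_dt}';"),
}

def assemble_job(step_names, params):
    """Build the SQL for one feed by filling the stored templates
    with that warehouse's parameters."""
    return [STEP_TEMPLATES[name].format(**params) for name in step_names]

sql = assemble_job(
    ["load_stage", "merge_fact"],
    {"infile": "/data/sales.csv", "stage_table": "stg_sales",
     "fact_table": "f_sales", "columns": "store_id, sku, qty",
     "load_dt": "2024-01-15"},
)
```

Adding a 21st warehouse is then a matter of inserting new parameter rows, not copying and editing code files.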
Then I tried doing infrastructure as code: using Azure, using Terraform if you're familiar with that, to build infrastructure automatically. All of the above applications I built are ephemeral, so I tried to build those using Terraform with Azure CLI scripts, with a separate VM to submit jobs from the ETL metadata framework, using Docker to containerize it so I can install it anywhere. And I did all that with Terraform and Cursor.
So those are some examples.
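To make that first example a bit more concrete: the scraping step boils down to field extraction plus one API call into the ERP. Here's a minimal sketch of the extraction half; the field names and patterns are invented for illustration, not the real system's (the actual ERP call is just a POST of the reviewed fields to its REST API):

```python
import re

# Hypothetical patterns for the fields the ERP work order needs.
PATTERNS = {
    "store":    re.compile(r"Store\s*#?\s*(\d+)", re.IGNORECASE),
    "asset":    re.compile(r"Asset:\s*(.+)", re.IGNORECASE),
    "priority": re.compile(r"Priority:\s*(\w+)", re.IGNORECASE),
}

def extract_work_order(email_body: str) -> dict:
    """Scrape the work-order fields out of a service-request email;
    missing fields come back as None for the human reviewer to fill in."""
    fields = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(email_body)
        fields[name] = match.group(1).strip() if match else None
    return fields

email = """Subject: Service request
Store #4512
Asset: Convection oven, unit 2
Priority: Urgent
"""
order = extract_work_order(email)
# A human reviews `order` in the web UI, hits submit, and the app POSTs
# the fields to the ERP's REST endpoint to create the work order.
```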
One other interesting thing I did was I got lazy doing an RFP response and I wanted to see if these coding agents could respond to the RFP for me.
So I submitted the RFP, gave it a whole pile of examples, and it generated some text that was interesting.
So I'll share some of my learnings executing these kind of projects, which I did over the last couple months.
So the first thing, to know what to expect: if there's a lot of information available on the internet
and lots of documentation, like on Stack Overflow and vendor websites and all of that stuff,
then the agents will probably do pretty well.
So if it's really vanilla stuff, like, I think one wheelhouse is if you have a web app that talks to a database through a REST API; that seems to be a total no-brainer for these AI agents.
Probably because there are 800 trillion websites on the internet, everybody talking on Stack Overflow:
I had this error, what should I do?
I had that error, what should I do?
And all these answers and all of that stuff.
So when there's a ton of documentation available, they tend to do pretty well, right?
The more specialized the subject matter is, the more difficult a time the agents will have.
So in those cases, you just need to use the agents as, maybe, code support, where you do most of the coding and most of the thinking.
In CFD, there's some online material, but not enough, because that was a horrible failure; I'll show you what the AI code achieved.
And one final note: if you can't write the code yourself, it's never going to work.
Like, you have to know, I could do this myself but I'm too lazy, I'm going to get this agentic code thing to do it a little bit faster than me.
Because if you don't know how to do it, you don't know how to adequately prompt.
It's all about prompting and context, and so if you don't give it good prompts and good context, you will get garbage out.
It's the classic garbage in, garbage out; nothing new, same old shit that we've been talking about forever.
So if you can't code it yourself, good luck.
Unless you're doing something super-duper simple, like build me a recipe or make me a two-page website, something like that, right?
If you're doing anything significant, you need to architecturally understand exactly what you're doing.
And you can expect the agents to get stuck. They, like, totally fail. I think they're like senile people. You know, my parents went through that dementia and senility, and I find that working with agents is a lot like that. They're lucid at moments, and then they lose context totally. You're not having a conversation with them. These LLMs think one token at a time. That's their thought space: one token forward, that's it. They lose track of time. They fixate on something you said 10 prompts ago. Like, it's crazy. You really have to know what you're doing and pay attention. But having said all that, they can be useful for certain tasks.

So, the process that I use. Context is king, right? Excellent context is the most important thing. You have to write a super detailed spec. So I'll show you; I wrote like a 25-page spec before I gave it to the coding agent, and I use Gemini or ChatGPT to help me write the spec, but I edit it to make sure it's exactly what I want.
Give it tons of previous examples.
So I give it, here's the
last five examples I did.
Here's the spec.
Here's the PowerPoint deck I showed the client.
Here's
the slides the client sent me.
Here's all my thoughts.
Here's the best practices presentation
I did at last month's Mindstone.
That's all related to the topic.
So giving all of that
context, giving screenshots for the UI, all of that is super important.
And then you can vibe
code the first version if you like vibe coding.
So my son makes fun of me.
He calls me the vibe
coder.
He works for Google.
So you vibe code the V1.
That's when you have a blank slate.
You have
no code.
You give it a good spec.
You go write me the code and then you spend time debugging it.
So
you work in that vibe mode.
You debug it to get whatever the AI decided to build, because you gave it a big spec and it's not going to build it all. It'll build something close to what you wanted, or part of what you wanted; it decides what it feels like doing. And then, whatever it built, edit it, work with it to get that working. Like, test it, do your UAT, and get whatever it built working to some degree, and then, you know, fix it one bug at a time. At that point, once you've got a code base, you've got to narrowly work on one thing at a time and have really narrow conversations.
And then once you've got that first V1 working, what I always ask the thing to do is write a bunch of documents to document the code base. So, you know, draw code flow diagrams, draw architecture diagrams, draw data models, document every single line of code, and then read those documents so that you understand exactly what the AI agent has written. Like, read through the documents; I'll show you examples of the documents. And then after that point, you put it up in GitHub or wherever you source control it, and then you work on one narrow feature at a time. And the documents are great context: now that you've had it describe what it built, that's the context you can give it when you, you know, reset your chat. And then, once you get there, you have to remove code bloat. Like, the code this stuff writes is really bloated and fat, and it finds these little narrow edge cases and builds all this diagnostic stuff, none of which you need. So every once in a while you need to do a remove-code-bloat pass through the code. So that's the process I typically use.
So, some examples of context. I guess for that RFP I talked about, I gave it the RFP request converted to PDF. I gave it examples of previous RFPs I responded to. I gave it a best-practices deck. I didn't use Gemini; I actually just used Cursor to do that, and it did, you know, pretty well. With the good context, it saved me time: what would have taken me eight hours, I got done in like two hours, because I gave it all really good context.
For infrastructure as code, I gave it an architecture diagram of the infrastructure I wanted to build. So: I want you to build these VMs, this HDInsight server; I want these private networks, these virtual networks, these endpoints, these firewall rules, these NSGs. So again, you need to know what you're asking for, because if you don't know anything I just said, you'll never be able to build a Terraform infrastructure-as-code project.
For the metadata-driven framework, this is where I gave it a 25-page deck with my data warehouse principles and best practices.
For CFD, I gave it the desired output of what the shape of the grid should be, and then detailed tech specs: you can see there's like 25 pages, a markdown file that I created that's about 1,000 lines long.
So that's the spec that I gave it before it started coding.
This one was done in Antigravity. For UI/UX context, I gave it a screenshot of a login page: that's what I want my login page to look like. So I said, please make that login page. And that's what I wanted my screen to look like, the one the human being will review, where all those fields are what it's grabbing out of the email, and it's showing the email in the right pane. And then, you know, I said, go build that, and map all these fields in the email to these fields in the document, and a bunch of gory details. So that's the UI/UX context, and it's able to build a style sheet and do that and mimic it. If you don't give it that, you know, who knows what it'll build, right?
For the CFD code, my target was the grid. So when you're doing computational fluids, you need to create a grid where, at every single intersection of the lines, it calculates all the fluid properties. So it's a U-bend; my thesis was to calculate the flow around a U-bend. From the picture on the right there, you can see the flow separates and you get a little vortex downstream of the bend. So that's where you want lots of grid points, near that vortex, and you want lots of grid points near the walls; not so much in the middle, not so much in the straight part. So, you know, I built that for my thesis, and I said, that's what I want it to look like: I want 81 by 201 grid points, and I want you to focus the points at the walls and where, you know, that little vortex and separation bubble is.

That was a painful exercise. That was eight hours of my life I'll never get back. So I started with: build me a simple thing, build me a nice straight duct, and around point 20, that's where the separation happens, so cluster the points near there and cluster them near the wall. It was able to do that, build me a straight duct. I was really happy with that.
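For what it's worth, the wall clustering I was asking for is standard grid stretching. Here's a sketch of the idea with a tanh stretching function; this is my own illustration (the beta value is arbitrary), not the code the agent produced:

```python
import math

def cluster_to_walls(n, beta=2.0):
    """Map n uniformly spaced points on [0, 1] into a distribution
    packed toward both walls, using a tanh stretching function.
    Larger beta packs the points more tightly at the walls."""
    points = []
    for i in range(n):
        s = 2.0 * i / (n - 1) - 1.0                 # uniform on [-1, 1]
        t = math.tanh(beta * s) / math.tanh(beta)   # clustered on [-1, 1]
        points.append(0.5 * (t + 1.0))              # back to [0, 1]
    return points

y = cluster_to_walls(81)   # 81 points across the duct
wall_dy = y[1] - y[0]      # spacing at the wall
mid_dy = y[41] - y[40]     # spacing at mid-channel
```

The spacing at the wall comes out an order of magnitude smaller than the mid-channel spacing, which is what you want near a boundary layer; the same trick, applied along the duct, clusters points around the separation region.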
Then I said, okay, now we're going to build a radius. Now the outer wall is going to be longer than the inner wall. So I said, take that grid, the exact same grid, but lay it out so that the top wall is now longer; it's the length of the radius. Okay, you see my thinking here. Now I'm going to say, bend it like you're bending it around a bar. And it lost the plot at that moment. That's the best I could get Cursor to do. And I spent eight hours, gory detailed descriptions, giving it examples, showing it its mistakes, and it kept on making the same mistakes over and over and over again. No matter how hard I tried. I changed LLMs; I tried every single LLM that Cursor had, and it just couldn't do this task.
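For the record, the step it kept failing on is, at heart, just a polar mapping: keep the grid's indices, send the cross-duct coordinate to a radius and the along-duct coordinate to an angle. Here's a sketch of what "bend it like you're bending it around a bar" means; the radii and turn angle are invented for illustration, and this is my description of the target, not anything the agent wrote:

```python
import math

def bend_grid(xs, ys, r_inner=1.0, r_outer=2.0, turn=math.pi):
    """Wrap a unit-square grid around a 180-degree bend: each y picks a
    radius between the inner and outer wall, each x picks an angle along
    the bend. The outer wall automatically comes out longer than the
    inner wall, because a bigger radius sweeps a longer arc."""
    grid = []
    for y in ys:
        r = r_inner + y * (r_outer - r_inner)
        grid.append([(r * math.cos(x * turn), r * math.sin(x * turn))
                     for x in xs])
    return grid

xs = [i / 200 for i in range(201)]  # 201 points along the duct
ys = [j / 80 for j in range(81)]    # 81 points across it
grid = bend_grid(xs, ys)
```

Feeding wall-clustered coordinates through the same mapping, instead of uniform ones, would give the grid I was after.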
And so these things are not intelligent by any stretch of the imagination. It's a sausage grinder. You need to know what you're doing, you need to know how to prompt it, and you need to know that there's a good base of information out there to be able to do this. Otherwise, that's the shit you'll get. But a lot of it is invisible, because you can't tell: if you're looking at a code file, the code file could be that shit, but it just looks like code, and you don't know any better, right? So you need to be super careful with using this in those situations. That's why you need to understand what you're doing.

So, tips. You know, you can vibe code that first version; the first code is a blank slate. You can debug that first version, then ask the agents to document. When I ask the agents to document, I go: I want you to document every single file you created, tell me exactly what the purpose of that file is, and then, line by line, on every single line, tell me what that line of code is.
What is the function call?
What does it do?
What is that variable name?
Why did you create that variable?
Why did you choose that data type?
I said, document it like I'm a moron and I know absolutely nothing, right? And I get it to do it literally: if I have 10,000 lines of code, I'll have a 40,000-line document, right? And I say, draw me code flow diagrams; draw me how one function calls another, how one file calls another, how the user interacts through the code; show me that. Draw me the data model, an entity-relationship data model. And then I actually read all that stuff. You know, you've got to read it all in detail so that you know the code base, because you can't just blindly go through this, right? And then at that point I say, write me a requirements document. So I gave it one, but I said: now write me a requirements document that's consistent with what you built, at the same level of detail that I gave you, and write a technical spec of what you've implemented at the same level of detail. And it's great at documenting, because it's got a perfect example that it can mirror; it's just mirroring the code it created. And this document is awesome. So now, what's that document? It's the context for your next ask. So now you reset your chat, delete it. It's never seen anything; you start all over again. You pick a new LLM and you go: read my context, my 50,000-line document, my tech spec, my requirements spec, my flow diagrams, all of that stuff. And that's a great start for all of your chats.
So this is a sample network diagram for my Terraform infrastructure-as-code project. Amazing that it could draw this. I just said, draw me a picture of my infrastructure, and it barfed that out. It took a little bit of editing on the colors, but it was pretty close; that was great. Draw me a database ER diagram: it drew me a nice ER diagram with relationships and many-to-one things and all the things I know. So that was pretty good; it saved me time to do that. And again, great context. Draw me a code flow example: so, you know, there's the flow through the different modules and the code, and did you click the button or not click the button. I drew like 20 of those. So, you know, it's really good at documenting code, way better than me. I'm like a lazy coder; all my employees used to laugh at me because I would name variables like 'shit' and 'crap' and then never clean those out, and then sometimes it ended up in client code. Like, one time I was doing something for Bank of Montreal; we were building trade areas for branches, and if we couldn't find the branch name, I just put, like, 'BF nowhere', and that showed up on one of the labels. And the client called me and said, these are awesome, Gary, but where's this one branch? I'm going, whoops, you shouldn't do that. The AI agents don't do that, by the way, so that's maybe one improvement to my coding standards.
So read all your documents, right? Make sure you understand them. Upload your project to GitHub at this point, or whatever source control you use. And then from that point forward, you micro-develop: one little feature at a time. As your code base grows, you have to really narrow the focus to keep it on one tiny little thing at a time, right? And, you know, branch your code so that you can do a pull request later. Again, get the agents to read all the documents; get it to focus on the certain parts of the code that you know you're adding the features to. Again, provide a super detailed description, as detailed as you can. And then ask the agent for a plan.
Before it does anything, ask it for a plan: give me a detailed plan. Cursor has Ask mode, so it'll give you a plan and show you the code it intends on implementing. Because when you ask an LLM to do something, it won't just do what you want it to do. It'll try to improve everything, even though you didn't want it to. It's like: the rest of my code is fine, don't frickin' touch that; just do what I ask. They can't do that. And I ask it, I go, why do you change everything when I only asked you to do this? Well, it's my job; I try to improve everything on every pass through the code.
Okay, so this is why you can't just let it go free. You need to ask for the plan, ask what it's doing, and then if you want to let it vibe code at that point, you know, buyer beware. If not, you can implement the code yourself and check that the plan is valid, right? So then, do the pull request before you merge into the main branch; especially if you have a team, you know, you need to do pull requests and review the code. Because if you have a team of developers, it gets scary, because everybody prompts with a different level of quality, and your code could turn into all kinds of garbage. So doing the pull requests and code reviews before you merge back into your main branch is super critical.
And have meetings with all the developers so you can learn from each other, to consistently prompt the same way and consistently write with the same coding standards. You can give coding standards as context for the agents as well, but I think doing code reviews and these pull request reviews is critical to get your whole development team on the same page, right? And after every micro feature, ask the agent to update all that documentation, so that the documentation is constantly consistent with the code you've done, right? And here's a sample plan. I asked it to write me a giant plan; this was like a 30-page document when I designed a data warehouse layer. So before it implemented, it gave me a big gigantic plan as step one. It writes beautiful, nicely formatted documents, better than I would write myself, so it's useful for that.
Okay, just time check?
Sure, almost done.
Keep the coding sessions short, so reset the chat frequently, right? Remember, every time you ask for something, the agents could do whatever the heck they want, so reset that chat frequently. If the agent gets stuck on a bug, which it frequently does, like that CFD example I showed where you could not get out of that loop, change LLMs if you can't figure it out yourself, or figure it out yourself, because even though it should be able to, it can't; it gets insane, and then you'll get it in this infinite loop. And when you have coding teams, it's way more complicated, and you have to be super methodical. Sounds like software development practices, basically, is what you need.

So, conclusions. Good software development practices: no shortcut for that. You need to be as methodical as, maybe more methodical than, you were before. If you can't code it yourself, forget it; go do something else. Coding agents are like that senile dementia phase, right? They're sometimes lucid, they sometimes remember, they latch on to a crazy concept that you said 10 chats ago, they have no sense of time, and they do whatever the heck they want, even though you didn't ask. And it's only good when there's a large corpus of examples to build from; for very new and innovative stuff, forget it.
If you're an experienced programmer, you know how to create context for the agents, and if you direct it hard and push it hard, you can get stuff out of it. I mean, I save maybe 50% of the time. If you're writing some bigger project and it gets in a loop, it can actually take more time than doing it yourself when you're doing complex stuff, and there's a grey area: it's hard to know when you're transitioning from saving yourself time to taking more time. There's a big grey area in the middle there. But I like it. It's like my dev team.
I used to have a team of 20 developers. I was the chief scientist of my company; I did all the inventing, I invented all the patents and did all the hard stuff. And I like doing the hard stuff, and I'm lazy about doing the boring stuff. So giving the boring stuff to the coding agents makes me happy; it feels like my dev team of old. I would explain something, they would mess it up, not because they were bad, but because I didn't explain it adequately, which is exactly what happens with the AI. So I find it's just like a faster life cycle with my dev team, and it still takes a dozen iterations to get it done. And it all comes down to how well you communicate and ask for the stuff.
So hopefully you found that helpful.
That's my experience with these agents.
I'm going to continue to use them.
I find that it satisfies my need to be lazy and not write boring code
and lets me work on the hard stuff.
And I can multitask and do like two or three or ten projects at the same time.
So happy to answer questions.
I'm going to hang around until 8 o'clock, so if you want to chat,
I'll hang out in the hallway for a little bit.
But I'll stop here.
Thank you.
Any questions?
Anybody?