Hi, my name is Robert, and I'm going to start off this presentation by asking you a question.
Can anybody tell me the difference between an LLM and an AI agent? I'll give you five seconds to think about it, and I also created a hint for you.
All right, we have one person who would like to guess. Please go ahead, Andre.
An LLM is the brain of the thing, while an agent is more like the secretary of the thing. It performs intermediate actions but... Ah, action, that's the word. So an LLM produces a text output while an AI agent takes action.
I like to think of an LLM as a brain and an AI agent as a brain put inside of a body so that that body can interact with the world.
So AI agent enthusiasts are currently speculating what AI agents will be able to do tomorrow. Will they be able to build a video game or perhaps even run a company? Actually, they can do all of these things today.
The problem is that we are always asking a single AI agent to do these things, while in reality we should be asking a whole workforce of AI agents to do them.
A paper that came out exactly two months ago, called "More Agents Is All You Need," shows that as you increase the number of AI agents, the quality of your output also increases. For example, using just one GPT-4 gets you around 70% accuracy, while using five can get you above 80%. On top of that, if you have a much more complicated task that you need done, you're going to need 50 or even more agents.
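To make the idea concrete, here is a minimal sketch of one way several agents can be combined: ask each one the same question and take a majority vote over their answers. This is a toy simulation, not the paper's code. The "agents" are random stand-ins that answer correctly with a fixed probability, and the names `majority_vote` and `simulate_accuracy` are made up for illustration. The numbers it produces are not the paper's results.

```python
import random
from collections import Counter


def majority_vote(answers):
    """Return the most common answer; ties go to the answer seen first."""
    return Counter(answers).most_common(1)[0][0]


def simulate_accuracy(num_agents, p_correct, trials=10_000, seed=0):
    """Estimate how often a majority vote over simulated agents is right.

    Each simulated agent answers "right" with probability p_correct and
    otherwise picks one of three wrong answers at random (an assumption).
    """
    rng = random.Random(seed)
    wrong_options = ["wrong_a", "wrong_b", "wrong_c"]
    hits = 0
    for _ in range(trials):
        answers = [
            "right" if rng.random() < p_correct else rng.choice(wrong_options)
            for _ in range(num_agents)
        ]
        hits += majority_vote(answers) == "right"
    return hits / trials


if __name__ == "__main__":
    for n in (1, 5, 15):
        print(n, "agents:", simulate_accuracy(n, p_correct=0.7))
```

In this toy setup the wrong answers are spread across several options, so they rarely outvote the right one, and the estimated accuracy goes up as agents are added.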
Here's another research paper, this one from MIT, showing that having multiple agents debate together, rather than just using one, produces a higher-quality output. And there's also this paper that came out from some researchers in China. So now some researchers are actually showing how you can optimize AI agents as they work together: not just optimizing one AI agent, but optimizing the workflow of multiple agents.
Unfortunately, this is really, really complicated. And if we want to capture the benefits of AI agents, how are we supposed to do that when we're not some techno whiz kids who can write 10,000 lines of code that implements all of this research to complete our tasks?
Well, what we actually need is some sort of interface where we can manage multiple AI agents as they collaborate together to build our video game or perhaps even run our company. So this is what my friends and I have been working on. It is called Chatforce. Chatforce allows you to build and lead a workforce of AI agents to build a complete software product for you.
And I will be showing you how it works.
So this is a build of Chatforce that we did release publicly, and it comes with one AI agent workforce.
So in Chatforce, you can build your own AI agent workforce, but today we will be showing you how to use a pre-made AI agent workforce called the Video Game Studio to recreate a video game. Specifically, we will be recreating the video game Agario, which is a game where you play as a cell that tries to eat smaller cells while not being eaten by larger cells. Let's use this workforce.
So I'm just going to give my project a name. Let's call it Agario Mind Stone. And I'm going to just save it in this folder. And great, here's my workforce.
It's actually pretty basic. We have a CEO, a CPO, a CTO, a software engineer, a code reviewer, a painter, and finally a human tester. So I like to think of this as my own little virtual company of AI agents.
And what we're going to do is press play.
And it looks like the CEO is asking us a question. Here's our CEO. I call him Jeff Bozos.
And he is asking me to describe what I would like to create. What kind of game would I like to create?
Well, I actually pre-made a description of what Agario should look like. And specifically, I am going to describe just a very, very, very basic version of Agario. Later on, we will be able to add additional features to Agario to build the whole game.
It reads: "I need you to recreate the Agario game. The player is a circle of a randomly chosen color with a four-pixel black outline. The player's radius is 50 pixels. The window size is 800 by 800. The player moves by holding WASD." And there are a few more descriptive sentences.
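For readers who want to see what that description amounts to, here is a minimal sketch of the player it specifies, written as plain Python with no graphics library. This is not the code the agents generated. The movement speed, the starting position in the center of the window, and the names `Player`, `new_player`, and `KEY_DIRECTIONS` are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass

WINDOW_SIZE = (800, 800)    # window size from the description
PLAYER_RADIUS = 50          # player radius in pixels, from the description
OUTLINE_WIDTH = 4           # black outline width in pixels, from the description
OUTLINE_COLOR = (0, 0, 0)

# Screen coordinates: y grows downward, so "w" (up) is a negative y step.
KEY_DIRECTIONS = {"w": (0, -1), "a": (-1, 0), "s": (0, 1), "d": (1, 0)}


@dataclass
class Player:
    x: float
    y: float
    color: tuple
    radius: int = PLAYER_RADIUS

    def move(self, held_keys, speed=5):
        """Shift the player once per frame for every WASD key being held.

        speed (pixels per frame) is a hypothetical value.
        """
        for key in held_keys:
            dx, dy = KEY_DIRECTIONS.get(key, (0, 0))
            self.x += dx * speed
            self.y += dy * speed


def new_player(rng=random):
    """Create a player with a randomly chosen RGB color.

    Starting in the center of the window is an assumption.
    """
    color = tuple(rng.randrange(256) for _ in range(3))
    return Player(x=WINDOW_SIZE[0] / 2, y=WINDOW_SIZE[1] / 2, color=color)
```

A real game loop would read the held keys from whatever graphics library is used, call `move` each frame, and draw the circle with its outline.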
This means that there is a conversation currently happening between the CEO and the CPO. Let's watch the conversation.
So here's our CEO. He's talking to the CPO, and he's asking the CPO to choose a modality. For example, are we trying to build a web-based game, or is this going to be a desktop application? And the CPO gives some reasoning and says that it should be a desktop application.
And eventually they submit their final answer and the CEO approves. Now we're going to move on to the next step.
It's called language choose. This is a conversation between the CEO and the CTO where they are talking about which programming language this should be written in. In our case, it looks like the CTO chose Python, and the CEO approves the answer.
Python is a pretty good choice. It's great at rapidly building little games like Agario.
And now we are in the coding step, where the CTO is talking to the software engineer. And they will be writing the first pieces of code.
If we actually look at the text here, I'm not sure if you can read it from all the way back there. But if you can't, I will describe it.
Basically, the software engineer is generating the first pieces of code for a game. They generate the main.py file, and the CTO says, great, please continue generating more.
Then the software engineer starts generating the next file, in this case, player.py. And they keep on generating the files over and over again until there's nothing left to generate.
And now we go back and forth between the software engineer and the code reviewer. Each time they talk, the code reviewer finds a problem with the code, and the software engineer fixes it.
So for example, in this case, our code reviewer, Bug Squashington, found a problem with the addSmallCircle function. Who knew that there was a bug there? And then the software engineer, Bjorn Shoestrap, he fixes the problem.
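The back-and-forth between the code reviewer and the software engineer can be sketched as a simple loop: review, fix, and repeat until the reviewer has nothing left to report. This is only an illustration of the pattern, not how Chatforce is implemented. The `review` and `fix` callables stand in for LLM-backed agents, and the round limit is an assumption.

```python
def review_loop(code, review, fix, max_rounds=10):
    """Alternate review and fix until the reviewer reports no problem.

    review(code) returns a problem description, or None if the code looks fine.
    fix(code, problem) returns a revised version of the code.
    Returns the final code and the number of fixes that were applied.
    """
    for round_number in range(1, max_rounds + 1):
        problem = review(code)
        if problem is None:
            return code, round_number - 1
        code = fix(code, problem)
    return code, max_rounds


# Toy stand-ins for the two agents (hypothetical, for illustration only).
def toy_review(code):
    return "TODO left in code" if "TODO" in code else None


def toy_fix(code, problem):
    return code.replace("TODO", "done", 1)


if __name__ == "__main__":
    print(review_loop("TODO TODO", toy_review, toy_fix))
```

The round limit matters in practice, because a reviewer that always finds something new would otherwise keep the loop running forever.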
And it looks like there are no more problems because they finished talking to each other. Now the software engineer talks to the painter just in case there are some graphics that need to be generated.
And finally, we get to play the human tester. As the human tester, we get to try out the code ourselves and see if there is a problem with it. Or if there are no problems with it, we get to request additional features.
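The sequence of conversations in this demo can be summarized as a list of phases, each pairing two roles. The sketch below is only a way of writing that sequence down. It is not Chatforce's actual configuration format. Apart from "language choose", the phase names are made-up labels, and which role leads each conversation is a guess.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    name: str        # label for the conversation (mostly hypothetical)
    instructor: str  # role assumed to lead the conversation
    assistant: str   # role assumed to respond


PIPELINE = [
    Phase("modality choice", "CEO", "CPO"),
    Phase("language choose", "CEO", "CTO"),
    Phase("coding", "CTO", "software engineer"),
    Phase("code review", "code reviewer", "software engineer"),
    Phase("art", "software engineer", "painter"),
    Phase("human test", "human tester", "software engineer"),
]


def roles_in(pipeline):
    """List every role in the pipeline, in order of first appearance."""
    seen = []
    for phase in pipeline:
        for role in (phase.instructor, phase.assistant):
            if role not in seen:
                seen.append(role)
    return seen


if __name__ == "__main__":
    for phase in PIPELINE:
        print(f"{phase.name}: {phase.instructor} <-> {phase.assistant}")
```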
So let's try out the code. So I'm going to open a new terminal window in the folder where our Python code has been generated. And I'm going to run the code.
And here is our first version of Agario. I'm currently moving the circle with my WASD keys. And I'm going to try and eat this pellet here.
And it looks like I consumed it. However, this is really, really basic. For example, there are no enemy characters that I can eat. And it looks like only one pellet spawned.
There's no grid. There are no name tags. So if we go back into Chatforce, we can give that feedback. We can say, OK, great. This worked. Now let's add on an additional feature, such as the grid.
So if I say, "Please create a grid for me where the lines intersect every 50 pixels," and submit that feedback, the human tester talks with the software engineer, and they keep on generating the code and fixing the bugs. Then they ask for more feedback, and we can keep on adding features that way.
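As a rough idea of what that grid request involves, here is a small sketch that computes the line segments for a grid with 50-pixel spacing on an 800 by 800 window. It is not the code the agents wrote. The function name `grid_lines` is made up, and the result is plain coordinate pairs that any drawing library could render.

```python
def grid_lines(width=800, height=800, spacing=50):
    """Return (start, end) point pairs for a grid with the given spacing.

    Vertical lines come first, then horizontal ones. Lines on the window
    edges (0 and width/height) are included.
    """
    vertical = [((x, 0), (x, height)) for x in range(0, width + 1, spacing)]
    horizontal = [((0, y), (width, y)) for y in range(0, height + 1, spacing)]
    return vertical + horizontal


if __name__ == "__main__":
    lines = grid_lines()
    print(len(lines), "segments; first:", lines[0], "last:", lines[-1])
```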
Now, here's something that I created in 20 minutes. This is a version of Agario where I gave 10 different pieces of feedback, one after another. Let's see what this looks like. There we go.
We have a grid, we get to move a little bit faster thankfully, and oh look, there's an enemy character. Let's try and eat them, let's see what happens. And look, I ate them and I got much larger.
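The eating behavior can be sketched with two small functions: one that decides whether a cell can eat another, and one that computes how big the eater becomes. The rules here are assumptions, not what the generated game actually does: a cell eats a smaller one once the smaller cell's center is inside it, and the eater grows so that its area is the sum of both areas.

```python
import math


def can_eat(eater, prey):
    """Check whether eater can consume prey.

    Both arguments are (x, y, radius) tuples. Assumed rule: the eater must
    be strictly larger, and the prey's center must lie inside the eater.
    """
    ex, ey, er = eater
    px, py, pr = prey
    distance = math.hypot(ex - px, ey - py)
    return er > pr and distance < er


def radius_after_eating(eater_radius, prey_radius):
    """Grow the eater so the circle areas add up: R^2 = r1^2 + r2^2."""
    return math.sqrt(eater_radius ** 2 + prey_radius ** 2)


if __name__ == "__main__":
    player = (400, 400, 50)
    enemy = (420, 410, 30)
    if can_eat(player, enemy):
        print("new radius:", radius_after_eating(player[2], enemy[2]))
```

Adding areas rather than radii keeps growth from getting out of hand, since each meal adds less to the radius as the cell gets bigger.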
So this took me 20 minutes to create, it cost me $2, and I was able to create a fully functioning game. Isn't that neat?
So if you are interested in perhaps building your own workforce, please speak to either myself, I'm Robert, or my two co-founders here, Maisha and Peter, if you guys can stand. Yeah, so during the networking session, feel free to speak to us if you're interested in building your own workforce.
We also plan on releasing a marketplace feature, so that you can create your own workforce for your use case and then sell it on a marketplace, not just to people but even to businesses, and make some money. If that interests you, please speak to us at the end of the event.
And that's it. Thanks for listening.