Hey guys, I hope people can hear me. I have a bad throat; not the right day for it.
Okay, my session is on how I use AI to build Voibe. I have a small deck. I thought I'd talk about a few things before we dive into the demo, though there have been warnings.
A little bit about me. I'm a 3x founder, and Voibe is the third thing I'm building. In a past life I was running tech teams at some of the hottest startups in India and one in the UK. Tom called himself a geek; I like to call myself a techie who likes building stuff.
I am currently building Voibe. Voibe is a private AI voice dictation app for Mac that works everywhere. It has a simple UX: you just hold, speak, release, and whatever you spoke gets typed out in the editor you are on.
We were talking about privacy, so: Voibe runs on the device. Just to add on to Emmanuel's point, I think privacy has to be by design. That is one easy way of achieving it: not trusting yourself, and not the system.
I do not like the word dictation, because I feel that in the world we are entering, voice is going to be the primary way we talk to AI, and that is why we have this vision of killing the keyboard.
Before we go into the demo, I thought I'd quickly talk about my AI stack. I think we all have our own stack, and what I understand is that the stack is shrinking day by day as fewer tools are able to do more. I use Gemini and Claude interchangeably as the primary agent for all the miscellaneous uses like research, analysis, image creation, etc. Similarly, I use Antigravity and Cursor for all the coding tasks, because I keep running out of tokens for Opus and I still don't want to pay 200. And Voibe, of course; I don't want to type on my keyboard. Canva is the miscellaneous design tool. It's not an AI tool per se, but it has a lot of hidden AI which is very useful to edit images, create slides, etc.
Okay, too much talking. Now let's go into the demo. So this is my website, and I have a blog section where we have a bunch of blogs. Today I thought, with you guys, we'll add one more blog to my website. That's what I'll show you.
The way I go about doing that is: my SEO team gives me content in a markdown file like this, so I'm going to create an article for this post.
All I have to do now is talk to my Mac, so please bear with me: "I want to create a new blog article for this markdown file. Refer to the attached Astro file for the design and the coding conventions we follow while creating a blog on our website. Also refer to the attached article for the images, and use them appropriately in the new article that you are going to create. Refer to the attached article also for the schema, and build the right schema for the new article you are going to create."
What I'm trying to show here is that all the AI needs today is the content and a sample reference. The file I was referring to is my previous blog article, which is already live. By giving this as a reference along with the content, it is able to build a fully polished article, usually in the first go. It's a fairly big article, so it will take some time. Let us see how it goes today without issues.
This is like a 12,000-word article, so it will take some time to understand. There are a few interesting things that will happen as part of this: it will look for the images in the article, look into my repo, resolve all of that, etc.
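The "schema" in the dictated prompt most likely means structured data for the page. As a rough illustration only, assuming it is schema.org `BlogPosting` JSON-LD (the talk does not show the actual schema), such a block might be generated along these lines. The site itself is built with Astro; Python is used here purely for the sketch, and the `Article` class, both function names, and all field values are hypothetical.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Article:
    # Hypothetical container for the metadata an article page would carry.
    title: str
    description: str
    url: str
    date_published: str  # ISO 8601 date, e.g. "2025-01-31"
    author: str
    images: list[str] = field(default_factory=list)


def build_blog_schema(article: Article) -> dict:
    """Return a schema.org BlogPosting object as a plain dict."""
    schema = {
        "@context": "https://schema.org",
        "@type": "BlogPosting",
        "headline": article.title,
        "description": article.description,
        "url": article.url,
        "datePublished": article.date_published,
        "author": {"@type": "Person", "name": article.author},
    }
    # Only emit "image" when the article actually has images.
    if article.images:
        schema["image"] = article.images
    return schema


def to_script_tag(schema: dict) -> str:
    """Serialize the schema as the JSON-LD script tag a page would embed."""
    # Escape "</" so article text can never close the script tag early;
    # "\/" is a valid JSON escape for "/".
    body = json.dumps(schema, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'
```

Giving the agent a live article as a reference, as in the demo, lets it copy whatever schema shape the site already uses instead of guessing one.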
Previously this used to take at least 2-3 attempts, and as the models are progressing, now it's like a single shot and it works beautifully in the first go. Let's see how it goes.
Okay, I know this is going to take time, so I thought I'll show you guys one more thing, not a demo, more of a process that I follow, which has helped me when building, especially back-end applications. I know a lot of you said you use AI, but I don't know how many of you use AI for coding. I don't know how many engineers are in the room. Okay, great, a lot of engineers.
Okay, so this is my back-end repo, and this is Antigravity, the coding agent. I quickly wanted to highlight a few things here. Okay, it has started building out the new file here. Let it do that, and I'll quickly show you guys the other thing.
So, the back end. This is a feature which I recently built, and the process I follow is this: I start with a usually long prompt where I explain what I want to achieve, defining it as clearly as possible. The first step is usually creating a plan, so this is the prompt to build a plan for my feature. The feature is to set up an email sequence for my users who are on trial, to nudge them to become paid users. It's a fairly complex feature because there are these rules for defining emails, and I am also using the same task to define the content in the email, etc.
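To make "rules for defining emails" concrete, here is a minimal sketch of what a trial nudge sequence could look like. The real rules and email content are not shown in the talk, so the step names, day offsets, and the `User` and `EmailStep` types are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class EmailStep:
    key: str         # hypothetical template identifier
    day_offset: int  # days after trial start when the email becomes due


# Hypothetical sequence; the actual steps and timings are assumptions.
TRIAL_SEQUENCE = [
    EmailStep("welcome", 0),
    EmailStep("tips", 3),
    EmailStep("trial_ending", 6),
]


@dataclass
class User:
    email: str
    trial_start: date
    is_paid: bool = False
    sent_keys: set[str] = field(default_factory=set)


def due_emails(user: User, today: date) -> list[EmailStep]:
    """Return steps that are due today or earlier and have not been sent."""
    if user.is_paid:
        return []  # paid users drop out of the nudge sequence
    days_in_trial = (today - user.trial_start).days
    return [
        step
        for step in TRIAL_SEQUENCE
        if step.day_offset <= days_in_trial and step.key not in user.sent_keys
    ]
```

A daily job could call `due_emails` for each trial user, send the returned steps, and add their keys to `sent_keys` so nothing goes out twice.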
So that is why the prompt is fairly long. Let us keep all of that aside. The point is, once this is given, it built this implementation plan. What is really nice about Antigravity is that we can interact with the document like we usually would on a PRD or tech spec, giving feedback to the engineer who goes and builds it. It works the same way. Because this is a feature that's already built, I've already given a lot of comments, and these are the decisions that we both agreed on before building this feature. Then it proposed the plan, which involved creating all these HTML files and also SQL, etc., to build the project.
Then it goes on building. It is a fairly long project, so it took a few iterations, not just the first shot.
The last thing I do, after testing, is create a developer-facing manual, so that next time an engineer or I look at it, we know not just what was built, but also why it was built and how it was built.
Any feature, as we know, does not stop on that day; it will continue to evolve. So this way we have a reference of what exactly was built, written when it is fresh in its memory. The context is retained, and we produce good documentation that we can later use to test, to build on top of this, or to change it, etc.
Okay, this is still building. The other learning I want us to take is that today these tools are pretty good: even if you give an image of a website you want your content to look like, it can simply build your page like that.
So this is just trying to show that by giving an actual HTML page, or even just an image, today AI can build such beautiful pages. Okay, so my demo is done; it's just that it's taking time.
What was really interesting, perhaps for the non-technical people, is that you're referencing a few frameworks and templates. You also had that in Antigravity, how you gave the feature and it was accessing a bunch of frameworks and templates.
Could you maybe contrast the output differences between just feeding something simple versus what you have done with your frameworks, for people who are perhaps less technical?
So yeah, if it is a simple feature, we can dive into coding in the first step. Because this is a fairly complex one, this 3-step process was needed.
More than a template, this was a plan that the AI created. I sometimes save the plan for future reference. If the project becomes too large to be finished in a single context window, that's why a plan is again useful: you can save the plan, execute parts of it, come back to it, and iteratively build the project. Was that useful?
Yeah, it's just really interesting seeing how you set up specific things like your email templates, which then enable the agents to build according to your preferences or your taste.
That's correct. This is an existing project, and we were building on top of an existing feature. Greenfield is one thing; building on top of an existing code base is always a tricky thing. That's why I always start with, "Hey, understand the current system, and build things in line with our current practices and framework." So it automatically refers to them and builds in a consistent way, following the design, the naming conventions, folder structures, everything.
I think this is also an important role that we have: to guide the AI in the right direction. It's too smart, but it's also too directionless, for lack of a better word. So our job is to set the goal and guide it in the right direction to get what we want, how we want it. Otherwise it will always produce an output, but it may not be the output that we want.
Okay, maybe I should have chosen a smaller blog. Yeah, we can talk. I think it's mostly done. We can discuss questions.
One question I have is: how many free and paid users do you have, and what's the main feature they're after when they're choosing Voibe versus Wispr Flow?
So the big difference is Voibe runs on the device, so it's private, it's fast, and it works offline. People are choosing it for privacy and speed.
Okay, sorry, maybe I was doing two things in parallel. So this is my front end, the Voibe website. This is where we started the demo, where I am creating this blog. I think the blog is done now.
This is a website; instead of plain HTML, I use Astro as the framework. All we are doing here is this: I have a bunch of blogs already existing, and we are creating a new blog. That is the task, and the input to this task is a markdown file.
This is the raw text data that has to go into the HTML. My SEO team gives the text file, and I am using my previous blog as a reference to build a new blog.
So, it's already, it does.
Just to explain: a markdown file is just text. It's formatted in a way that the agents can understand really well; it's really simple. And when you're talking about putting it into HTML, it's then looking at how you build the web pages. And when you're talking about the SEO team, you're presumably talking about the agents that have collated a bunch of...
It's humans as of now.
Yeah, so this is my local environment, where it is currently running. It created the blog and also linked it appropriately in my footer. So this is the new blog it created, following the same design and aesthetics we had for the previous blog, and I believe it has also referred to the right images and resolved them.
Then what I usually do, because this is a blog and the SEO team is very particular that they don't want to miss a single word, is one more step where I review the content that we just created. So what I do is...
So whatever you build in here, can it be a program that you can integrate within a device, like a wearable, for example a watch?
Just to understand the question: are you asking whether we can write code on the wearable, or whether this application can be deployed on a wearable? Yes.
Okay, yeah. Depending on the application, whatever we want to build, the coding agent is always the same. So this is Cursor. You can build a front end, a back end, or a Mac application. I have three applications: this is the front end, this is the back end, and there is also a Mac application, which is the actual app running on the device. Of course I can build an app for a wearable, if they open it up.
I've known you for just over a year, and I think your story is really interesting. What's been the biggest challenge, and what's the best thing about building Voibe? What's been the positive and the negative?
The positives are many. That's why I said past life, because post-ChatGPT everything is different. I'm learning to code in a new way with all these agents, where my input is only English words. I'm not writing the code myself; it's AI writing the code. Plus, I'm learning the full stack of building a business. That's the nice thing: I'm doing customer support, I am doing marketing, all of this. The challenges are the classic challenges of building a business. At this stage, the biggest challenge is acquiring the right users, and finding the right channels to reach the right users is currently my big challenge.
Yeah, that's right. I started out building a voice agent, like a lot of people in the room are. Then, as I was building my first startup and doing vibe coding, I felt the keyboard was a limitation, and that's how I started building Voibe, to increase my speed. That's how I realized it's not just speed that we get by talking to AI, it's also clarity. When we talk, we say things differently, in a way that is closer to the thought than when we type. That was the biggest unlock, and that's what people who have used a dictation app like Voibe have also felt.