So hello everyone. I'm going to present a simple idea here: from idea to video, creative workflows with generative AI.
A picture speaks a thousand words, so I'm just going to show you two videos of less than one minute each, and then I will explain how I made them. First one, and then the second one. Okay, so, the presentation: how do I make this kind of video? Here is my image-to-video workflow. Why is it named like that? Because with generative AI, there are two ways of making videos. The first one is text-to-video, the regular way of using an AI to generate content: you write what you want, for example, I want a dog to do something in the video, stuff like that. The second way of doing it, and this is the one I prefer because you have more creative control over what you're doing, is image-to-video: you're using the image itself to act as a prompt. Sometimes you add some text, but the image in itself can be the prompt. And overall, you have more creative control over what you're doing, and for a creative person, a creative executor, it's better.
So what I'm doing is that, first of all, as you can see, this is the logo of Claude.
But you can use any LLM.
You can use Gemini.
You can use GPT.
And what I'm doing is really simple. It can seem difficult for people who are not used to LLMs, but I just say to it: OK, I have an idea. I want to make a video with a precise piece of music in mind. I have images that come to my head, and I want to make a small music video.
So I need its help to write the prompt for a generative AI tool for images: I use the help of Claude or Gemini to write the prompts for Midjourney. What Midjourney is, for people who don't know, is a generative AI for images. Nowadays it can do videos too, but in Midjourney the videos are not that good; images are really where it stands apart. Why am I using Midjourney? Because it has a special, I would say, aesthetic that you cannot find in other text-to-image models.
For example, Nano Banana is really good, really a good model for creating images, but it has less of a special aesthetic, and Midjourney is quite original. Also, Midjourney has a community, so I can pick any image I find in the community, from the other creators, and take inspiration from what other people are making on the Midjourney community.
Also, when I pick an image, I can rerun the same image, and I can even edit the image right in Midjourney. I can change it: as you can see, it says "Vary" here, so I can make a small variation of the image, or just rerun the same prompt. I can zoom out of the image, which means there will be more elements in the next generation.
And the most important thing for what I'm showing you is here. As I said, there is a really special Midjourney aesthetic, and you can actually personalize the stylization, the Midjourney "filter," I would say, on all your images. You can decrease it, but the more you increase this setting, the more original the image will seem; it will have a special taste to it that you cannot find in other image models. And the second most important thing is the style reference.
As I said, I create my prompts in the LLM of my choice, and then I go back and forth with it: I tell it, OK, your prompt was good, but I need to change this, this, this, and this, because this generation was not really what I intended. And from the moment I have an image that I really like, and I think, OK, this is a scene, this is a shot that I will put in the final form of the video, I can take it and use it as a style reference, which means there will be consistency across all the other images and all the other scenes that I choose for my final video. So that was Midjourney. The next step, I'm not going to dive into detail about it, but you can see the Photoshop logo: in Photoshop I'm just doing the editing, which means I'm removing some things I don't like about the images. For example, there could be some hallucinations that I'm not a big fan of, so I get rid of them with Photoshop, but you can use any editing software for this kind of task. And then I'm using another tool, the big final one, I would say: it's Flora. So
what Flora is, for those who don't know: I'm going to show you a recording of my screen with this platform. Flora is a platform where you can use any model that exists, text, image, or video, any type of generative AI model, and you can have everything in the same place, mix images together, do whatever you want, and you don't have to change tools or juggle too many things, because everything is on the same platform. And what I'm showing you here, for example, is that on this platform you can actually do the editing, the inpainting. I'm choosing a zone that I want to get rid of in the image; here, for example, it's a snowy mountain top. I select the zone, I select Nano Banana Pro for inpainting, because it's the best model for this kind of task, and I say, OK, remove the snowy mountain top.
Then it got rid of it.
After that I have my image, and I can do the upscaling part. So what is upscaling? I'm using a generative AI tool named Magnific, which is the best tool for that. By upscaling I mean that it adds more definition to the image, because the problem with GenAI images is that sometimes there's a blur to them; they lack definition and details, and this kind of tool adds the details back.
I recorded my screen, so the quality is not that good here, but when you're making a video this kind of step is quite important, because it really adds definition and details to your shots, to your scenes.
And then I'm using the best model, from my point of view, for video, which is Veo 3, the model from Google, to generate one of my scenes. It's fast-forwarded here because it's quite long: it takes four minutes to generate eight seconds of video. That's one of the flaws of GenAI video.
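To give a sense of what that generation time means for a full video, here is a small illustrative Python sketch. The four-minutes-per-eight-seconds figure comes from the talk; the number of attempts per clip is my own assumption, since, as discussed later, the first generation is often not on point.

```python
# Rough wall-clock math for GenAI video generation, using the figure
# from the talk: about 4 minutes of generation per 8-second clip.
# The attempts-per-clip value is an assumption for illustration.

GEN_MINUTES_PER_CLIP = 4
CLIP_SECONDS = 8

def generation_minutes(video_seconds: int, attempts_per_clip: int = 1) -> int:
    """Total generation time for a video of the given length."""
    clips = -(-video_seconds // CLIP_SECONDS)  # ceiling division
    return clips * attempts_per_clip * GEN_MINUTES_PER_CLIP

# A one-minute video at two attempts per clip: 8 clips, 64 minutes
# of pure generation time, before any editing.
print(generation_minutes(60, attempts_per_clip=2))
```

This is only the model-side waiting time; prompting, inpainting, upscaling, and editing come on top of it.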
But then I have a shot that can be used in my final production, and I didn't write any prompt. For some cases and some images, though, it's actually necessary to write a prompt. For example, here the image you see is a statue. If I upload the image and just ask for a video from it, what's going to happen is that Veo 3 is going to think it's a character, so it's going to move, maybe it's going to speak, and I don't want that; the model cannot understand on its own that it's a statue. So in this case, for this kind of image, I'm writing the prompt.
So what I'm doing is opening a window where I use Claude Opus, and I'm writing the prompt for the video model. Then I'm opening another window in order to have a prompt that is ready to be used right away. When it's done (it's in fast motion here), I copy and paste it, I open the window for the video step, I choose Veo 3, and I paste in the prompt. And for this kind of image, I then have a video that is ready to be used, a scene that is on point.
So now that it's done, the next step is editing. I'm not going to dive into video editing, but what happens is that I just put all the scenes together. I'm not that fond of video editing, so I'm just using CapCut, because it's a really easy video editing software. And for the text you've seen, it's just regular design: I'm choosing a font on the internet that suits the aesthetic of my videos. So that was my image-to-video workflow in its entirety.
So, the cost of all that. I have this subscription for the LLM. There's a subscription for Midjourney; also important, another reason I'm using Midjourney separately, apart from what you've seen with Flora, is that Midjourney is really a sovereign model, which means Midjourney doesn't want this kind of platform to use their model, so you have to use it on its own.
This is the editing software; I put it in parentheses because, like I said, I'm using Photoshop, but it's quite expensive and you can use any editing software, there are tons of different software that exist for that, like Canva, the web-based app. Then there's Flora: it's roughly fifty dollars a month, and you get fifty thousand credits on Flora. Why am I saying that? Because every generation costs credits, which means that, for example, if you want to generate a text on Flora, it's 30 credits, and if you want to generate a video, it's 1,500 credits.
So what happens, and that's really the important part of my presentation, is that with GenAI it's really trial and error, trial and error, trial and error. Sometimes you may not get what you want, even with a prompt you think is on point, so you might have to rerun the same generation with the same prompt, and what happens is that a regular scene of eight seconds ends up costing you something like 4,000 credits. It goes pretty fast, and that's why I say the cost is way higher if your videos are over a minute and a half, or if you're making many videos every week or every month; the cost is not going to be the same. But at the moment, that is the price I'm paying per month.
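To make the trial-and-error economics concrete, here is a small illustrative Python sketch using the figures from the talk: roughly $50 a month for 50,000 Flora credits and about 1,500 credits per 8-second video generation. The average number of attempts per usable scene is an assumption; the talk's "about 4,000 credits per scene" corresponds to somewhere between two and three attempts.

```python
# Illustrative budget math for a credit-based GenAI video platform.
# Figures from the talk: ~$50/month buys 50,000 credits, and one
# 8-second video generation costs ~1,500 credits. The attempts value
# models trial and error and is an assumption.

MONTHLY_CREDITS = 50_000
CREDITS_PER_VIDEO_GEN = 1_500
SECONDS_PER_GEN = 8

def credits_per_usable_scene(avg_attempts: int) -> int:
    """Each usable scene may take several generations before it's on point."""
    return CREDITS_PER_VIDEO_GEN * avg_attempts

def scenes_per_month(avg_attempts: int) -> int:
    """How many finished scenes the monthly credit budget covers."""
    return MONTHLY_CREDITS // credits_per_usable_scene(avg_attempts)

def max_video_seconds(avg_attempts: int) -> int:
    """Total seconds of final footage the budget can produce."""
    return scenes_per_month(avg_attempts) * SECONDS_PER_GEN

if __name__ == "__main__":
    for attempts in (1, 2, 3):
        print(f"{attempts} attempt(s)/scene: "
              f"{scenes_per_month(attempts)} scenes, "
              f"~{max_video_seconds(attempts)} s of footage")
```

At two to three attempts per scene, a month's credits cover only one to two minutes of final footage, which is why the speaker notes the cost climbs quickly past the ninety-second mark.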
To go beyond: Flora is really an amazing tool, but there are tons of alternatives to Flora that exist. There's Higgsfield, OpenArt, Artlist, LTX, Krea, ComfyUI... I did not test all of them, because what I want to do is create stuff, create my videos, so I'm not testing each and every one. I tested some of them. Some have really cool features compared to Flora, but it's just to tell you that there are really a lot of competitors in that kind of field.
And for example, on Higgsfield, there's no prompt involved in the process. You just go to the website and upload one of your images. This is a different project, but this is an image I uploaded on Higgsfield, and you get nine shots of different angles of the same image, just with GenAI, with no prompt involved. So that's just to show you the amount of, I would say, possibilities and the number of things you can do with this kind of web app, this kind of tool.
So my advice on this kind of matter, and my conclusion, is that there's really a lot of trial and error. For example, if you have people in your company, or you work with people who are doing ads, doing creative work, doing videos, and who are using GenAI, you should give them some time to actually master the tools. It's not even mastery, it's just getting your hands on that kind of stuff, and it takes time; you have to have time at your disposal. And my real conclusion is that in this age of generative AI, you can do, well, not anything you want, but when it comes to creative projects, websites, anything... I'm not used to making videos, but now I can really realize my own ideas, and if you have these kinds of creative ideas in your mind, you can actually do them now.
So thank you for your attention, and now I'm open for questions.
I have one. First of all, amazing stuff, great videos, I really love it. Just for my understanding: why do you use Photoshop? Because you could theoretically do this editing in Flora as well. Are you using it because you want to spare some credits, or is there...
I didn't say it, but basically it's just that Photoshop is more precise. There's more precision to it, so that's why I'm using Photoshop. But for small modifications on Flora, the Nano Banana Pro inpainting is quite good. It's just that I want precision in some of my work, so that's why I'm using Photoshop.
But as I said, there are tons of different editing software for this kind of task.
Could you do the editing in Midjourney as well, or what is the reason...
Like I said, Midjourney is really good at generating stuff. When it comes to editing, it's not bad, but it's not as precise as Photoshop, and I'm used to Photoshop: I've been using it for four or five years now, with the Generative Fill, where you can modify anything you want. So I'm used to it, but you can do things in Midjourney. For the videos, though, Midjourney is not that good; it's really not a good model.
So how many videos are you making with this budget of 70 to 80, 80 to 90?
Well, this project is more recreational, so I don't have a set pace for my content production. At the moment I've made three or four of them, so it's kind of new for me, but I have plenty of ideas, many things I want to do, even longer projects. I even have fun projects around bands; I really enjoy video games, stuff like that, and those will take more time. But at the moment I've just made three to four videos. I've been used to GenAI for two to three years now, but more for static stuff, like images, like carousels on Instagram. Video is quite new for me, in some sense.
So your professional background is in this field?
Yes.
Originally I'm a communication officer, specialized in design and content, writing content and stuff like that. And for approximately two to three years now, I've been a GenAI consultant.
I act as a GenAI consultant for some companies. For example, there's a company called Groupe Seb in France. I've worked with them on very precise things. When they release products, you know, pans, cooking robots, stuff like that, there are recipes that come with them. There are blog articles with recipes, and the photos were all old, made ten years ago, and they didn't want to pay for photographers again. They just wanted to keep the images, the recipe photos, up to date. So my role was basically to cut the cost by introducing generative AI, and I used Midjourney for that, with what you've seen, the style references, stuff like that. That's one example of the things I do in the professional field; there are many more, but it's a good example. I like to give it because it's one of the best to give.
So you're really using it for your professional work?
Yes. It doesn't come from a very different field; it's just that when you're a creative person, especially in creative jobs, since, I would say, five to ten years ago and even before that, you were specialized in something: you were an images person, or a video editing person; there were really specialized roles. Now it's more, I would say, blurred: you can actually go back and forth between the different roles. For example, if you're working in cinema, in movies, doing video editing, now you can even make movies by yourself, basically. There are really unbelievable examples of people making movies now, 10-to-15-minute episodes, entirely with generative AI. Is it good or is it bad? That's another subject, because there's so much content now, and a lot of it is kind of slop; there's no soul to it. But anything is possible, like I said in my conclusion. More things are becoming possible.
I'm really curious, from what Reggie was saying earlier: is it possible to share what your take was when you talked about it?
What, my background?
No, no, the...
What you were asked last time.
Yeah. Your opinion.
Ah, that's... Yeah, it's really a different matter. I couldn't do an entire presentation on that; it would be too long. No, it was just... I don't want to spoil it, but after I'm done with my presentation, Reginald is going to ask you questions about a topic in AI. It's going to be kind of a debate, and I just gave an introduction and went down kind of a tunnel; I was really deep into it, because I'm interested in AI as a whole and its implications. So that's how it happened, basically, without diving into too much detail, but that's what happened.
So, can I bring it back?
Let's come back to it.
I guess I could ask you one question. What surprised me the most, based on the conversation we had, was when I discovered what you were doing, because you're doing this also in your spare time. You were talking about the evolution of the job for creatives...
You know, it's really interesting, because I'm asking myself that question almost every day. Almost every day there's a new release, like this Claude co-worker or anything else that happens, and I'm like, OK, well, maybe I'm going to be your cook, because I like cooking, stuff like that. But at the same time, and we had a conversation about it, you think things are going fast, and then you talk to regular people and you're like, OK, so they don't even know what GPT is, OK, fine. So there's the human world, there's social media, there are trends, there are professionals as well. But one thing's for sure...
Reggie.
One thing is for sure: we can see that teams are being reduced; there are fewer juniors than before. And I'm still kind of a junior, so I was concerned; I was in that problem, I was deep in it. So it's really a complicated matter, and I have different ways of seeing things, but it's an almost everyday questioning.
One more question. The first video that you showed: how much time did it take you, from the first idea, the first image, the first prompt in Claude, to having this video? Because here it looks so seamless, it looks so easy, but how much time did you actually spend?
I didn't dive into that because it's quite difficult to answer, but for example: say you're a marketing campaign specialist and you want to make a video for Facebook ads or Instagram, stuff like that, and you have a very precise brief. You know what you want to generate, you know what the picture will look like, you know the colors, you know everything, and you've already dived into that kind of tool. I would say that a short video, like one minute, can be made in a day of work, OK, maybe a day and a half. It depends, but that's if your brief is really clear and the video is not that long, because if it's three to five minutes, that's not the same thing; but one minute, maybe. Your brief has to be super clear, though, and sometimes you go, oh well, I don't like the idea, so I'm changing direction. When you're doing creative work, that happens a lot.
Yeah, so imagine you made this video with AI and you sold it for a price; if you had done it the old way, how much would you have sold it for?
You would need like four different jobs to do that kind of thing. You would need an animator, you would need someone to draw, someone to do the sketches, the storyboard, all that stuff. So it would be way more expensive than what it is right now. I don't even have the pricing, but it would be way, way higher. Definitely.
In the video world you sometimes price in seconds, so you know, 90 seconds... It could vary, depending on whether you add 3D elements and so on; there was a time they used to tell me it could be 300 francs a second, because of the number of people involved. Because what's funny is that, you know, when you said it takes eight hours to do this: for example, it would take me eight hours just to figure out how to open Photoshop, put something in it, figure out the tools, and do that one thing that you did in one second.
That's what I said: for people who are used to the tools.
Yeah, exactly. That's for people who are used to the tools, because there's a lot of trying. And like I said previously, Flora works on credits. So 8 seconds is 1,500 credits, but the first generation, sometimes it works, but most of the time it's not on point, and I need a second one. So that's actually 3,000 credits. It's way cheaper than 10 to 15 years ago, but it's not free, and the generation takes time. As you've seen, it's almost 5 to 10 minutes to generate stuff, and if you want to generate an entire movie with this, that's another matter. Yeah, but like I said previously, now there's more content than ever, but good content, that's another matter, because, like I said, there's a lot of AI slop. You're scrolling, you dive into a video, and it's just a dog fighting a cat, and you're like, what's the purpose of it? It's funny, but what's the intent, what's the goal? What I'm doing is music clips with music I enjoy, music that touches me, and I even have things planned. I really, really love metal, and I'm even planning on doing video clips for metal music, stuff like that, because I really enjoy it and it touches me. But there's some content where you dive into it and you're like, OK, it's just a cat fighting a dog, and that's all. It's funny, but that's all. So there has never been this much content, but good content is another matter.
There's also another challenge, with what he said about being specific about something he wanted to edit in one of his images. Depending on who you work for, if you work in an agency environment, the bigger the brand, the more demanding the client. So even if they have an understanding that, oh, we think it should be cheaper because you're using AI, the person in the communication department, or in brand management, or in the brand strategy department, or wherever (because there are levels, right), is going to look at a video and then have expectations. If you've ever tried, whoever has tried to generate an image on ChatGPT, OK, whoever has tried to edit one thing, one thing on that image... The last time I was that frustrated by something, it was a Spotify brief. That was the one time I looked at something and I was like, why did you change the whole concept? I said take out the S. And I tried for a long time.
For example, I was doing these posts for LinkedIn to find speakers, and it was great. I was just like, just put the logo in. And I put the logo in the, you know, attachment. That level of frustration is what people don't understand, first of all, about what they're paying for with professionals. And even when they understand it, the person who's paying is rarely the one actually working with the professional. So I think the hardest thing, and that's the reason I wanted your talk, is that for me the hardest topic about AI is how we value things. Having been in services for 20 years, selling services as a marketing manager for years: most non-designers, non-developers, people who don't deal with service providers, don't value service. In the sense that what they value is that they're not doing it themselves; they value that they're paying for an output.
But I'll be honest, for me there's very little difference from when you go to McDonald's: you're at the screen, you choose something, you pay for it, and you wait at the counter. It's sometimes the same thing, because the expectation now is that it should be easy. And I think that's a real debate, and culturally, I think even in Geneva it's an even bigger debate: it's very hard to justify your expertise and get someone to pay for it if you're not like, oh, but I work for Nike, I work for UBS. So that's also why I wanted not just your opinion, but also for you to show what you do, because artistically he's not monetizing this. Unless someone says, oh, I want to buy it, I want the full version, he's doing it out of passion. That's the difference: he does it in his own time, unless someone commissions him to do something different.
It's the best of both worlds, because at the same time it allows me to keep up to date with what's happening in the GenAI field when I do stuff like that. So it's not innocent in some way; I'm doing it because it serves me. But at the end of the day, at the beginning, I'm doing it because I have an idea. An idea comes and I want to release it. I want to launch it. That's the main reason.
Just one last question, I think, back there. You wanted to ask something.
I was curious: why are you using Claude Opus? It seems to me that the context was small, because there are...
OK, so I didn't dive into that, but basically there's a structure that works really well for video generation models. It's in three words; I don't know how to say them well in English: ancrage, dynamique, cinématique. Ancrage (anchoring) is what's in the picture, dynamique (dynamism) is what's happening, the movement, and cinématique (cinematic) is the camera side: is there some blur to it, is it photorealistic, stuff like that. That's the best structure at the moment for GenAI video models.
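As an editor's illustration of that three-part structure, here is a minimal Python sketch. Only the anchoring / dynamism / cinematic breakdown comes from the speaker; the dataclass, field names, and the example wording (the statue scene mentioned earlier) are my own assumptions.

```python
# Minimal sketch of the three-part video-prompt structure described
# in the talk: ancrage (anchoring), dynamique (dynamism), cinematique
# (cinematic). The class and the example text are illustrative only.

from dataclasses import dataclass

@dataclass
class VideoPrompt:
    anchoring: str   # ancrage: what is in the picture
    dynamism: str    # dynamique: what happens, the movement
    cinematic: str   # cinematique: camera, blur, photorealism, style

    def render(self) -> str:
        """Join the three parts into one prompt string."""
        return ". ".join([self.anchoring, self.dynamism, self.cinematic]) + "."

# Hypothetical example: the statue scene, where the prompt must tell
# the video model that the subject stays static.
prompt = VideoPrompt(
    anchoring="A stone statue of a knight stands in a misty courtyard",
    dynamism="The statue remains perfectly still while fog drifts past it",
    cinematic="Slow dolly-in, shallow depth of field, photorealistic",
)
print(prompt.render())
```

Separating the three concerns like this makes it easy to iterate on just the movement or just the camera language between generations, which fits the trial-and-error loop the speaker describes.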
I didn't dive into that, but I learned it by myself, on YouTube: OK, what's the best way to do image-to-video? I learned a lot of things by myself; there are no courses, just YouTube and trying, trying, trying, trial and error, until I get something I want.
Everyone, a big round of applause for Luca.