My name is Matteo Marchione, I'm the IT lead at P&O Innovation, and I'm here with my colleague
Silvia. We have prepared this presentation for you, Exploring AI in Action: a real-world R&D user
journey using WSBI, which basically means we are going to see how we embedded AI into
our platform from an end-user perspective.
And a few words about the platform itself, WSBI.
We started developing WSBI in 2015 by bringing together several sources and
providing a single point of access for information relevant to open innovation.
So we have funding opportunities, funded projects, academic papers, patents and collaborations.
The idea, again, was to provide a single point of access for tasks like a competitor
analysis. Maybe you want to run a competitor analysis and you are interested in a
specific field: you have a search engine, you can use keywords to query the database,
and you get all the academic papers and patents related to a topic, so you can identify
your competitors, benchmark them and see how you are positioned with respect to them.
So it's not just about ingesting the data, it's also about merging the data and creating
connections.

On top of that we then built an analytics platform, which can basically be seen as a
data visualization layer. The idea is that when you query the database you get not only
the items but also aggregated information: a map showing where the projects are mainly
carried out, topic classification, trends. Basically, information that is ready to be
exploited if you want to run tasks like a state-of-the-art report.
Let's say you want to build a state-of-the-art report: you know that there are projects,
but maybe you need a chart showing their evolution over time, and you already have these
features out of the box.
By following the path of advanced analytics, WSBI has now turned into an AI-powered
platform, which basically means we are embedding a lot of AI to assist research:
understanding the context, understanding the contents, and also building end-to-end flows
which speed up the work of the researchers a bit.

But let's see what we have today. We prepared a case study about advanced techniques in
PET glycolysis. I'm not an engineer, and not a chemical engineer; I'm a mathematician, so
please be patient if you have that kind of background. In short, it is about breaking down
a plastic product, like a water bottle, into its raw components in order to use them to
build a new plastic product. So this is about plastic recycling, something we are all
probably, and hopefully, familiar with. It's about having a better world, and it's probably
something that is easy for all of us to understand.

And it's a real use case. What I mean by real use case is that we usually interact with
the end users of the platform to understand what their main challenges are, and this was
the trigger for a lot of the features that we develop. That's also why I chose this one.
We will see an assistant, or an agent if you prefer, for technology discovery within the
platform, so we will have an LLM embedded into the application. Then, based on the outcome
provided by the LLM, we will choose how to query the database. We will do it in natural
language, and we will see that in the backend this is translated into a complex query.
And then we will use AI again to understand a specific content. It will be a very complex
patent, and we will use the smart summary feature to understand its contents.
And then finally, what is most interesting for me: the innovation evaluation tool, which
leverages the data of the platform to assess the innovation potential of a project idea.
It uses AI as a natural interface with the human, together with several data-driven tools
like machine learning tools, and we will see that.
This is the landing page of WSBI. Here we can see all the datasets that I have mentioned:
the patents, the papers, the projects, and so on.
Let's start from the WSBI agent.
Here we are conversing with an LLM and we can use it for technology discovery.
So let's say I'm going to ask something, okay, a pretty easy question.
Why is it important to have an AI embedded into an application? Just, you know, to avoid
going back and forth between the application and Google, or the application and ChatGPT.
Maybe you don't have a ChatGPT license and you want a reliable LLM to speak with.
Again,
this was one of the challenges that were posed by our end users and this is simply integrated here.
Of course, this step usually requires several iterations, so you want to ask something,
then you want to deep dive.
This is something we are not going to do today.
Let's say we find a topic which is interesting for us, which is microwave-assisted
glycolysis in this case. We move to the specific dataset we are interested in, which in
this case is the patents, as we have said, and we want to query the database. We are
probably not database experts, so we can use this natural language search capability.
The reason why I added PET there is that PET is an ambiguous word: in the past, at least,
it used to give results about the animal world, like veterinary research. We also have
microwave, which is an ambiguous word that could lead to things related to, you know,
household appliances and this kind of thing.
So what we want to see here is that, when we run this query, the system in the backend
translates it into something which is a bit more complex.
What we have in the backend here is an agentic pipeline. First of all, it validates that
the request is not a malicious or injection attempt. Then it understands the context of
the research, extracts the main concepts and enriches them with synonyms. Why is this
interesting? Because, as we can see by quickly scanning the results, everything is now
related to chemical recycling. So why are we embedding AI here? Basically to democratize
access to the data through this platform: not everyone is comfortable with database
queries, and in this way we can simply use natural language to ask things to the database.

At this stage, once we have identified the context of the research and targeted the right
contents, maybe we want to open one of them. Again, we are checking patents, so we
probably have to go through the text of a patent, which is usually long and complex.
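
Going back to the query translation step, the three stages mentioned (validation, concept extraction, synonym enrichment) can be illustrated with a very small Python sketch. This is not how WSBI implements it: the talk describes an agentic pipeline, while the sketch below replaces it with plain lookups. The synonym table, the suspicious patterns and all function names are hypothetical, chosen only for illustration.

```python
import re

# Hypothetical synonym table; a real pipeline would likely use an LLM or a thesaurus.
SYNONYMS = {
    "pet": ["polyethylene terephthalate"],
    "glycolysis": ["depolymerization", "chemical recycling"],
    "microwave assisted": ["microwave-assisted", "microwave irradiation"],
}

# Naive patterns standing in for the malicious-input check (assumption).
SUSPICIOUS = [r"ignore (all|previous) instructions", r";\s*drop\s+table", r"<script"]


def validate(text: str) -> None:
    """Reject input that looks like an injection attempt."""
    for pattern in SUSPICIOUS:
        if re.search(pattern, text, flags=re.IGNORECASE):
            raise ValueError("query rejected: possible injection attempt")


def extract_concepts(text: str) -> list[str]:
    """Return the known concepts that appear as whole words in the text."""
    lowered = text.lower()
    return [c for c in SYNONYMS if re.search(rf"\b{re.escape(c)}\b", lowered)]


def build_query(text: str) -> str:
    """Turn a natural-language request into a boolean keyword query."""
    validate(text)
    groups = []
    for concept in extract_concepts(text):
        terms = [concept, *SYNONYMS[concept]]
        groups.append("(" + " OR ".join(f'"{t}"' for t in terms) + ")")
    return " AND ".join(groups)


print(build_query("microwave assisted PET glycolysis"))
```

Requiring every concept group to match (AND) while accepting any synonym inside a group (OR) is one simple way to keep an ambiguous word like PET tied to the chemical context instead of the veterinary one.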
So what we added here is, again, an AI that generates a smart summary, meaning a summary
of the content itself that provides structured information: a description, the main
claims, and the problem that the patent aims to address.

At this stage, again, this is usually an iterative process: you want to understand the
context, so you check patents, check papers, check projects. At some point you come back
to your team and maybe you have an idea about how to move forward with respect to this
patent, or to the state of the art in general, and you're not really sure that your idea
is a good one in terms of eligibility. What we are working on now is a workflow, which I
ran previously since it takes two or three minutes and I was sure we wouldn't have the
time to do it live. Basically, we have this Workflow section in which you can access the
innovation evaluation tool: you insert the project idea, you press run, and after about
three minutes you get an assessment. The assessment is based on the data of the platform:
it compares your idea with a benchmark of projects and papers, and it provides an overall
innovation score and a radar chart which displays the different drivers.
Basically, for us something is innovative if it is original, so the novelty driver
measures how much this idea differs from what we have in the database; if it is aligned
with trends, or better, whether the topics of this idea describe increasing or decreasing
trends; and if the idea is useful. What does useful mean for us? It means that there are
properly budgeted funding opportunities which suit this idea, and what its impact is.
What the LLM is doing here is basically providing just the natural language interface,
so the analysis itself is not performed by the LLM. We have other analytics tools, which
leverage machine learning or statistics, to get the numbers, and then the LLM simply
describes the outcome of the analysis to the user.
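
This separation between computing and explaining can be sketched in a few lines of Python. It is only an illustration under stated assumptions: the function names, the result fields and the `echo_llm` stand-in are hypothetical, and no real model is called. The point is that the numbers are produced by a deterministic step and the model receives them as facts to describe.

```python
from statistics import mean
from typing import Callable


def analyse(benchmark_scores: list[float]) -> dict[str, float]:
    """Deterministic step: every number shown to the user is computed here."""
    if not benchmark_scores:
        raise ValueError("benchmark set is empty")
    return {
        "overall_score": round(mean(benchmark_scores), 1),
        "benchmark_size": len(benchmark_scores),
        "lowest_item_score": min(benchmark_scores),
        "highest_item_score": max(benchmark_scores),
    }


def build_prompt(results: dict[str, float]) -> str:
    """Hand the finished numbers to the model as facts to be explained."""
    facts = "\n".join(f"- {name}: {value}" for name, value in results.items())
    return (
        "Explain the following analysis results to a researcher. "
        "Use only the figures listed; do not compute or invent new ones.\n" + facts
    )


def explain(results: dict[str, float], llm: Callable[[str], str]) -> str:
    """The LLM is only the natural-language layer on top of the results."""
    return llm(build_prompt(results))


def echo_llm(prompt: str) -> str:
    """Stand-in for a real model call (hypothetical); it just echoes the prompt."""
    return prompt


results = analyse([30.0, 40.0, 50.0])
print(explain(results, echo_llm))
```

Because the figures never originate in the model, they stay auditable: the same input always gives the same score, whatever wording the model chooses.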
And let's see how a couple of drivers work, and then we move ahead. How do we assess
whether something is original? We perform a similarity search against the database and we
retrieve the most similar projects, papers and patents we have. The similarity is computed
using vectorizer models, and we basically combine how similar or dissimilar an item is to
the idea with the year of the item, so basically how much time has passed since it was
released. This gives a score for each item in the benchmark set, and the average score of
the benchmark set is the score that our idea gets. In this example we have 40, so it is
not that innovative: it seems to be innovative when compared to the patents, but we
definitely have a lot of projects in the same area. So this is probably something that we
want to improve, iterating again through the process I described before.
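
As a rough illustration of the novelty driver, here is a minimal Python sketch. The talk does not give the actual formula, so the way similarity and age are combined below (cosine similarity damped by an exponential recency factor with a `half_life` in years) is purely an assumption, as are the `Item` class and all parameter names. What it does take from the description is the overall shape: retrieve the most similar items, score each one, and average the benchmark set.

```python
import math
from dataclasses import dataclass


@dataclass
class Item:
    title: str
    year: int
    vector: list[float]  # embedding produced by some vectorizer model


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def novelty_score(idea: list[float], items: list[Item], current_year: int,
                  k: int = 3, half_life: float = 5.0) -> float:
    """Average per-item score over the k most similar items, on a 0-100 scale."""
    if not items:
        raise ValueError("nothing to benchmark against")
    # Benchmark set: the k items most similar to the idea.
    benchmark = sorted(items, key=lambda it: cosine(idea, it.vector), reverse=True)[:k]
    scores = []
    for item in benchmark:
        similarity = max(0.0, cosine(idea, item.vector))
        age = max(0, current_year - item.year)
        recency = 0.5 ** (age / half_life)  # 1.0 for this year, halves every half_life
        # Assumed rule: a similar AND recent item lowers novelty the most.
        scores.append(100.0 * (1.0 - similarity * recency))
    return sum(scores) / len(scores)


items = [
    Item("recent, same topic", 2024, [1.0, 0.0]),
    Item("old, same topic", 2014, [1.0, 0.0]),
    Item("unrelated", 2024, [0.0, 1.0]),
]
print(novelty_score([1.0, 0.0], items, current_year=2024))
```

With this rule an idea surrounded by many recent, near-identical projects scores low, while the same neighbours from ten years ago penalize it much less.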
Trend alignment: again, the question here is whether this idea is aligned with trends
which are increasing, and again this is computed by benchmarking against the data in the
database.
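
One simple way to decide whether a topic is on an increasing or decreasing trend is to fit a straight line to the number of items per year and look at the sign of the slope. The talk does not say which method WSBI uses, so the least-squares fit, the `tolerance` threshold and the function names below are assumptions made for illustration only.

```python
def trend_slope(counts_by_year: dict[int, int]) -> float:
    """Least-squares slope of yearly counts (items per year, per year)."""
    years = sorted(counts_by_year)
    if len(years) < 2:
        raise ValueError("need at least two years to estimate a trend")
    mean_x = sum(years) / len(years)
    mean_y = sum(counts_by_year[y] for y in years) / len(years)
    covariance = sum((y - mean_x) * (counts_by_year[y] - mean_y) for y in years)
    variance = sum((y - mean_x) ** 2 for y in years)
    return covariance / variance


def trend_label(slope: float, tolerance: float = 0.5) -> str:
    """Classify a slope; the tolerance is an arbitrary illustrative threshold."""
    if slope > tolerance:
        return "increasing"
    if slope < -tolerance:
        return "decreasing"
    return "stable"


# Hypothetical yearly counts of items matching one topic of the idea.
counts = {2020: 10, 2021: 20, 2022: 30}
print(trend_label(trend_slope(counts)))
```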
So here we can see the distinction between the data itself, which comes from the database
and is reliable, auditable and basically trusted, and what the LLM is doing, which is just
explaining the content of the data. So it's a natural language interface to the data
itself.

Let's switch to the last part. Once we have understood that our idea is maybe original
and is probably aligned with increasing trends, we search for funding opportunities.
Then we switch to this third tab, in which the system suggests calls which could
fund this kind of activity. Of course these are also part of the database, so the
user can select one and then check whether it is aligned with the scope of the project.
And well, this is all that I wanted to share. To give a very quick summary, we used AI in
several ways. We started from an assistant, so again, having a sparring partner to
converse with about technologies. Then we identified a research topic and used AI to
properly query the database. Then, when we analyzed a specific content and wanted to
understand it better, we used AI for that. And finally we have these big end-to-end
workflows, which leverage machine learning, AI and analytics techniques to provide
explainable information.
I know it was a lot in a very short time, I hope it
was clear and yeah please.