So my name is José Antonio Silva. I live and I work here in Porto.
I lead the research team at Techscope. We are a Microsoft integrator, and most of our customers know us for our Microsoft experience.
So what I'm trying to bring you here today are some lessons learned from applying our AI skills, and some of our products, to make sure the adoption curve is easier for these projects.
So as a small introduction, as I said, we are a group here in Porto. We are close to 200 people.
Only 12, I think, are on the AI group; the rest of the teams are spread around products and consulting on analytics. That experience with analytics has actually forged our background: we started doing a lot of SharePoint 20 years ago, later we started doing PerformancePoint and scorecards, and that led us to the world of data warehousing and, more recently, Power BI.
So most of our customers are integrating data. And since we are building some products using AI, we started doing some side projects with analytics using AI skills.
That started with forecasting and anomaly detection, and it led to the creation of this team, which nowadays works more on the OpenAI stuff, LLMs, and computer vision.
What we are doing with data is very important for some of these case studies. So these are some of our experiences of using this kind of technology to process documents.
We started at the beginning with invoices. That was our initial challenge, long before today's neural networks.
I think it was convolutional networks or whatever we were using at the time. From there, we started creating a product to take care of the workflow. It's not about the automatic processing of everything; it's more about the distribution of work between computers and humans. That led to a product, and nowadays we are doing a lot of things with it: not only invoices, but a lot of other documents we extract information from, and not only
PDF documents, but also audio files or video files. One thing we can do with this is extraction of transcripts. I know a lot of APIs and services nowadays produce transcripts automatically, but in specific scenarios you need confirmation or correction of special names, entities, and codes, and these automatic transcriptions are not good enough.
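As a tiny illustration of that correction step, a post-processing pass could look like the sketch below. The glossary entries are made up; a real one would hold the case-specific names, entities, and codes that automatic transcription tends to get wrong, and anything not covered stays for the human reviewer to confirm.

```python
# Hypothetical glossary of known misrecognitions -> corrected terms.
# In practice this would be built per domain (court case names,
# legal codes, product names, etc.).
GLOSSARY = {
    "power b i": "Power BI",
    "performance point": "PerformancePoint",
}

def correct_transcript(text: str, glossary: dict) -> str:
    """Apply known corrections; unknown terms are left untouched
    so the human reviewer can deal with them."""
    corrected = text
    for wrong, right in glossary.items():
        corrected = corrected.replace(wrong, right)
    return corrected
```

This only handles exact matches; a real pipeline would likely add case-insensitive or fuzzy matching before handing the transcript to the review station.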
So we use the same concept, the same distribution of tasks, to accelerate the adoption of these kinds of scenarios.
And nowadays we are doing this for justice, for the courts, to index a lot of the content being discussed in those proceedings. We can do this not only for private meetings, but also for public meetings.
This is very useful for reusing this content correctly, for example on social networks, to share quotes from a public event. So what we are trying to do here is accelerate all this work: even if the AI is not 100% perfect, we are able to start integrating data while allowing the content reviewers to work in parallel. That gives the customer more confidence that they can use the data being processed by the AI.
I think the lesson here, and what we want to share with you, is this concept of the human in the loop and how we can integrate both teams in the process: not only the AI team, but even the end customer working on these projects.
The second case study is not very different. What we are doing here is picking the documents that the municipal council (the Câmara Municipal) wants to make public. But they want to make sure that none of these processes contain private information.
So what we are doing here is, once again, using the available AI APIs, mostly Microsoft's document analysis APIs, and extracting the metadata from the documents.
We show this in the review station; the review station is our UI for exploring these documents. This part is completely metadata-driven, and normally the document shows up here, so it's very easy to change what we want to collect.
And because the AI has probably already collected most of this data, the reviewer's work here is to check that everything is OK. In this case, we want to show them how the anonymization will work, so that radio button over there will hide the information on the document. After this, the next process is to completely remove the indexable text from the image and put in black boxes, to guarantee it's not possible to recover the content that was originally in that PDF file.
These are more asynchronous processes. We can do the same thing for photos. The problem there is hiding license plates, and faces that are not government officials: people attending the event, for example, or photos taken on the street, to make sure we are abiding by the law.
And so this kind of experience is what we've been discussing here.
So the topic of the session is this concept of human in the loop. This is nothing new.
I think the challenge most of the time is getting the right tool for the users. In our case, we are mostly working with Office users, so we don't want to send them to a very specialized annotation tool like Datagram or whatever online tool exists for data scientists. We want to use the tools they are already used to inside their Office 365 experience, for example.
And one of the things we found when we did the project for the courts is this: if we save the captions, the transcripts, back to SharePoint at the end, we get a lot of things for free. For example, Copilot in Office 365 is able to index all these Office documents, recordings, and meetings.
It's actually very easy to build a chatbot to assist any lawyer working on that case; it just has to have permission to read the document, and the rest is done automatically by the products. So what we are doing with this kind of work is working directly in the stores, in the formats, that the rest of the Office family is using. We don't have to build a chatbot project; it's available for them immediately. I think this is something we can take to our projects.
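To illustrate the "save it back to SharePoint and the rest comes for free" idea, here is a minimal sketch of writing a finished transcript into a SharePoint document library through the Microsoft Graph simple-upload endpoint. The site ID, folder name, and token are placeholders, and files over 4 MB would need an upload session instead of this simple PUT.

```python
import urllib.request

GRAPH = "https://graph.microsoft.com/v1.0"

def upload_url(site_id: str, folder: str, filename: str) -> str:
    """Build the Graph simple-upload URL for a file in a site's
    default document library."""
    return f"{GRAPH}/sites/{site_id}/drive/root:/{folder}/{filename}:/content"

def upload_transcript(token: str, site_id: str, folder: str,
                      filename: str, content: bytes) -> int:
    """PUT the transcript bytes; returns the HTTP status code.
    Once the file is in SharePoint, Copilot and Search index it
    like any other Office document."""
    req = urllib.request.Request(
        upload_url(site_id, folder, filename),
        data=content,
        method="PUT",
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "text/plain"})
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

No chatbot project is built here; permissions on the document library decide who (and whose Copilot) can read the transcript.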
If we start using this kind of tool, we can put documents in the review station that we are not even doing anything with yet in AI. We can use them initially for humans to start classifying or annotating documents.
And as we develop the AI models (this happens completely in parallel, with our AI or analytics teams using notebooks, MLOps, Databricks, whatever they use), they can work on a completely different workflow.
Meanwhile, the project can already be deployed and used by end users, and we can start receiving information and annotations as we work. This accelerates the prototyping phase.
We are able to work on quality, because at the beginning we will probably use more human hours and less AI. But as we continue to improve, this addresses a lot of the challenges.
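The routing idea can be sketched in a few lines. The confidence threshold and the shape of the prediction records are assumptions, not our actual product code: early on, a high threshold sends almost everything to humans; as the model improves, more items flow straight through.

```python
def route(predictions: list, threshold: float = 0.9):
    """Split model outputs into auto-accepted items and items for
    the human review queue, based on model confidence."""
    auto, review = [], []
    for p in predictions:
        (auto if p["confidence"] >= threshold else review).append(p)
    return auto, review
```

Every correction made in the review queue can be fed back as an annotation, which is what lets the AI team improve the model while end users are already working with the system.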
And in certain scenarios, for example healthcare, where we cannot use AI directly, preparing everything and allowing the practitioner to evaluate the result is actually the only way of doing it without the proper certifications to use AI in healthcare.
Okay, so the second tip I want to give you: in most of these scenarios, when we start talking with developers, they always think about documents, documents from the beginning to the end.
For example, I remember more than 10 years ago, when we started doing this invoice extraction, we were always thinking: OK, we can start from an image, use the extensible Microsoft document format that you know, the .docx kind of files, and start putting everything inside as metadata. We can enrich the document with a lot of data.
Word documents are able to store XML inside the document. So that was our initial idea: we started with this document, and then we would move it through the stages, depending on the pipeline you want to implement.
You could go through all the steps, and at the end, if you wanted to integrate with SAP, you could also annotate the document with the IDs that were created in SAP, whatever.
What we learned is that these documents can be huge. For example, in the transcript scenario, the audio or the video can be way bigger than the information you want to extract.
So if we separate these two layers, you can have a state document that evolves as the workflow evolves.
And at the end, you produce other documents: when you infer the OCR, or when you finally infer the invoice data, you can store these in different documents.
You write once and read many times. This scales much better in a scenario where you are using, for example, cloud storage for these huge documents.
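A minimal sketch of that separation, with made-up URIs: the state document stays small and mutable as the workflow advances, while each produced artifact (the OCR output, the extracted invoice data) is written once to storage and only referenced by URI.

```python
def new_state(source_uri: str) -> dict:
    """Small, mutable state document; the large source payload
    stays in cloud storage and is only referenced here."""
    return {"source": source_uri, "stage": "received", "artifacts": {}}

def record_artifact(state: dict, name: str, uri: str, stage: str) -> dict:
    """Write-once artifacts: store a reference, never the bytes,
    and advance the workflow stage."""
    state["artifacts"][name] = uri
    state["stage"] = stage
    return state
```

The state document might only be a few kilobytes even when the source is a multi-gigabyte video or medical exam, which is what makes the read-many side cheap.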
If this is an exam from a PACS system, a Philips machine in a hospital, it can be like 40 gigabytes for just one exam. And if you're doing the review here, you're probably using only some parts of that document to explore the exam; then you produce the results and integrate them.
Separating tasks from documents is also a very good approach to reduce the impact of using these kinds of solutions.
So from here, I'm done. I don't know if you have any questions, but this is...