This is my second time. A couple of months ago, I attended the first event.
It's a shame that I never had a chance to attend the equivalent event in Oxford.
Hopefully, we will be organizing more Oxford-Cambridge events in the near future.
I'm saying this because I started my PhD at Oxford Computer Science in 2010, and we had a tradition of organizing joint events with Cambridge in the context of natural language processing, the topic I did my PhD in.
So I have multiple hats, but today I just want to talk about a recent project that we started.
And I will make it as practical as I can. I will show a demonstration at the very end.
At Pembroke College, I'm working with historians and lawyers on a quite interesting project called the Quill Project. We're trying to apply AI to constitutional documents.
Here, I'm teaching AI literacy. I supervise more than 30 dissertations on several generative AI topics, including emotion recognition,
multimodality, and multilingualism.
Caloria is a premium AWS partner globally, and I'm leading the AI team there.
I finished my PhD eight years ago. We have a spin-off company licensed through Oxford University Innovation. Last summer, in 2025, we started TradeComply.
I've been studying use cases in finance for a long while, and finally we started on a specific one: automating the procedures in international trade.
Now we have a product out of this. It was seed funded, and I will give a demonstration of it.
You don't need to be an expert in international trade; I will quickly explain the basics.
There's a huge problem, as you might assume. This is the total volume of compliance spending globally, and the amount of fines for AML and sanctions violations.
Before the demo, here's a very simple overview, but the concepts are important. The goal is to automate the manual processing of letter of credit documents. There are dedicated teams, at different scales, at different banks, and it takes these people days to process the documents. I will show an example document in a moment.
This is a SWIFT message. It's like a constitution: it declares the details of the trade and the rules between the two parties, and experts at these banks need first to understand this message and then to check a series of documents manually.
The documents I'm talking about look like this. This is a randomly selected document.
It's a trade between a company based in Saudi Arabia, trading some chemical products to another company in Istanbul, Turkey.
That's why it's a mixture of Arabic expressions, logos, text, semi-structured information, stamps, and signatures.
In the ideal case, at least two experts review these documents line by line to check whether each document complies with the SWIFT message that I just showed, and also with international rules like UCP 600.
This is an example of UCP; it's actually the UCP 600 document, the sixth revision since 1933, so these kinds of international rules have been applied for a long while.
Let me show you a few of the constraints. This is one rule out of the more than 30 rules. Again, this is like the constitution, and it says that a document may be signed by handwriting, by facsimile signature, and so on. Experts need to be aware of this documentation and of the SWIFT document, and using this knowledge, they need to manually check, verify, and approve authentic documents like this one.
So in the ideal case, we needed to develop a system that can work in an unsupervised way, with no information coming from the company that is issuing the document. We need to detect whether there's a signature over a stamp, so we need to separate them. We need to understand the table and the information in it, character by character, with, in the ideal case, 100 percent accuracy at the first layer, and then, using this information, we need to take some actions.
This is the second document. It lists the items being traded, and we need to understand these metrics, the specific names, and all these details.
So, a question: what would be the ideal way to design a system that can understand hundreds of documents like this, to assist, not replace, the expert teams at banks?
Thanks.
What technology would you offer or suggest? Yes, please.
LLMs. Perfect. Yes. Bingo, LLMs.
But one or multiple? Open source or commercial? On-premise or cloud? Any more details? Okay. Excellent.
Yes, excellent, perfect. Any more comments?
Yeah, there are some standards built in there that may not be available to open-source models. They may need some technical documentation, so we must include that reference as well. Okay.
Then we can match against that standard, and if the standard is not matched, the specification is wrong. Yeah.
So you're saying there will be a language model, but it may not be open source because this is a paid standard. Okay, there's a constraint then.
And if it's not open source: in some countries, especially these two countries, you can't disclose information to the cloud, so it should run on-premise. Okay.
So I'm not saying that everything should be open source. In the ideal case, it can be a mixture of classical methods like OCR, traditional machine learning methods, definitely LLMs as you suggested, and maybe other techniques like image processing and text matching. Just to say it again, we have no prior information about these companies in a database. As you mentioned, multilingualism is another problem: if we assume there are 200 countries in the world, there can be trade between any of them, so these documents can be in any language, and we also need to detect symbols, and so on. Okay.
There is some academic work on this, by the way. Putting on my academic hat, I did some research on it.
People are trying to solve the same problem as we're trying to do.
This is just the traditional flow of a letter of credit transaction. There are two parties, the importer and the exporter, with a contract between them. In the next step, the importer applies to the issuing bank, and that bank sends this message to the other bank in the other country, because the importer and exporter are obviously in separate countries.
I just wanted to point out that this is a very important document. We need to give our solution, and I'm not saying it's only an LLM; it can be a hybrid solution, with all these details.
So this document is under the latest UCP version, the one I just showed you. There are details like the date and place of expiry, the beneficiary and applicant, the currency, and partial shipments.
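To give a flavor of what "understanding this message" involves, here is a minimal sketch of splitting a SWIFT MT-style message into tag/value fields, such as field 31D for the date and place of expiry. This is an assumption-level sketch: the tag grammar is simplified and the message fragment below is invented, not the real MT700 specification.

```python
import re

def parse_mt_fields(message: str) -> dict:
    """Split an MT-style message body into {tag: value} pairs.

    Fields start with ':<tag>:' at the beginning of a line; a value may
    continue over several lines until the next tag appears.
    """
    fields = {}
    # Simplified tag grammar: two digits plus an optional letter, e.g. 31D, 59.
    pattern = re.compile(r"^:(\d{2}[A-Z]?):", re.MULTILINE)
    matches = list(pattern.finditer(message))
    for i, m in enumerate(matches):
        start = m.end()
        end = matches[i + 1].start() if i + 1 < len(matches) else len(message)
        fields[m.group(1)] = message[start:end].strip()
    return fields

# Invented fragment in the spirit of an MT700 letter of credit:
msg = """:31D:250630 ISTANBUL
:50:APPLICANT TRADING CO
:59:BENEFICIARY CHEMICALS LLC
RIYADH"""
parsed = parse_mt_fields(msg)
print(parsed["31D"])  # the date-and-place-of-expiry field
```

Note that the multi-line value for tag 59 is kept together, which matters because names and addresses routinely wrap across lines in these messages.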
This is very important as well, and it makes the problem more challenging: in some cases, the shipment must not pass through certain countries. So we should be aware of this as well. Again, we designed a solution for this, and I will show it to you in a moment.
In the follow-up research paper to this one, they propose a solution: implementing a pipeline in the middle, so that all these transactions can rely on this solution at the center. They propose a hybrid human-in-the-loop AI integration, with central bank conditions, country-specific laws, past cases, and AI checkers. After integrating all of these, if there are discrepancies, expert human checkers give the finalizing decision, and then the process continues.
So, about what we've done: on purpose, we never aimed to replace the experts. But there's a problem: the younger generation is not that keen on this kind of job, and it takes at least three or four years for someone to reach even the bare minimum level needed to deal with these documents.
So last summer with a small team, we started to develop this solution.
I will select this SWIFT example, the exact document that I showed you, and then I will add the other documents. Normally, users upload these documents as a bundle, which brings another challenge: when I hit the button, in the ideal case, it does the classification, and if I upload a bundle, it also needs to segment the documents.
So the classification is the first step. Then we need to extract all the key information from documents like this.
Again, this third document, a certificate of origin, has a slightly different template from the others, as you can see; there's no standard for these documents. They can be in Portuguese, Korean, anything. Multilinguality matters a lot if you want to propose a solution that scales.
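The classification and segmentation steps just described could be sketched roughly as below. Both the keyword-based classifier and the "a new document starts when the predicted type changes" heuristic are simplifying assumptions for illustration; a production system would use trained models or an LLM/VLM instead.

```python
from dataclasses import dataclass, field

@dataclass
class Page:
    text: str

@dataclass
class Document:
    doc_type: str
    pages: list
    fields: dict = field(default_factory=dict)

# Hypothetical keyword-to-type mapping, stand-in for a real classifier.
DOC_TYPES = {
    "certificate of origin": "certificate_of_origin",
    "bill of lading": "bill_of_lading",
    "insurance": "insurance_certificate",
}

def classify(page: Page) -> str:
    text = page.text.lower()
    for keyword, doc_type in DOC_TYPES.items():
        if keyword in text:
            return doc_type
    return "unknown"

def segment_bundle(pages):
    """Split an uploaded bundle into documents: naively, a new document
    starts whenever the classified page type changes."""
    docs = []
    for page in pages:
        doc_type = classify(page)
        if not docs or docs[-1].doc_type != doc_type:
            docs.append(Document(doc_type, []))
        docs[-1].pages.append(page)
    return docs

bundle = [Page("ORIGINAL BILL OF LADING ..."),
          Page("BILL OF LADING, PAGE 2 ..."),
          Page("CERTIFICATE OF ORIGIN ...")]
print([d.doc_type for d in segment_bundle(bundle)])
```

The extraction stage would then fill each document's `fields` dictionary with the key information described above.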
We extract information and verify it, but verifying one document is not enough; we also need to do cross-verification. If you extract some information from a specific document, it should comply with the same type of information in the other documents as well. I've been working on NLP for 20 years, and there are a few specific topics here, like paraphrased expressions: dealing with abbreviations, numerical expressions, dates, temporal expressions, and financial expressions. These documents are full of this kind of information, so we need to deal with that type of paraphrasing as well.
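A cross-check can only compare fields fairly after the paraphrased surface forms are normalized, since the same date or amount may be written differently in each document. A minimal sketch of that normalization, assuming a small hand-picked set of date formats rather than the full range a real system would handle:

```python
import re
from datetime import datetime

def normalize_date(text: str) -> str:
    """Try a few common date paraphrases and return ISO format."""
    for fmt in ("%d %b %Y", "%d %B %Y", "%Y-%m-%d", "%d/%m/%Y", "%y%m%d"):
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            pass
    return text.strip()  # fall back to the raw string if no format matches

def normalize_amount(text: str) -> str:
    """Strip currency symbols, codes, and thousands separators."""
    digits = re.sub(r"[^\d.]", "", text)
    return f"{float(digits):.2f}" if digits else text

# "30 Jun 2025" and "2025-06-30" now compare equal, as do the two amounts.
print(normalize_date("30 Jun 2025"), normalize_amount("USD 1,250,000.00"))
# → 2025-06-30 1250000.00
```

With fields normalized this way, cross-verification reduces to comparing the same field name across documents and flagging mismatches as discrepancies.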
Finally, it generates a verification report, and in a moment the cross-check will happen. Everything running behind the scenes runs on-premise; this front-end is just for demonstration purposes.
Give me 10 seconds. Meanwhile, I will show the other types of documents as well. This is a cargo insurance certificate.
We don't know this company; we have no idea of the template or the information in different fonts. We need to detect all of this properly, including the signature. The cross-check validation takes some time.
Another one, an original bill of lading, with too much information. Even these pages bring some problems. And another one.
For the cross-validation, we create all the combinations of the different documents. If you have hundreds of documents, you can compute what the complexity of the total calculation would be. And it's done.
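Cross-checking every unordered pair of documents means C(n, 2) = n(n-1)/2 comparisons, which grows quadratically with the bundle size. A quick sketch of that combinatorics with illustrative document names:

```python
from itertools import combinations
from math import comb

# Illustrative document types from a single letter of credit case.
documents = ["swift_mt700", "commercial_invoice", "packing_list",
             "certificate_of_origin", "insurance_certificate",
             "bill_of_lading"]

# Every unordered pair is cross-checked once.
pairs = list(combinations(documents, 2))
print(len(pairs))        # 15 pairs for 6 documents
print(comb(100, 2))      # 4950 pairwise checks for 100 documents
```

This quadratic growth is why the cross-check step in the demo visibly takes longer than the per-document extraction.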
Thank you, cheers. I will jump to the positive part at the end.
So we have several outcomes. So this is a report.
After the first iterations, we started to categorize the issues. The first category is document validation issues.
So two out of six documents passed successfully, and for the remaining ones we obviously need to give the reasons. This is yet another problem: making it transparent.
It shows the document and the details of the issue, and it also refers to the context, like why we're having a problem with the signature, seal, or imprint.
The second problem is a missing signature; the third is showing no evidence of being signed. These are similar, and we have more detailed reasons for them.
The second category is compliance. According to field 31D, the letter of credit expired prior to presentation, plus some other details.
The cross-check report shows the problems, like inconsistencies between documents.
We also show the correct information that we are detecting properly. This is because, when we do demonstrations of the solution, if you don't show this, people wonder what the system is actually catching in practice, so these are the pieces of information that we detect.
As far as we know, there is no existing solution working similarly to this. Now I also want to show you some open problems that we need to solve in the upcoming sprint plannings.
So yes, it's working properly, but still: the UCP is not changing that frequently, but the other contexts change frequently, so this is an important issue for us.
For regulatory change tracking, our RAG system, or RAG systems in general, should be aware of these changes.
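One simple building block for that kind of change tracking, sketched here as an assumption rather than our actual mechanism, is to hash each regulatory source and re-index only the sources whose text has changed since the last run:

```python
import hashlib

class RegulationWatcher:
    """Track content hashes of regulatory sources and flag any whose
    text changed since the last run, so the retrieval index can be
    rebuilt for just those sources."""

    def __init__(self):
        self.hashes = {}  # source name -> last seen SHA-256 digest

    def changed_sources(self, sources: dict) -> list:
        stale = []
        for name, text in sources.items():
            digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
            if self.hashes.get(name) != digest:
                stale.append(name)
                self.hashes[name] = digest
        return stale

watcher = RegulationWatcher()
# First run: everything is new and gets indexed.
print(watcher.changed_sources({"UCP600": "text v1", "sanctions": "list v1"}))
# Second run: only the updated sanctions list needs re-indexing.
print(watcher.changed_sources({"UCP600": "text v1", "sanctions": "list v2"}))
```

A real pipeline would hang re-chunking and re-embedding off the returned list, but the detection step itself can stay this simple.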
Evaluation is very important, so we are now upgrading our systematic evaluation system, so that it can check whether all the information in these documents is detected properly; I'm talking about OCR, the recognition of tables, stamps, and other details. But that's still only one type of evaluation, character-level evaluation, and maybe table-level evaluation; the evaluation is multi-layered.
We also apply general benchmarking metrics here: how many pieces of information exist in the document, and then the usual accuracy and F1 measures on what we detect. The second layer is document-level evaluation, and then whole-case-level evaluation as well. If you're dealing with a bank, an average bank handles around a few thousand cases yearly, so we need enough cases like this to have a deep understanding of how the system is performing.
Out of 10, the difficulty of this specific set of documents is a four, so it's not a challenging case. There are more challenging cases where you need to do more and more verifications, and for this reason everyone asks whether the system works with 100 percent accuracy. Yes, if you're talking about character recognition, but no if you're talking about the evaluation of the whole case.
Another thing is multimodality. As I said, these documents consist not only of text but also of information in other modalities, so we need to be aware of this.
Explainability and trust: the system needs to be informative, and in the ideal case it should learn case by case. This is yet another thing that we're working on.
I hope it's clear.
So this system is being piloted by multiple financial institutions at the moment. It's really challenging to get in.
Even if the whole system runs on-premise, in a financial institution there are at least three different groups doing due diligence for that kind of process: one on security, one on quality assurance, and the AI teams doing their own checks.
Finally, all of our meetings went successfully, so we passed all these stages.
Again, I think the potential for this product is at the global scale; we never designed it for a specific country or for the laws of a specific country. It's designed in a universal way, and it's quite easy to expand to other legislations, other countries, and other types of documents as well.
Thirteen years ago, I was working at a company where I built the R&D team, and we were using an open-source OCR solution. After 13 or 14 years, we're still using the same solution. It works okay, but it's still less than 5-10 percent of the whole pipeline. On top of it, we just added a multi-RAG, multi-agent system with long memory and reasoning ability. So the traditional methods are not gone; they're still part of our solution.
But I would say that open-source VLMs and LLMs, a multimodal architecture, are about to disrupt this specific sector, or these units in financial institutions.
So that's what I can say. Thank you very much.