Today I'm going to cover what I've learned over the last 10 to 15 years of working in the natural language processing space, now called AI because it's buzzword bingo, and specifically LLMs and agentic workflows.
I'm going to cover two main topics. Fundamentals, which have nothing to do with technology really, more just good business cases.
And I'm going to touch a little on the pains we're probably all experiencing when working with language models. And of course, where Neo4j is starting to help with a lot of that.
First, most important question, why should you listen to this lanky soul?
Again, I've been working in the AI space for over 15 years now. Before Neo4j, I worked at NatWest Group. For four years, I was the product owner for machine learning, so I worked on the bank's chatbot, Cora.
And my main task there was to make Cora more intelligent. Now, let's not open the can of worms about what intelligent means, but generally it was: answer more questions, be more personalised, use more data. I had a personal ambition of wanting Cora to be able to provide financial guidance.
My pitch at that time was: what if we could provide financial guidance to everyone, not just those who can afford it? That was my mission, and it still is to this day. We're still nowhere near it, but we're definitely getting there.
And long story short, after those four years of research, all paths led to graphs. That was the simple summary. I've joined Neo4j to spread the word of the graph gospel.
So hands up really quickly, who hasn't heard of graph databases? And who hasn't heard of Neo4j before? Cool, good.
Right, Neo4j is a graph database.
We're gonna go into detail in a little bit about what that means and how that's different. We've been around 17 years, and we work on the philosophy of connecting data to provide new insights and value beyond relational databases.
We work with 98% of the S&P 500.
In the UK alone, I work with all the banks. The most popular traditional use cases are things like fraud detection, so detecting fraud rings, and PII detection.
And the most popular one popping up recently is something called GraphRAG, which I'll touch on in a second.
But again, I'm not going to go into too much detail on LLMs themselves or agentic workflows, which is a whole can of worms in itself.
You know, there's KV caching, there's context window management. I assume that's not what the audience is here for today.
So I'm going to cover, like I said, two high-level areas. And for the first one,
I kind of want to just provide the fundamental advice of getting the basics right. So if I had to give one bit of advice, one takeaway, it's to get your strong foundation set. And that's usually about asking some simple questions of the use case.
If I had to give you a stat, I would say nine out of ten clients I work with... sorry, first I should explain: currently I'm a solutions engineer. The analogy I give is that I'm a bit like a Pokemon. Once the client is qualified and it's worth them using the tool, I get thrown in to basically make the project a success.
My sole purpose is: they've got a use case, they've got a path to revenue, they need technical help to deploy. And specifically, I've got a finance background, so it's usually companies like NatWest and Lloyds where I work.
Nine out of ten clients I work with don't have all of this ticked. Parts of it, patches of it, but as I'm sure we've all experienced, especially with large organisations, politics and department structure mean that people are usually the main blocker, more than the technology.
Specifically: what are the gaps in the current technology, and how is Neo4j going to solve them? What are the existing technologies, where are the gaps,
and how do you solve the problem today?
And we're gonna touch on RAG in a minute as the main thing we're finding a big gap in the market for.
With language models and chatbots, which is the primary use case we're talking about here, I assume, you'd be amazed how often no one has a clear set of questions and answers.
How do we plan to evaluate and benchmark the new system you're going to put in?
Especially in law, in finance, in a lot of regulated industries, you can't afford even 98% accuracy. It needs to be 100, and you need to be able to benchmark how you measure there.
And especially with the fast-moving pace of AI, what happens if you put a new model in? That might change the results of your system entirely.
So I kind of start here with the clients: right, what are your questions and answers, and how do you plan to test? That ties into success and value metrics.
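To make that concrete, here's a toy sketch of what a Q&A benchmark loop could look like. The `ask_chatbot` function and the Q&A pairs are made up stand-ins for your actual system and test set; real evaluations would also score semantic similarity rather than exact matches.

```python
# Minimal sketch of benchmarking a chatbot against a fixed Q&A set.
# `ask_chatbot` is a hypothetical stand-in for your real pipeline.

def ask_chatbot(question: str) -> str:
    # Placeholder: in practice this calls your LLM / RAG system.
    canned = {"What is my overdraft limit?": "Your overdraft limit is 500 GBP."}
    return canned.get(question, "I don't know.")

def benchmark(qa_pairs: list[tuple[str, str]]) -> float:
    """Return the fraction of questions answered exactly right."""
    correct = sum(1 for q, expected in qa_pairs if ask_chatbot(q) == expected)
    return correct / len(qa_pairs)

qa_set = [
    ("What is my overdraft limit?", "Your overdraft limit is 500 GBP."),
    ("How do I report fraud?", "Call the fraud line."),
]
print(benchmark(qa_set))  # exact-match is crude, but it's a start
```

The point is less the scoring method and more that the Q&A set exists at all: with it, you can re-run the same benchmark every time you swap in a new model and see whether results moved.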
The cliche with engineers is we like the shiny technology, but we should really focus first on make sure you've got a business case, make sure that everyone's aligned. It can be frustrating, but it's true.
And then, of course, you've got some more technical functional and non-functional requirements like accuracy and latency of the system.
I'm sure we're all experiencing, with the new ChatGPT models, things like thinking mode, or if you have complex multi-agent architectures, which I can touch on in a minute. It adds a lot of latency, so we need to figure out if that's worth the cost of the compute calls and whatnot.
And then lastly, once you've kind of gone through these hurdles, then we can start talking about engineering, of which, of all of this, I'll just say the data model is the most important part. And I'll show you data models in a second.
I'm sure we've all seen on LinkedIn and in the news the various failure rates quoted for GenAI pilots. MIT says 95%, another study says 71%.
There's definitely an appetite, there's definitely hunger, there's definitely results, no doubt about it, but we can all agree it's hard. And data management, we believe, is one of the largest parts of the issue.
So it's kind of use case and benchmarking and definitions and being really clear that it's worthwhile before you even pull the trigger.
Once they get past that first hurdle of the foundations, it's then: right, how's your data? I remember at NatWest, I originally proposed a project to essentially try and solve some of the data engineering and infrastructure issues. It got no traction.
I slapped "chatbot" on it, and it got funded. So, to this day, we're hitting the same problems.
It is a data engineering issue. Data is unorganised. It's not connected. It's not clean.
It's hard to find. How can you expect AI to use your data if we can't use it ourselves?
So, okay, what makes AI data ready? Context, flexibility, and standardization.
At the moment, our organizations have tens, hundreds of data silos, right? And the equivalent would be like me asking one of you to answer a question with only your frontal lobe. You're not going to be able to do it.
You need all of your data in your mind, in your organisation, to really start to answer more complex questions.
And so this is where knowledge graphs come in.
So traditionally, you have a relational database: SQL, Postgres, Excel sheets. Believe it or not, a lot of organisations run on Excel sheets as a database.
You have rows and columns, and then you have joins between these tables.
We work fundamentally differently, with something called a property graph, where we store data as nodes and relationships, and within those nodes and relationships, properties.
So you've got a person named Terry, relationships like brother of or sister of, and all of a sudden you see why I keep talking about connections and relationships. It's this nature that allows unique and new ways of extracting information from your data.
We do something called traversing on these nodes and I can touch on that later. There we go.
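To give a rough feel for what nodes, relationships, and traversal mean, here's a toy in-memory version. The names and relationship types are invented for illustration; in Neo4j itself you'd express the same idea declaratively in Cypher rather than writing the traversal by hand.

```python
# Toy property graph: nodes and relationships, both able to carry properties.
# Purely illustrative; a real graph database indexes and traverses this for you.

nodes = {
    1: {"label": "Person", "name": "Terry"},
    2: {"label": "Person", "name": "Dan"},
    3: {"label": "Person", "name": "Sam"},
}
rels = [
    {"from": 1, "to": 2, "type": "BROTHER_OF"},
    {"from": 2, "to": 3, "type": "SISTER_OF"},
]

def traverse(start: int, rel_type: str) -> list[str]:
    """Follow relationships of one type out of a start node."""
    return [nodes[r["to"]]["name"] for r in rels
            if r["from"] == start and r["type"] == rel_type]

print(traverse(1, "BROTHER_OF"))  # ['Dan']
```

The equivalent Cypher would read along the lines of `MATCH (:Person {name: 'Terry'})-[:BROTHER_OF]->(p) RETURN p.name`: you describe the pattern, and the database does the traversal.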
Now, if you imagine not just within a database itself, you now start connecting all your data together to get more context. Imagine now across your whole organisation, you can have 10 different databases of those and you connect all together.
Context is key. It's a phrase being used a lot at the moment.
What starts to happen when you start pairing your transactions with your customers' data and their information about their financial goals and so forth?
Which leads me on to something called GraphRAG. I'll touch on RAG in a second.
But essentially what's been demonstrated is when you start using these retrieval techniques with graphs, we get a substantial increase in performance. And this is where AI agents come in.
So at the moment, RAG, retrieval-augmented generation, is essentially where we take documents, for example, or even structured data, and we embed it, we give it a contextual understanding, and then we ask language models to go retrieve it. To no surprise, it's not very good, because these chunks of text are retrieved in isolation, when really your data always has extra context that's needed. So RAG is what's currently happening at the moment.
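Stripped to its core, that retrieval step looks something like this. The chunks are invented, and simple word overlap stands in for real vector embeddings, but the shape is the same: score chunks against the question, then hand the winner to the LLM as context.

```python
# Bare-bones RAG retrieval: pick the chunk most similar to the question
# and feed it to the LLM as context. Word overlap stands in for embeddings.

chunks = [
    "The contract terminates on 31 December 2025.",
    "Either party may give 30 days written notice.",
]

def retrieve(question: str) -> str:
    q = set(question.lower().split())
    return max(chunks, key=lambda c: len(q & set(c.lower().split())))

context = retrieve("When does the contract terminate?")
prompt = f"Answer using this context:\n{context}\n\nQuestion: ..."
```

Notice the limitation baked into the design: whatever the LLM sees is one isolated chunk. Anything connected to that chunk but stored elsewhere never makes it into the prompt.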
By coincidence, this is a legal example.
But RAG does well if the answer you're looking for stays within the chunk. As soon as you start asking what we call global questions, like find all employees in the area, or how am I related to Dan and how is Dan related to...
You need to traverse all of your data; it all needs to be connected in order to answer those sorts of things. And when you start doing this, this is why we're seeing the performance of these chatbots spike: because they're getting context, because you can create a map of your data, like we showed earlier, so they know where to find things. And then, because richer information is retrieved into the chatbot, it's giving better quality answers as well.
For GenAI, we've got three use cases: text generation, RAG, and then agentic.
RAG, like I said, retrieval-augmented generation: a customer asks a question, it goes into the LLM, the LLM uses a tool, a vector index, and then outputs the answer.
Then you've got agentic workflows, where you have your LLM as an agent in the middle. Ignore the planning and the memory, that's kind of advanced stuff, but you've got your tools.
This would be like your vector database or your current database. Neo4j sits there, and the agent decides to use that tool to retrieve data.
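The agent loop itself can be sketched in a few lines. Here a hard-coded rule stands in for the LLM's tool-choosing step, and both tools return canned strings; in a real system the model itself decides which tool to call and the tools hit actual databases.

```python
# Toy agent loop: a rule-based stand-in decides which tool to call.
# In a real agentic workflow, the LLM makes this decision.

def vector_search(question: str) -> str:
    return "top matching text chunk"       # stand-in for a vector lookup

def graph_query(question: str) -> str:
    return "rows returned from the graph"  # stand-in for a Cypher query

TOOLS = {"vector_search": vector_search, "graph_query": graph_query}

def choose_tool(question: str) -> str:
    # Stand-in policy: relationship-style questions go to the graph.
    return "graph_query" if "related" in question else "vector_search"

def agent(question: str) -> str:
    tool = choose_tool(question)
    return TOOLS[tool](question)

print(agent("How is Terry related to Dan?"))  # routed to graph_query
```

Swapping the graph in as one of those tools is the whole architectural change: the loop stays the same, the agent just gains a tool that can traverse connections.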
You've got knowledge graphs. Skip that.
We're fast. I don't have time to cover that.
This is a more specific example of how current RAG systems work, where you just have a chunk of text. And now, as you can see with this example, this is what enables it.
These are called entities. So we do the same processing step as vector RAG currently does, but with an extra entity extraction step on top, where we add all this richness, and this is what allows all this extra performance, basically.
So now you can start asking questions. What CEO was mentioned in the news and what was their company report? Bang, there it is.
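Here's a rough sketch of that extraction step with invented chunks and names. A crude regex stands in for the extractor; real pipelines use an LLM or NER model. Once entities are linked back to the chunks that mention them, a question can hop from a news mention to the related report.

```python
# Sketch of the entity extraction step on top of vector RAG: pull entities
# out of each chunk and index them, so questions can hop between chunks.
import re

chunks = {
    "news_1": "CEO Jane Doe announced record growth.",
    "report_1": "Jane Doe signed the Acme Corp annual report.",
}

def extract_entities(text: str) -> list[str]:
    # Crude extractor: capitalised two-word names. An LLM or NER model
    # does this job in a real pipeline.
    return re.findall(r"\b([A-Z][a-z]+ [A-Z][a-z]+)\b", text)

# Build an entity -> chunks index: these are the graph's edges.
index: dict[str, list[str]] = {}
for chunk_id, text in chunks.items():
    for entity in extract_entities(text):
        index.setdefault(entity, []).append(chunk_id)

# "Which CEO was in the news, and what report are they linked to?"
ceo = extract_entities(chunks["news_1"])[0]
print(ceo, "->", index[ceo])  # Jane Doe -> ['news_1', 'report_1']
```

The architecture barely changes: ingestion gains one extraction pass, and retrieval gains the ability to follow those links.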
So you don't really change your architecture. You just swap Neo4j out and now you've got this supercharged capability.
I'm going to throw in a wild card, to Rian's dismay.
At the end, I know we have drinks, but if you'd like to see me try, in real time, to build Ashford their own GraphRAG chatbot in another 10 minutes, raise your hand. All right, the people have spoken.
We even gave feedback about actionable steps for the community.
So if you scan this QR code, this is a video of what I'm gonna show you later. You can do this yourself in under 10 minutes for free, where you build your own GraphRAG chatbot, and you'll be amazed at some of the stuff it can do.
And I can show more examples later as a bonus round, if you're willing. All good? Yes. I can bring it up later.
And things to get you started, I can show you later again, but LLM Graph Builder, it builds the graph for you with AI. No code needed.
If you want to get started with just RAG, or to dive deep into this, we have our own GraphAcademy, and we have this for free on Aura to get started.
And these slides will be sent after.
A minute after, a bit early. Perfect. Any questions?