AI-Augmented QA: Leveraging AI to Match the Speed of Modern Development

Introduction

Who I Am and Where I’ve Worked

Yeah, so today I will speak about an AI-augmented QA approach, which I designed, and which helped me to match the speed of modern development.

My name is Alexey.

I have worked in QA for more than 30 years.

I've built several QA departments from scratch.

I used to work in big enterprise-sized companies like RingCentral, and currently I'm working as a solo QA in an actively growing startup with more than 10 developers.

And for the last three years I have had a second full-time job as a dad, and this turned out to be much harder than a regular one.

So shout out to all parents.

The Documentation Challenge in Startups

So as I said, I have worked in companies of different scale, and almost everywhere I faced the challenge of keeping test documentation in good shape. But it is most challenging in a startup.

Why?

It's because a startup usually lacks resources: in my experience, at most one QA, or none. In a few cases I happened to be the first one.

Why Documentation Falls Behind

Everyone is overwhelmed with work and rapid changes: a startup can pivot in a whole new direction or change features, and delivery is always prioritized over documentation and processes. From the QA perspective, testing and fixing bugs also take priority over documentation, and writing or updating documentation usually seems like optional overhead.

Consequences of Poor Test Documentation

As a result, it's hard to plan your testing.

It's hard to plan your regression.

You have more bugs.

You can lose context over time.

So we often ask ourselves why we developed something, and nobody remembers. And it's hard to track your automation coverage because you don't have a list of all requirements and features anywhere.

From Tester to Process Overseer

So how did we solve this at Edge?

We decided to be not so much a tester, but an overseer of the process: writing testing notes for developers and testing guides, doing trainings, sharing links, etc.

But this also turned out to be challenging in a startup, because everything changes too fast. You can't keep the pace.

Why Use LLMs for QA Documentation

So this challenge seemed to be a good candidate for an LLM, because LLMs are good at working with text, and test documentation is basically text.

So to solve this, I've designed an approach.

I believe everyone can implement it in their projects or companies.

The RAG-Augmented QA System

So in my case, I've designed an application with a back-end, a front-end dashboard, and RAG. If someone doesn't know, RAG is retrieval-augmented generation: a vector database paired with an LLM, which helps you get relevant context for LLM requests.
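
To make the retrieval part concrete, here is a minimal sketch of that step. It assumes Chroma as the vector database and the OpenAI SDK as the LLM client; the collection name, model, and prompts are illustrative, not the exact ones from my system.

    # Minimal RAG sketch: fetch related documentation chunks, then pass them
    # to the LLM as context. Chroma and OpenAI are assumptions for illustration.
    import chromadb
    from openai import OpenAI

    chroma = chromadb.Client()
    docs = chroma.get_or_create_collection("test_docs")  # test documentation chunks
    llm = OpenAI()

    def testing_notes(requirements: str) -> str:
        # Retrieve the chunks most similar to the ticket requirements.
        hits = docs.query(query_texts=[requirements], n_results=5)
        context = "\n\n".join(hits["documents"][0])
        # Ask the LLM for testing notes grounded in that context.
        resp = llm.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                {"role": "system",
                 "content": "You write testing notes for developers. "
                            "Rely only on the provided test documentation."},
                {"role": "user",
                 "content": f"Context:\n{context}\n\nRequirements:\n{requirements}"},
            ],
        )
        return resp.choices[0].message.content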

Sources, Control, and Governance

My source of requirements is Jira, and I thought a lot about how to keep control over everything, so as not to miss hallucinations and such. That's why I built a dashboard where I approve the testing notes. I was also thinking about how to keep control over test documentation changes.

And the solution was right on the surface: use GitHub as the storage for test documentation. Each change to the test documentation goes through a pull request, which you can review, approve, edit, or decline.

End-to-End Flow: From Jira to Testing Notes

So in my flow, when a ticket with requirements moves to In Progress, it means the requirements are complete. Usually, not always. A webhook goes to my application, which gets chunks from the vector database, goes to the LLM, and generates testing notes for the developer, which, after my approval, are posted as a comment to Jira. So while working on a task, the developer can always check what to think about during development.
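
In code, the entry point of that flow could look roughly like this: a sketch assuming Flask and the Jira Cloud REST API v2, reusing the testing_notes() helper from the RAG sketch above. The dashboard approval step between generation and posting is left out for brevity.

    # Sketch of the Jira side of the flow (Flask + Jira Cloud REST API v2).
    # testing_notes() is the retrieval helper sketched earlier; the manual
    # approval step that sits between generation and posting is omitted.
    import requests
    from flask import Flask, request

    app = Flask(__name__)
    JIRA = "https://your-company.atlassian.net"   # illustrative base URL
    AUTH = ("bot@your-company.com", "api-token")  # illustrative credentials

    @app.route("/jira-webhook", methods=["POST"])
    def on_issue_updated():
        event = request.get_json()
        issue = event["issue"]
        if issue["fields"]["status"]["name"] == "In Progress":
            notes = testing_notes(issue["fields"].get("description") or "")
            # Post the (approved) notes back to the ticket as a comment.
            requests.post(
                f"{JIRA}/rest/api/2/issue/{issue['key']}/comment",
                json={"body": notes},
                auth=AUTH,
            )
        return "", 204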

Automated Pull Requests and Documentation Updates

It also creates a pull request in GitHub with changes to the documentation. This pull request is also produced by the LLM: it analyzes the existing documentation and the testing notes, and decides which files need to change and what to change in the corresponding test documentation.
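
A sketch of that step with PyGithub, assuming the LLM has already returned the target file path and the updated content; the repository name, branch naming, and token are placeholders.

    # Sketch of opening the documentation pull request with PyGithub.
    # Assumes the LLM already decided which file to change and produced the
    # new content; repo name, branch name, and token are placeholders.
    from github import Github

    repo = Github("github-token").get_repo("your-org/test-docs")

    def open_docs_pr(issue_key: str, path: str, new_content: str):
        base = repo.get_branch("main")
        branch = f"docs/{issue_key.lower()}"
        repo.create_git_ref(ref=f"refs/heads/{branch}", sha=base.commit.sha)
        current = repo.get_contents(path, ref=branch)
        repo.update_file(path, f"{issue_key}: update test docs",
                         new_content, current.sha, branch=branch)
        return repo.create_pull(
            title=f"{issue_key}: test documentation update",
            body="Generated from approved testing notes; review before merging.",
            head=branch, base="main",
        )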

Demo Walkthrough

So I recorded a small demo in advance, because some API calls to the LLM take a while.

Yeah, so here is my test Jira project, with a task containing requirements to add a task assignment feature to the test flow API.

So I'm moving it to the next step... wait, I'm sorry, I'll go back. Yeah. I'm moving it to the next status.

A webhook goes to my dashboard, where you can see the new task created. I'm checking it, and the testing notes are already generated. So I'm looking over the testing notes: here are new test cases created by the LLM, some things for the developer to consider, and regression areas that could be affected. I'm okay with those notes, so I'm approving the draft.

Before Documentation Exists

To show the whole process, I will go to my RAG database and ask how to assign tasks to a valid person. There is no test documentation about this so far. So in retrieval, I will only get unrelated chunks with low scores, which would be used as the context for the LLM.

Creating and Approving PRs

So the LLM is generating and creating the pull request, and there it is. As we can see, it has created a pull request with new test cases in two files, one for API and one for web.

I am approving this pull request, and on approval, GitHub triggers a webhook to my application with information about this pull request and which files were affected, and my application re-ingests all these changed files to update the test documentation.
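
Roughly, the re-ingestion looks like this: a sketch reusing the Flask app, repository, and collection from the earlier sketches. I trigger on approval; the sketch uses the merged pull request event, which carries the same information about affected files.

    # Sketch of re-ingestion: a GitHub webhook reports which files the pull
    # request touched, and only those files are re-embedded into the vector
    # database. Reuses app, repo, and docs from the earlier sketches.
    @app.route("/github-webhook", methods=["POST"])
    def on_pr_merged():
        event = request.get_json()
        pr_info = event.get("pull_request") or {}
        if event.get("action") == "closed" and pr_info.get("merged"):
            pr = repo.get_pull(pr_info["number"])
            for f in pr.get_files():  # files changed in the pull request
                text = repo.get_contents(f.filename, ref="main").decoded_content.decode()
                reingest(f.filename, text)
        return "", 204

    def reingest(path: str, text: str):
        docs.delete(where={"source": path})             # drop stale chunks for this file
        for i, chunk in enumerate(text.split("\n\n")):  # naive paragraph chunking
            docs.add(ids=[f"{path}:{i}"], documents=[chunk],
                     metadatas=[{"source": path}])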

Improved Retrieval After Merge

So now, when I ask how to assign a task to a user, I get a different result: much higher scores and related chunks from the task assignment feature. And in the future, when I have a ticket about changes to the task assignment feature, it will give me the correct context and correct testing notes. So the cycle is finished: we are getting up-to-date test documentation automatically.

Key Lessons Learned

So, some takeaways I've learned. The first one, not on the list: this approach already works in production, and it really helps. Second, 20% of integration effort brings 80% of the results.

Apply the 80/20 Rule and Avoid Over-Engineering

At the beginning I got stuck in development, trying to make this tool smart enough to predict all corner cases, and this led to overcomplicating the tool; a lot of bugs started to pop up. So sometimes it's better to keep some places under your own control: for example, if some case occurs in five percent of situations, you can handle it yourself rather than overcomplicate your system and get an unreliable result.

Iterative Prompt Refinement

The second takeaway is incremental prompt improvement. I started with simple prompts and used the following technique to improve them with each iteration.

Every time I saw some deviation in the LLM's response, I extracted that behavioral pattern and turned it into a guideline on what to do and what not to do in the prompt. So the next time the LLM gets its instructions, it gets a strict instruction not to do it that way but to do it this other way, and this helps. The first 20 times it was bad, but then it started to work well.
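
In practice this can be as simple as a list of rules that grows with every observed deviation and gets appended to the system prompt; the rules below are made-up examples of the kind of guidelines I mean.

    # Sketch of incremental prompt hardening: each observed deviation becomes
    # an explicit do/don't rule appended to the system prompt on the next run.
    # The rules here are illustrative examples, not my actual prompt.
    GUIDELINES = [
        "Do not invent test cases for features not mentioned in the requirements.",
        "Always list the affected regression areas, even if the list is short.",
        "Keep each test case under five steps.",
    ]

    def build_system_prompt() -> str:
        rules = "\n".join(f"- {rule}" for rule in GUIDELINES)
        return ("You write testing notes for developers.\n"
                "Strict guidelines learned from past runs:\n" + rules)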

Next, test documentation.

Design Test Docs for RAG and LLM Consumption

Test documentation used to be written for humans, for testers. But now you also need to think about how it will be read by an LLM or a RAG system. So it should be designed to be RAG- and LLM-friendly.

For example, you should consider the chunk size. A vector database stores information in chunks, so you need to design your documents so that they can be divided into small but meaningful chunks, and you don't end up in a situation where a huge test case gets split between two chunks. This was a challenge: at the beginning the chunks were awful; now they're still not great, but there is room for improvement here.
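
One way to keep test cases whole is to chunk on structural boundaries instead of a fixed size. A minimal sketch, assuming the test docs are Markdown with one "### " heading per test case:

    # Sketch of RAG-friendly chunking: split on test case headings so every
    # chunk is one complete test case, never a fragment shared by two chunks.
    # Assumes Markdown docs with one "### " heading per test case.
    import re

    def chunk_by_test_case(markdown: str) -> list[str]:
        # The lookahead keeps each "### " heading attached to its body.
        parts = re.split(r"(?m)^(?=### )", markdown)
        return [p.strip() for p in parts if p.strip()]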

And the last takeaway is also related.

Use Metadata and Tags to Improve Retrieval

A chunk can be not only a piece of text; it can also carry metadata, different tags. For example, if a chunk is a test case, you can add a priority tag, or a feature tag for the feature it belongs to. So, for example, you can have an API test case and a web test case belonging to the same feature, as we saw with task assignment: give both a task-assignment feature tag, and the RAG system will do better retrieval because it will know that those chunks are related.
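
With Chroma, attaching and using such tags could look like this; the documents, tag values, and query are illustrative.

    # Sketch of tagging chunks with metadata so retrieval can relate them.
    # The feature and priority tags follow the talk's example; all values
    # are illustrative.
    docs.add(
        ids=["api-42", "web-17"],
        documents=[
            "### Assign a task via API\nPOST the assignee to the task endpoint.",
            "### Assign a task in the web UI\nOpen the task and pick an assignee.",
        ],
        metadatas=[
            {"feature": "task-assignment", "priority": "high", "layer": "api"},
            {"feature": "task-assignment", "priority": "high", "layer": "web"},
        ],
    )

    # Later, retrieval can be scoped to the feature a ticket touches.
    hits = docs.query(query_texts=["how to assign a task to a user"],
                      n_results=5, where={"feature": "task-assignment"})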

And yeah, that's pretty much it.

Wrapping Up

I've tried to design this to be lightweight, and it can be made even simpler: you can use n8n or other tools; there are plenty of them on the market now. I built it myself because I wanted to.

And yeah, that's pretty much it.

Thank you for your attention.

A Nudge to Get Started

And if you don't have test documentation now, maybe this is a sign to start writing it. It will help you in the future.

Q&A

Are there any questions?
