Towards Green ML-Enabled Systems: A Software Architecture Perspective

Introduction

Hi everyone, my name is Justus. Happy to be here.

I'm an assistant professor at the VU, the Vrije Universiteit. I think that makes me the theoretical speaker for today, but let's see.

So I want to talk to you today about green machine learning enabled systems. And I will take a software architecture perspective on this. I will explain what I mean with this in a minute.

Background

First, a few words about myself. So I joined the VU last year in August.

Before that, I was a postdoc in Germany at the University of Stuttgart, where I led a division called Software Engineering for AI- and Microservice-based Systems. The latter part of that name is due to my PhD being on microservices, so it was an architecture-centric topic.

But as you might imagine, that's not why I'm here today.

My research interests are pretty broad, but usually I'm focusing on software architecture. I'm focusing on some quality attributes that we would characterize as internal software quality.

One of those is environmental sustainability. I will talk more about this later.

And then these days I'm very focused on AI-based systems. So I'm not an AI researcher or machine learning researcher. I'm a software engineering researcher. So I try to take a system-centric perspective on this, not just a model-centric one.

Software architecture

So since I will be talking about software architecture, let's first revisit a bit what that is to bring everyone on the same page. So I won't bore you too much with a lot of definitions, but I briefly wanted to give you two perspectives on how people see this.

So one definition is from a fairly classical book on software architecture, called Software Architecture in Practice. And this is a very structure-oriented definition. So Bass and colleagues define software architecture as the set of structures that you need to reason about a system.

So what are these structures? So there are software elements, a very generic term for any element that you can have in a software system, then relations among them, so how are they connected to each other, and then properties of both.

However, this is kind of the old view of software architecture, and a more modern view is more rationale-oriented, so a perspective focused on decisions. So here's a definition from Jansen and Bosch, and for them, software architecture is the composition of a set of architectural design decisions.

So one reason why AI engineering, so building products with AI and machine learning, is still so difficult is that reusable architectural knowledge in this space is still in its infancy. So it's just starting to build up.

We have a lot of architectural knowledge for traditional quote unquote systems, but not so much for machine learning and AI. So I guess we are out of the infancy here, but yeah, we haven't made it out of adolescence.

So a lot of knowledge is still building up, and we need a lot of questions answered.

Okay, so now we revisited what software architecture is.

Green AI

I want to talk to you now about green AI by taking a software architecture perspective. So we'll talk about the environmental sustainability of these systems. And the results of this talk are mostly based on a paper that we published this year at the International Conference on Software Engineering, in the Software Engineering in Society track. So it's with colleagues from the VU, but also from the United States, from the Carnegie Mellon Software Engineering Institute.

So I probably don't need to convince you that machine learning consumes a lot of energy. You may have heard some of these statistics about how much more energy and water you consume when you use ChatGPT to answer a simple query that a Google search could also answer.

What I always find compelling is the statistic, which you might also have heard, that training a typical transformer-based natural language processing model leads to greenhouse gas emissions comparable to those of five cars over their lifetimes.

And what I find always crazy about this statistic is that it's from 2019. So it's way before generative AI and ChatGPT took off. So you can imagine how the playing field has changed during all these years.

So of course, when you think about machine learning, the most energy-intensive part is the training, especially when we talk about generative AI and large language models. This is typically what consumes a ton of energy, because you simply use a lot of data and it takes a lot of time; it takes weeks, for example, to train some large language models. And this is just very energy-intensive.

However, the inference part, so actually using the model at scale, can also be problematic. So I always like this figure here. It's also a bit outdated by now; it's from 2022, so they probably used 2021 data, which means that this 1.5% figure is out the window. Unfortunately, this bar is also a lot more blue by now, so AI is responsible for much more energy consumption and carbon emissions these days.

But I always like the lower part of the figure. So on the right side, you can see the initial development, where the ML engineering teams come up with the initial model. That consumes a lot of energy per instance, but it doesn't happen so often, so it's infrequent to rare. Then you might have the retraining when the system is actually in production; that is a bit more frequent, but it also doesn't consume so much energy. And then you have the inference, which per instance consumes the least energy, but you can imagine that if it happens at internet scale, it can still be a problem. So inference, especially with large models, consumes much more energy than a deterministic, traditional software algorithm would.
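To make this more tangible, here is a minimal Python sketch of how you could measure the footprint of training versus inference yourself. It uses the CodeCarbon library, which is not mentioned in the talk but is one common option; the model, the dataset, and the 10,000-prediction loop are placeholder assumptions.

```python
# Minimal sketch: estimating the carbon footprint of training vs. inference.
# CodeCarbon is one option for this; model and data are illustrative placeholders.
from codecarbon import EmissionsTracker
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier

X, y = load_digits(return_X_y=True)
model = RandomForestClassifier(n_estimators=200)

tracker = EmissionsTracker(project_name="training")
tracker.start()
model.fit(X, y)                      # one expensive training run
training_emissions = tracker.stop()  # estimated kg CO2-equivalent

tracker = EmissionsTracker(project_name="inference")
tracker.start()
for _ in range(10_000):              # cheap per call, but it adds up at scale
    model.predict(X[:1])
inference_emissions = tracker.stop()

print(f"Training:  {training_emissions:.6f} kg CO2eq")
print(f"Inference: {inference_emissions:.6f} kg CO2eq (10,000 single predictions)")
```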

So yeah, at scale with millions of concurrent users, this is still a problem. And as you might imagine, this current AI hype and boom kind of puts the climate goals that we have as nations but also as companies in peril.

So here's, for example, Microsoft. They had the lofty goal of being carbon negative, or at least carbon neutral, by 2030. They announced this in 2019, I think.

And as you can see, they're actually going in the opposite direction. So from 2020 to 2023, their carbon emissions increased by roughly 30%. I don't even want to know how the data point for 2024 would look, if it even still fits on this chart. But yeah, it's not going well.

They are not the only ones, of course. Here's Google, for example; it's the same story. They also wanted to be carbon neutral by 2030, and instead their emissions have increased by roughly 50% since 2019. So yeah, it doesn't look good.

Red AI vs Green AI

So now you might ask, is green AI coming to the rescue here to save us? So green AI goes back to a distinction from Schwartz et al., who differentiate between red AI and green AI.

So red AI, for them, is AI research and practice that only hones in on accuracy, so on prediction quality, and spends considerable computational power and resources just to improve the last percentage points of accuracy, without any concern for efficiency or cost.

On the other hand, green AI still tries to get sufficient results with respect to accuracy. So this is still important. But now efficiency and computational costs also enter the picture. So now you try to reduce the resources that you have to spend for getting a certain level of accuracy.

So you try to promote approaches that give you a favorable trade-off with respect to accuracy and energy consumption. Maybe you sacrifice a little accuracy, but you gain a tremendous reduction in energy consumption. So this is the idea here.

So the good news in this area is that this kind of research, so green AI research, is definitely making progress. So a recent review identified close to 100 papers on this. By now, this has probably doubled.

So yeah, we are identifying techniques that can help. So this is good. However, in practice, it is still kind of difficult to apply these techniques.

First, because they are kind of scattered across several scientific papers. And as you might imagine, in scientific papers, the techniques are not really in an actionable form. They are difficult to apply without considerable knowledge, especially for companies. So what we wanted to do is, with our research, we wanted to gather these techniques and make them more actionable so that people can get a quick overview of how to make their systems more environmentally sustainable.

How did we do this? We used the medium of architectural tactics. Architectural tactics are a concept from software architecture, and they refer to high-level design decisions that improve a certain quality attribute.

In our case, we were focused on green tactics. These are tactics that improve environmental sustainability for a certain class of systems, namely machine learning enabled systems.

So how did we find these tactics? So I won't bore you with the details, but just to mention there were two parts.

The first one was literature based. So we gathered literature on green AI, made a deep dive into it and extracted techniques that we think might be reusable. We categorized everything and refined it into an actionable catalog, and then we took it into a focus group.

So we recruited three experts on machine learning-enabled systems, discussed the catalog with them, presented everything, they made suggestions, we refined it together, and then the result was a catalog of 30 tactics for machine learning-enabled systems.

So before I present the catalog, I want to talk to you briefly about the best green tactic of all. Maybe you already suspect what I'm on about, but the best green tactic is actually only using AI when it makes sense to use AI, especially these days, when companies throw chatbots at everything and there's considerable hype.

I think it's very important to really stop for a moment and think, is what I'm using AI here for actually a good use of AI? And there are some guidelines that you can use for this.

So there's a great book by Geoff Hulten called Building Intelligent Systems. And he talks a bit about the right problems for AI.

So one good problem for AI is if you have a very large problem space. So you have a massive scale of options to choose from, for example. Or you might even have an open-ended problem space that is continuously growing, like books, movies, and so on.

So these are good cases for AI. Then there are some problems that are time changing, where a previously valid solution might get invalidated over time, like stock price predictions or ad placement. So these are also good use cases for AI because they are very complex.

And then lastly, there are some problems that are just intrinsically very hard. So this is, for example, producing and analyzing human text or speech, image recognition at scale, and then also some complex games like chess or Go. So all of these are kind of good problems for AI.

Then of course, when you use AI, it's also very important that imperfections and mistakes are acceptable up to a certain level, because it's unavoidable that they will happen with AI. So there will never be 100% accuracy when you use AI and machine learning, and your clients or your company need to be aware of this when making the decision to use AI. You can try to reduce hallucinations and so on from large language models, but you will never be able to completely eliminate them.

So this needs to be also taken into account. And of course, AI is also very costly.

So it's also important to check alternatives that might be better suited. AI really needs to do much better than alternatives like heuristics or humans. And also don't forget to include the costs of mistakes.
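As a rough illustration of that last point, here is a back-of-the-envelope sketch comparing an ML model against a simple heuristic once the cost of mistakes is included. Every number in it is invented purely for illustration.

```python
# Back-of-the-envelope sketch: ML model vs. simple heuristic, including the
# cost of mistakes. All numbers below are made up for illustration only.
requests_per_month = 1_000_000

ml_error_rate = 0.05           # 95% accuracy
heuristic_error_rate = 0.12    # 88% accuracy
cost_per_mistake = 0.50        # e.g. support ticket, refund, manual fix (EUR)

ml_cost_per_request = 0.002          # energy + infrastructure for ML inference
heuristic_cost_per_request = 0.0001  # a deterministic rule is far cheaper to run

def total_cost(error_rate, cost_per_request):
    mistakes = requests_per_month * error_rate
    return requests_per_month * cost_per_request + mistakes * cost_per_mistake

print(f"ML model:  {total_cost(ml_error_rate, ml_cost_per_request):,.0f} EUR/month")
print(f"Heuristic: {total_cost(heuristic_error_rate, heuristic_cost_per_request):,.0f} EUR/month")
# The ML option only pays off if its accuracy advantage outweighs the extra
# operating cost, and the gap must also cover development and retraining.
```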

Green AI tactics

Okay, so now let's talk about the catalog. So here it is.

We organized it across the lifecycle of machine learning. So there are some tactics that are fairly data-centric.

Then there are tactics that are more about the algorithm and model optimization, some about the training, and then some about the later stages, so about the deployment and so on and the management. I, of course, don't have so much time, so I will go into a few examples to make this a bit more concrete.

So one very effective tactic is called apply sampling techniques. As you might imagine, the more data you use for training your model, the more energy it will consume, and also during inference, because the model will be larger. So if you can use a clever sampling technique to reduce the size of your training data set, you can reduce the energy it consumes considerably, and if you do it in a clever way, accuracy will not be harmed, at least not substantially. There are some very nice sampling techniques you can use that make this tactic very effective: oftentimes you don't lose substantial accuracy but gain tremendous energy savings.
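Here is a minimal sketch of this tactic in Python with scikit-learn. The Covertype dataset, the random forest model, and the 10% sampling ratio are illustrative assumptions, not part of the catalog.

```python
# Minimal sketch: stratified subsampling of the training data (scikit-learn).
# The Covertype dataset, the model, and the 10% ratio are illustrative placeholders.
from sklearn.datasets import fetch_covtype
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = fetch_covtype(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Keep only 10% of the training data while preserving class proportions.
X_small, _, y_small, _ = train_test_split(
    X_train, y_train, train_size=0.1, stratify=y_train, random_state=42
)

model = RandomForestClassifier(n_estimators=100, random_state=42, n_jobs=-1)
model.fit(X_small, y_small)  # trains on ~10% of the data -> far less compute
print("Accuracy on held-out data:", model.score(X_test, y_test))
```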

Another tactic from this data-centric space is called reduce the number of data features. Similarly, the more features you use when training your machine learning model, the more energy it will consume. However, in many cases, especially if you have a lot of data features available, not all of them will improve accuracy substantially. Features that don't contribute substantially are called epsilon features, and you should remove them. This of course saves energy, so it can also be a very effective tactic.
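A minimal sketch of this tactic, again with scikit-learn. The synthetic dataset and the choice to keep the 20 highest-scoring features are illustrative assumptions.

```python
# Minimal sketch: drop low-contribution ("epsilon") features via univariate selection.
# The synthetic dataset and the k=20 threshold are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# 100 features, of which only 15 are actually informative.
X, y = make_classification(
    n_samples=5000, n_features=100, n_informative=15, random_state=0
)

# Keep the 20 highest-scoring features; the rest add little accuracy
# but still cost energy during training and inference.
pipeline = make_pipeline(SelectKBest(f_classif, k=20), LogisticRegression(max_iter=1000))
print("Cross-validated accuracy with 20 of 100 features:",
      cross_val_score(pipeline, X, y, cv=5).mean())
```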

There are also different kinds of algorithms that consume different amounts of energy. Some are very energy-hungry, like deep learning, for example. But there are also very efficient ones, like k-nearest neighbors, decision trees, linear models, and also Naive Bayes. So oftentimes, it's a good idea to first try one of these efficient algorithms, and only if it doesn't provide sufficient accuracy do you go to a more powerful one.
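A small sketch of this "efficient algorithms first" idea. The candidate ordering, the dataset, and the 0.85 accuracy threshold are assumptions made just for the example.

```python
# Minimal sketch: try cheap learners first and only escalate if accuracy is
# insufficient. Dataset, candidate order, and the 0.85 threshold are assumptions.
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_digits(return_X_y=True)
TARGET_ACCURACY = 0.85

# Ordered roughly from least to most energy-hungry.
candidates = [
    ("Naive Bayes", GaussianNB()),
    ("Decision tree", DecisionTreeClassifier(random_state=0)),
    ("k-NN", KNeighborsClassifier()),
    ("Logistic regression", LogisticRegression(max_iter=2000)),
    ("Neural network", MLPClassifier(max_iter=500, random_state=0)),
]

for name, model in candidates:
    accuracy = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: {accuracy:.3f}")
    if accuracy >= TARGET_ACCURACY:
        print(f"Stopping at '{name}': good enough, no need for a heavier model.")
        break
```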

And lastly, sometimes you probably don't even have to train your own model, because a lot of companies have already trained models of their own, and maybe you will find one that is well suited to your use case. So you don't have to train a new model; you can use one of those and maybe fine-tune it. That is called transfer learning. It can also save energy during training, not during inference, of course, but at least during training.
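A minimal transfer learning sketch with PyTorch and torchvision. Reusing ImageNet weights and only training a new classification head is one common way to do this; the five target classes and the optimizer settings are illustrative assumptions.

```python
# Minimal sketch: transfer learning with a pre-trained vision model (torchvision).
# We reuse ImageNet weights and only train a small classification head; the number
# of target classes (5) and the optimizer settings are illustrative assumptions.
import torch.nn as nn
import torch.optim as optim
from torchvision import models

# Load a model someone else already spent the training energy on.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the pre-trained backbone so it is not retrained.
for param in model.parameters():
    param.requires_grad = False

# Replace only the final layer to match our own task (assumed: 5 classes).
model.fc = nn.Linear(model.fc.in_features, 5)

# Only the new head's parameters are optimized -> a fraction of the training cost.
optimizer = optim.Adam(model.fc.parameters(), lr=1e-3)
# ... followed by a short fine-tuning loop on the task-specific dataset.
```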

So yeah, these were four examples of tactics. There are, of course, many more. You can find them online also. There's a repository on Zenodo where there's also a PDF that you can read conveniently.

Yeah, so one problem with the tactics is that they still require domain-specific expertise for their selection. And they are also not universally applicable; some are only situationally applicable, for example, when you use deep learning.

So we want to make the selection of these tactics also easier. It's also important that we study potential trade-offs because of course many companies won't accept substantial hits to the accuracy even if they can reduce energy consumption. So the goal is energy efficiency so that we keep the same level of accuracy but reduce energy consumption.

And lastly, right now, many of the tactics are still focused on these early stages of the machine learning lifecycle. And we definitely want to extend and refine this also towards more of the later stages so that we take the whole system into account.

Conclusion

So yeah, that's it. I hope I could give you an overview of what you can do in this space.

And I'm also happy to talk to you about this if you are interested.

Thank you.
