Do Large Language Models (LLMs) have minds like humans?

Introduction

All right, hello. Hi, everybody.

Joshua's presentation was pretty cool. I'm kind of feeling a little bit embarrassed about what mine looks like.

Speaker Introduction

But my name is Amir. I'm a senior analyst. I work in the data industry.

I'm also a professional philosopher. I have a PhD in philosophy.

Main Topic: Do Large Language Models Have Minds?

My areas of expertise are mathematical philosophy and logic, and metaphysics, and most recently I've been doing some research on the philosophy of AI, which is the topic of this talk: the question of whether or not large language models have minds. That's a very natural question, especially for people who engage with these kinds of systems pretty often.

Framework: Folk Psychology

The framework I'll be using to talk about this sort of thing is called folk psychology. We have different theories of mind in philosophy, and one of them is folk psychology, where we talk about the common-sense understanding of mind: what it means to have a mind, or beliefs, or mental states.

There are different frameworks for the question of mental states, but for the purposes of this particular talk, I'll look at the question through the lens of folk psychology.

All right, so like I said, folk psychology is a common framework that a lot of philosophers and psychologists use, but it's not the only one. And this talk is particularly based on some recent research by Simon Goldstein, a mainstream philosopher and one of my former colleagues.

Some other research is being done as well, including by myself, but that's not what we'll be exploring in this talk.

Key Features of Folk Psychology

So, to address the question: folk psychology is just one framework, but it's a pretty common one that philosophers and psychologists use.

According to folk psychology, there are two main features that any agent or system should have in order to count as having a mind. So when we want to ask whether or not large language models have minds, we first want to know what it means to have a mind.

According to folk psychology, there are two such features.

One of them is having mental representations, or representations of the external world, essentially: the agent or system should be able to form and hold images of the external world.

The other one is dispositions to act. Once you have those images of the external world, you should be able to have desires or beliefs that are caused by those mental images or representations, and to act based on them.

Mental Representations

So for this particular talk, because we don't have too much time, we will only be talking about the mental representation part.

I'll also briefly mention the other one. But what I'm going to talk about for the rest of this presentation is the following.

Research on Mental Representations

There's some research done recently by AI researchers, especially probing research, that conforms to certain philosophical theories about what it means to have mental representations. Again, we have different theories about what it means to have a mental representation. So: we started with different theories of mind.

We settled on folk psychology, or at least a brief mention of what it is and what it requires in order to consider an agent or a system as having a mind. From there, we noted that one of the key components is having mental representations, or world representations. And then: what does it mean to have world representations?

Again, there are different theories about that, but the only one I'm going to go through a little bit in this talk is called the information theory of representation, which was, and still is, a popular view about what it means to represent things in the external world. It was initially developed in the '80s and '90s by people like Dretske and Usher.

Basically, these theories are based on probabilities. According to them, a state carries specific information if it makes a condition almost certain, that is, gives it a very high probability, and the state is caused by that condition.

To illustrate that, imagine you have a thermostat equipped with a sensor that represents the room temperature as 19 degrees Celsius. Now, according to the information theories, for the thermostat's reading to carry the specific information that the room temperature is actually 19 degrees, and in that way to represent it, there must be a near certainty, a very high probability, that the room's temperature is indeed 19 degrees Celsius. So essentially what it means is that accurate representation depends on a high probability that a state, in this case the sensor's reading, is caused by the actual condition, in this case the room's temperature.
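Just to pin the idea down, here is a rough way of writing that condition; the notation is my own summary of the information-theoretic idea, not a formula from the talk.

```latex
% Rough formalization (my notation) of the thermostat example:
% a state s (the sensor reading "19 degrees") carries the information
% that p (the room is at 19 degrees), and thereby represents p, when
% p is almost certain given s, and s is caused by p.
\[
  s \text{ carries the information that } p
  \iff
  P(p \mid s) \approx 1 \ \text{ and } \ p \text{ causes } s .
\]
```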

Probing Research in AI

All right, so far we have this particular philosophical theory of what it means to have mental representations, or worldly representations. And on the other hand, there's a whole body of research being done in the AI industry itself, called probing research, or research on the interpretation of these neural networks and large language models.

And there seems to be a very close connection, as we'll see in a second, between the evidence that comes from this AI research, the probing research, and the philosophical theories we have about having mental representations. In this particular case, we've just talked about the information theories.

Probing Methodology

Anyway, to see what probing is, imagine we have a vision AI model that's trained on a whole bunch of images, including images of cats, dogs, and all kinds of things.

So let's say that's the training part: you've trained the model on all these kinds of images.

And then, after it's trained, when you present it with an image of a cat, say, it produces internal activations.

The neural network has all these layers of neurons inside it, and whatever input you feed it, in this case an image of a cat, there are going to be specific activations inside the network.

Those internal activations are essentially complex numerical patterns that capture different features of the image, such as edges, shapes, texture, and so on.

Now, so we have this initial model. For each image, it produces these internal activations.

Now, the probe classifier is a separate model that is trained on the activations caused by those images. Importantly, researchers don't feed these probing models the images themselves; they don't touch the images. They train them on the internal activations of the original model.

And then you make the probe predict what the original model is looking at, so it's reverse engineering the process in a way.


Accuracy and Implications

Those activations are temporary and are formed for whatever input you feed into the original model. Now, if the probe can reliably predict, with really high accuracy, which image the model is being presented with, based only on those activations, that would be a pretty good sign, according to the information theories of representation I went through a few minutes ago, that those internal activations are in a way representations of the external world.

In this case, the world is the world of images of, let's say, animals; that's the world we're training the model on.
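To make the setup concrete, here is a minimal sketch of what a probing experiment of this kind might look like in code. Everything in it, the placeholder model, the layer choice, the logistic-regression probe, is an illustrative assumption rather than a detail from the talk; the point is just that the probe is trained on activations, never on the images.

```python
# Minimal probing sketch: a frozen "base" vision model produces internal
# activations, and a separate probe classifier is trained on those
# activations only (never on the raw images). The base model here is an
# untrained placeholder standing in for a real trained network.
import torch
import torch.nn as nn
from sklearn.linear_model import LogisticRegression

base_model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(4), nn.Flatten(),
    nn.Linear(8 * 4 * 4, 64), nn.ReLU(),  # hidden layer whose activations we probe
    nn.Linear(64, 10),
)
base_model.eval()  # frozen: the probe never updates the base model

def get_activations(images: torch.Tensor) -> torch.Tensor:
    """Run images through the frozen model, return the 64-d hidden activations."""
    with torch.no_grad():
        return base_model[:6](images)  # everything up to the hidden layer

# Fake dataset: 200 "images" with binary labels (say, cat = 1, dog = 0).
images = torch.randn(200, 3, 32, 32)
labels = torch.randint(0, 2, (200,))

# Train the probe on activations only, then test it on held-out activations.
acts = get_activations(images).numpy()
probe = LogisticRegression(max_iter=1000).fit(acts[:150], labels[:150].numpy())
print("probe accuracy:", probe.score(acts[150:], labels[150:].numpy()))
```

The design choice that matters here is that the probe is simple and completely separate from the base model, so any accuracy it achieves has to come from information that is already encoded in the activations.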

The whole cat example is just to illustrate; there's actually a lot of similar research done in different scenarios, such as board games. Without feeding the rules of the board game to the model, researchers take the activations caused by sequences of legal moves, but not the rules themselves, and again train a probing classifier on those activations to make predictions.

And again, the predictions of these probing classifiers have been very accurate.
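In the board-game case (the talk doesn't name a specific study, but this is the general pattern of that line of work), the probe is asked to recover something the model was never explicitly told, the state of the board, from its hidden activations alone. Here is a rough sketch along the same lines, again with all names, shapes, and the untrained stand-in model as illustrative assumptions.

```python
# Board-game probing sketch: a frozen sequence model reads a string of moves,
# and a separate probe tries to recover part of the (never-shown) board state
# from the model's final hidden state. The model here is an untrained stand-in.
import torch
import torch.nn as nn
from sklearn.linear_model import LogisticRegression

VOCAB = 64    # one token per square a move can be played on
HIDDEN = 128

embed = nn.Embedding(VOCAB, HIDDEN)
gru = nn.GRU(HIDDEN, HIDDEN, batch_first=True)

def hidden_state(move_sequences: torch.Tensor) -> torch.Tensor:
    """Final hidden state of the frozen sequence model for each game."""
    with torch.no_grad():
        _, h = gru(embed(move_sequences))
    return h.squeeze(0)

# Fake data: 300 games of 20 moves each, plus whether square 27 is occupied.
games = torch.randint(0, VOCAB, (300, 20))
square_27_occupied = torch.randint(0, 2, (300,))

feats = hidden_state(games).numpy()
probe = LogisticRegression(max_iter=1000).fit(feats[:200], square_27_occupied[:200].numpy())

# High held-out accuracy would mean the hidden state encodes board structure
# the model was never directly given, i.e. a small "world representation".
print("probe accuracy for square 27:", probe.score(feats[200:], square_27_occupied[200:].numpy()))
```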

Philosophical Theories and AI

Anyway, to summarize: for this particular account of what it means to have mental representations, the information theories of representation, one of the theories I mentioned, there seems to be good evidence from AI research, probing research specifically, and it's pretty recent, like 2023, 2024.

A bunch of papers have come out showing that these models, especially the GPT-based ones, and some models from Google as well, seem to represent the world in the way that these philosophical accounts of representation predict or expect them to.

Now, that was just one account, more like a hint of one account, of what it means to have mental representations, namely the information theories.

Other Philosophical Theories

We have a bunch of other theories in philosophy that are pretty prominent, and we don't have time to go through each of them. But for each and every one of them, there seems to be similar evidence out there, again from probing research, that satisfies the conditions imposed by those particular theories as conditions of having minds.

So, bottom line, all of this really boils down to this: we have all these main players, the prominent philosophical accounts of what it means to have mental representations, and there are decades of research on what it means to have mental representations.

And for each and every one of them, there seems to be good evidence from AI research, especially probing research, that pretty much satisfies the conditions imposed by that particular theory. And if that is the case, it seems to suggest that these models do have mental representations.

Conclusion

Of course, the scale is pretty small. The world being represented is pretty small in these cases: the research that's been done covers board games, or images of cats and dogs, or different concepts such as color, direction, time, and so on.

But essentially you can think of it like a baby that's developing a mind and growing up: the more powerful these models become, the more capacity they show for having mental representations of the world. So that's pretty much it.

Summary of Discussion

As I mentioned, there are different theories of mind, different frameworks for asking the question of what it means to have a mind. The one we talked about was folk psychology, but there are other kinds of theories.

There's the computational theory of mind, functionalism, dualism, all kinds of theories. Philosophers like all these isms; they get paid to come up with these theories for a living. But essentially there are a lot of accounts of what it means to have a mind.

To see where we stand, we focused only on folk psychology, just a little bit, and on the conditions it imposes for a system or an agent to qualify as having a mind.

And again, because of the time constraints, we only talked about mental representations and one specific account of what it means to have them, though there are different kinds. So the path you see over there is where this whole talk was located, but it's really a map of possibilities, all kinds of research that could be done, and should be done, down the road.

Future Directions

But essentially, the whole purpose of this talk was to give a sense of the line of research that's happening out there. On the one hand, we have the AI research, the study of the activations of these models.

On the other hand, we have the philosophical literature on concepts such as mind and consciousness and so on. There seems to be a good correspondence, or at least this line of research seems promising; so far the two sides seem to be matching up. But depending on different accounts of mind, that could change.

Final Thoughts

Yeah, I think that's pretty much it. So, a few last words. I think the main takeaway of this whole conversation would be that LLMs do seem to exhibit signs of having minds.

And, as I think I mentioned, this seems to happen at a larger and larger scale, and you never know: one day they might actually wake up, so to speak, by developing a full-blooded mind as powerful as the human mind, or more powerful, one that's able to capture a lot more of what's out there. Anyway, thank you. That's pretty much it.
