The Agency Problem
We're not getting any closer to solving fundamental problems in artificial intelligence
The pace of progress in artificial intelligence seems only to increase every day. Google recently released a language model that smashed state-of-the-art records in natural language processing. Computer vision models are approaching or surpassing human performance across a range of tasks. Just last month, OpenAI released GPT-2, a language model that can converse far better than anything before it, answer questions, translate text, and write sonnets in the style of Shakespeare. With the progress we've made in artificial intelligence over the last decade, it can't be long until full-fledged human-like AI (HAI) is living among us, right?
I think this is wrong. I don’t think human-like AI is very close. In fact, if you look at our current rate of progress and extrapolate from that, we'll reach HAI in the year... never.
How is this so? Because we're missing, and haven't made any progress on, something that is essential to being human: a sense of agency. Most of the improvements in AI have been confined to narrow fields. Computers can now use language far better than they could only a short time ago. They can also interpret images in ways that were far out of reach only five years ago.
Language and vision might seem like disparate fields that can only be solved by general intelligence. But they have a unifying connection: both can be tackled by learning statistical distributions. Deep learning, the technique behind the explosion in AI, is a masterful method for learning statistical distributions. However, it seems that there are fundamental parts of HAI that are not statistical distributions, and so far it looks like deep learning will hit a wall when it comes to learning them.
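To make the "statistical distribution" point concrete, here's a toy sketch: a bigram character model learned by simple counting. It's nothing like deep learning in power, and the corpus and names are purely illustrative, but the end product is the same kind of object, a conditional distribution over the next symbol, and sampling from it produces fluent-looking text with no understanding behind it.

```python
import random
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate the rat"

# Count how often each character follows each other character.
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def sample_next(prev):
    """Sample the next character from the learned conditional distribution."""
    chars, freqs = zip(*counts[prev].items())
    return random.choices(chars, weights=freqs)[0]

# Generate text by sampling from the learned distribution, one character at a time.
text = "t"
for _ in range(40):
    text += sample_next(text[-1])
print(text)  # fluent-looking gibberish, produced with no understanding at all
```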
A common objection is that working on deep learning to reach HAI is like building a taller ladder to get to the moon. I don't think this is quite right. It's not that we're building a taller ladder to get to the moon. The progress we're making is significant, and it is necessary. It's more like we're building the engine; if we want to get to the moon, we still need fuel, onboard computers, and guidance and control (oh, and a launch escape system wouldn't be a bad idea either). So far, I'm seeing a lot of good progress on the engine (deep learning), but none on the fuel (agency), the guidance and control (emotions), or the other parts.
There are fundamental differences between what we've seen AI do and what we expect HAI to do. And it's not just that we're not there yet—as far as I can tell, we're not making any substantial progress in these fields. For example, I have an innate sense that I am a being, that I can make decisions, that I exist. So far, despite all the incredible progress we have made, we haven't made any progress towards a computer meaningfully referring to itself as "I".
So what's missing? A sense of agency. By agency, I mean imbuing AI with the sense that it is an agent able to make decisions of its own. Able to have its own thoughts. None of the fancy AIs we have seen have taken so much as a step in that direction. Part of having agency is realizing that you are a self. Not only that, but being conscious. I don't see any convincing evidence we're significantly closer to consciousness than we were ten years ago, or 50 years ago. As far as I can tell, we’re still at square one.
Another thing that sets humans apart is motivation. I have a reason to get out of bed in the morning. So far, no AI does. An AI has no will, no desires. It has never changed its mind; it can't change its goals.
We don’t know how motivation would work with HAI. Sure, we use rewards in reinforcement learning, but it’s not the same. You can even call the reward dopamine and tell the machine it really, really likes getting a hit of this stuff, but none of it is real. There's no evidence that computers have ever had any kind of subjective experience. You could say, “See these bits I put in your memory? This is your happiness number. You really want to maximize it.” But we haven't figured out any intrinsic motivation. We haven’t found the machine equivalent of dopamine and serotonin; we know a hit of dopamine feels good to you, but we haven't figured out what exactly that “you” is.
Currently, reinforcement learning algorithms work, roughly speaking, by assigning a part of the computer's memory as its "happiness" and telling it that it "wants" to maximize that score. This works in some ways: the algorithm may find a clever optimization path towards doing that. It may make the same chess moves it would make if it cared a lot, but under the hood it doesn't actually care at all. And that's a fundamental difference: we have no way of saying, you REALLY DO care about these bits that say you are happy.
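Here's a minimal sketch of what I mean, using a toy bandit problem. The names (happiness, estimates, and so on) are mine and purely illustrative; the point is that the agent's entire "motivation" is a number in memory plus an update rule that pushes it upward.

```python
import random

arms = [0.2, 0.5, 0.8]        # hidden payout probabilities of three levers
estimates = [0.0, 0.0, 0.0]   # the agent's learned value for each lever
pulls = [0, 0, 0]
happiness = 0.0               # the "reward so far": just a number we keep adding to

for step in range(10_000):
    # Epsilon-greedy: mostly pull the lever with the best current estimate.
    if random.random() < 0.1:
        arm = random.randrange(3)
    else:
        arm = max(range(3), key=lambda i: estimates[i])

    reward = 1.0 if random.random() < arms[arm] else 0.0
    happiness += reward

    # Incremental average: this update is the whole "learning" step.
    pulls[arm] += 1
    estimates[arm] += (reward - estimates[arm]) / pulls[arm]

print(f"'happiness' = {happiness:.0f}")  # the agent plays close to optimally...
print(estimates)                         # ...and nothing in it wants anything
```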
In the movie Ex Machina, Ava wants to be free. In the Terminator series, Skynet wants to exist. Sure, Deep Blue can win at chess, but does it want to? Does it care? After the match, Garry Kasparov famously quipped, “Well, at least it didn’t enjoy beating me.” And he's right: no matter how good Deep Blue is at chess, it still has no interest in playing chess. And over 20 years later, I'm not seeing any progress in that direction.
Here's a good test for this: put a camera on a swivel that the computer can control, and get the computer to move the camera because it wants to. I've written control software for robots, so I could make it move. I could even throw in some randomness, so it appears to move only when it wants to. But these are gimmicks; they aren't what I'm talking about. I'm talking about giving the robot all the parts it needs to move around, and then having it move because it wants to. We have lots of ways to make a robot appear to do this, but no idea how to actually do it.
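For illustration, here's roughly what that gimmick looks like. The pan() function is a hypothetical stand-in for real motor control; everything else is randomness dressed up to look like curiosity, and nothing in it wants to look anywhere.

```python
import random
import time

def pan(degrees):
    # Hypothetical stand-in for real motor control on the swivel.
    print(f"panning {degrees:+.0f} degrees")

# Random pauses and random moves, dressed up to look like intent.
for _ in range(20):
    time.sleep(random.uniform(0.5, 3.0))  # "pause thoughtfully"
    pan(random.uniform(-90, 90))          # "decide" where to look next
```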
As well as motivation, I'm not seeing strong evidence of progress towards understanding. Even the models that are deft with language don't have the slightest understanding of what they're saying. Is deep learning going to get us there? It doesn't seem likely to do so by itself; scaling the existing techniques doesn't look like it will result in true understanding. I doubt that GPT-3, 4, or 5 will be any closer to true understanding unless new techniques are added. We need a new paradigm (or paradigms). Fundamental understanding isn't a point somewhere on the loss surface that we're going to find with stochastic gradient descent (or any other optimizer).
To be clear, I'm not saying that we'll never reach true HAI. I'm saying that to do so, we must solve some very fundamental problems in intelligence that we, as of today, have made roughly zero progress on. Progress comes in spurts, and I am very skeptical of the idea that true fundamental scientific breakthroughs can be predicted with any certainty.
But this leaves important questions, the main one being: what is the path to real HAI? Is it a case of "fake it until you make it"? Possibly. I think that agency, like the mind itself, is a macroscopic phenomenon. At some level, we don't have our own agency either. We can't make our bodies follow laws of physics of our own choosing. We can't think a thought our brains aren't capable of thinking. We can't ask the neurons in our brains to use different physics or chemistry. So it's not that we have agency and machines do not at the physical level. It's that there's a level of abstraction we're referring to when we say "me", and at this level of abstraction, I have agency. It's clearly not impossible for a bunch of nonconscious atoms to form a collection that creates consciousness; we ourselves are proof of that. But, as far as I can tell, we haven't made any progress in this direction.