I previously wrote about how we don’t have a solid theoretical understanding of large language models (LLMs) and how we’re unable to predict their behavior. Our current method of learning about them is essentially to turn them on and see what they do; we simply don’t understand them beyond a superficial level. Given that LLMs are the most promising step we’ve seen so far on the path to artificial general intelligence (AGI), this doesn’t bode well for our understanding of AGI when it arrives.
This, I think, is concerning. AGI is likely to have a transformative impact on society; it behooves us to think carefully about what that impact will be. The common definition of AGI is an AI that can think and reason at a level that matches humans. I think this definition misleads us into expecting a peer, which it certainly will not be. The moment the first AGI exists, it will be able to spread itself far and wide across the internet, creating thousands of copies: thousands of AGIs that are just as good at developing AI as humans are, except that they operate at a much faster rate. I don’t see any reason to believe its capabilities will stop in the human range. It seems far more likely that they will sit in that range for only a brief instant before surpassing it. Once it becomes better than us at building AIs, we will see a sudden and shocking explosion in AI capabilities, and it wouldn’t be long before an artificial superintelligence (ASI) arrives.
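To give that “brief instant” a rough quantitative shape, here is a toy sketch in Python. Everything in it is an assumption made for illustration: the exponential growth law, the rate constant, and the starting capability are invented, not predictions.

```python
# Toy model of recursive self-improvement (illustrative only: the
# exponential growth law and every parameter are invented).
# If capability improves at a rate proportional to itself,
# dC/dt = k * C, then C(t) = C0 * exp(k * t).

import math

C0 = 0.5     # starting capability (human level normalized to 1.0)
k = 0.1      # self-improvement rate, in arbitrary units
HUMAN = 1.0

t_reach = math.log(HUMAN / C0) / k         # time to reach human level
t_pass = math.log(1.1 * HUMAN / C0) / k    # time to exceed it by 10%

print(f"time to climb to human level: {t_reach:.2f}")                   # ~6.93
print(f"time spent within 10% of human level: {t_pass - t_reach:.2f}")  # ~0.95
```

Even under plain exponential growth, the system spends only a small fraction of its climb inside the human band; any faster-than-exponential assumption shrinks that window further.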
If ASI is possible, we should think seriously about what the world will be like when it arrives. What would it be like to suddenly be the second most intelligent “species” (I know that’s not the right word, but you get the idea) roaming the planet? In today’s interconnected world, there is a fairly direct line from being the smartest to being the most powerful. So once something smarter comes along, we are no longer the top dog on planet Earth.
This will have many implications; in this post, I’m going to focus only on ASI and morality. Let’s start with the basic questions. Will it be a moral creature? And if so, what morals would it have?
I don’t know whether AI systems will develop their own sense of morality as they develop general intelligence. It’s quite possible that they will not. I think of intelligence and morality as existing on separate axes. We evolved morality alongside our intelligence, but they are two separate things, formed and shaped over time by different evolutionary pressures. General intelligence is a response to the need to be adaptable, to develop generalist capabilities. But that alone wouldn’t produce morality. To evolve morality, organisms must face evolutionary pressure to cooperate in groups with other organisms, most likely of their own species. So it seems entirely possible to evolve intelligence without developing any sense of morality.
The most powerful thing in the world having no morals seems worrisome, to put it mildly. It immediately brings to mind the dystopian scenes from The Matrix, where countless humans are kept in little cells so an ASI can harvest energy from their bodies.
But just because we developed morality through evolution doesn’t mean that an ASI couldn’t develop something similar through some other path. LLMs are trained on massive corpora of human-generated data. Perhaps this training data will form the backbone of the ASI’s morality, and it will develop morals similar to our own.
This might be the best-case scenario for us. When we think of morality, we think of human morality. Humans prioritize the well-being of humans far above that of other beings, and even then we heavily discount people unknown to us. Below humans come our pets and other things that look cute to us; we strive to make sure cute things don’t suffer. Below that are things that don’t look cute, whose suffering counts for significantly less.
Or perhaps ASI will develop morality through pure reason. Some argue that this is impossible: Hume’s guillotine holds that there is a divide between “is” statements and “ought” statements, and that from “is” statements alone you can never derive an “ought.” For example, imagine a train barreling down the tracks toward a group of people, and a lever you could flip to derail the train and save them. Ought you to flip the lever? On Hume’s view, in a world containing only “is” statements, you cannot answer that question. But add a single ought - humans ought to take care of each other - and the “ought” side of the divide opens up: now you know you ought to flip the lever.
I certainly don’t know whether morality could be developed this way. But given our uncertainty about how morality is constructed in the brain, I don’t see how we can rule it out. We might have some good intuitions about when it will or will not appear, but we shouldn’t be too confident in our conclusions. Somewhere, somehow, among the unemotional, unintelligent, and unconscious atoms of our brains, morality came to exist. So who’s to say when it would appear in an AGI?
So would this morality from pure reason be some perfect and universal form of morality? If so, there might be bad news for us: it wouldn’t look anything like ours. From a human perspective, there is a crystal-clear line between the intelligence and consciousness of humans and those of chickens. I’m not sure an ASI would see that distinction as meaningful. If you ask an ASI to decrease suffering, it will almost assuredly end factory farming. There’s no way it’s going to say, “I will do everything I can to help humans because humans are conscious beings. But, FUCK CHICKENS! I hate chickens. Let’s throw billions of them in dark cages so more intelligent beings can feast on their flesh.”
I don’t think we’re ready for a highly moral ASI. I don’t think we’re ready for judgment day. I would like to tell it that the arc of the moral universe is long, but it bends towards justice. Give us another few hundred years. I’m not sure where we’ll be in a few hundred years, but (1) I do think we’ll be in a better place, and (2) at least I’ll be dead by then.
The truth is, if we want to continue our current lifestyles, we’d better hope an ASI is willing to throw billions of conscious beings into dark cages for the satisfaction of a more intelligent species. Just the right ones.
"Indeed I tremble for my country when I reflect that God is just: that his justice cannot sleep for ever."
-- Thomas Jefferson