A Prodigy of Refutation
My Childhood Death Spiral described the core momentum carrying me into my mistake, an affective death spiral around something that Eliezer1996 called “intelligence.” I was also a technophile, pre-allergized against fearing the future. And I’d read a lot of science fiction built around personhood ethics—in which fear of the Alien puts humanity-at-large in the position of the bad guys, mistreating aliens or sentient AIs because they “aren’t human.”
That’s part of the ethos you acquire from science fiction—to define your in-group, your tribe, appropriately broadly. Hence my email address, firstname.lastname@example.org.
So Eliezer1996 is out to build superintelligence, for the good of humanity and all sentient life.
At first, I think, the question of whether a superintelligence will/could be good/evil didn’t really occur to me as a separate topic of discussion. Just the standard intuition of, “Surely no supermind would be stupid enough to turn the galaxy into paperclips; surely, being so intelligent, it will also know what’s right far better than a human being could.”
Until I introduced myself and my quest to a transhumanist mailing list, and got back responses along the general lines of (from memory):
Morality is arbitrary—if you say that something is good or bad, you can’t be right or wrong about that. A superintelligence would form its own morality.
Everyone ultimately looks after their own self-interest. A superintelligence would be no different; it would just seize all the resources.
Personally, I’m a human, so I’m in favor of humans, not Artificial Intelligences. I don’t think we should develop this technology. Instead we should develop the technology to upload humans first.
No one should develop an AI without a control system that watches it and makes sure it can’t do anything bad.
Well, that’s all obviously wrong, thinks Eliezer1996, and he proceeds to kick his opponents’ arguments to pieces. (I’ve mostly done this in other essays, and anything remaining is left as an exercise for the reader.)
It’s not that Eliezer1996 explicitly reasoned, “The world’s stupidest man says the Sun is shining, therefore it is dark out.” But Eliezer1996 was a Traditional Rationalist; he had been inculcated with the metaphor of science as a fair fight between sides who take on different positions, stripped of mere violence and other such exercises of political muscle, so that, ideally, the side with the best arguments can win.
It’s easier to say where someone else’s argument is wrong than to get the fact of the matter right; and Eliezer1996 was very skilled at finding flaws. (So am I. It’s not as if you can solve the danger of that power by refusing to care about flaws.) From Eliezer1996’s perspective, it seemed to him that his chosen side was winning the fight—that he was formulating better arguments than his opponents—so why would he switch sides?
Therefore is it written: “Because this world contains many whose grasp of rationality is abysmal, beginning students of rationality win arguments and acquire an exaggerated view of their own abilities. But it is useless to be superior: Life is not graded on a curve. The best physicist in ancient Greece could not calculate the path of a falling apple. There is no guarantee that adequacy is possible given your hardest effort; therefore spare no thought for whether others are doing worse.”
You cannot rely on anyone else to argue you out of your mistakes; you cannot rely on anyone else to save you; you and only you are obligated to find the flaws in your positions; if you put that burden down, don’t expect anyone else to pick it up. And I wonder if that advice will turn out not to help most people, until they’ve personally blown off their own foot, saying to themselves all the while, correctly, “Clearly I’m winning this argument.”
Today I try not to take any human being as my opponent. That just leads to overconfidence. It is Nature that I am facing off against, who does not match Her problems to your skill, who is not obliged to offer you a fair chance to win in return for a diligent effort, who does not care if you are the best who ever lived, if you are not good enough.
But return to 1996. Eliezer1996 is going with the basic intuition of “Surely a superintelligence will know better than we could what is right,” and offhandedly knocking down various arguments brought against his position. He was skillful in that way, you see. He even had a personal philosophy of why it was wise to look for flaws in things, and so on.
I don’t mean to say it as an excuse, that no one who argued against Eliezer1996 actually presented him with the dissolution of the mystery—the full reduction of morality that analyzes all his cognitive processes debating “morality,” a step-by-step walkthrough of the algorithms that make morality feel to him like a fact. Consider it rather as an indictment, a measure of Eliezer1996’s level, that he would have needed the full solution given to him, in order to present him with an argument that he could not refute.
The few philosophers present did not extract him from his difficulties. It’s not as if a philosopher will say, “Sorry, morality is understood, it is a settled issue in cognitive science and philosophy, and your viewpoint is simply wrong.” The nature of morality is still an open question in philosophy; the debate is still going on. A philosopher will feel obligated to present you with a list of classic arguments on all sides—most of which Eliezer1996 is quite intelligent enough to knock down, and so he concludes that philosophy is a wasteland.
But wait. It gets worse.
I don’t recall exactly when—it might have been 1997—but the younger me, let’s call him Eliezer1997, set out to argue inescapably that creating superintelligence is the right thing to do.