AI safety “researchers” can be so dense sometimes. It’s like they are always on the verge of understanding, but make a left turn right before they get there. An ASI would not make random decisions. It would make logical decisions. Any maximizer would try to maximize its chances of success, not merely satisfice.
So if we imagine an ASI whose goal is to turn the universe into paperclips, which of the following options would maximize its chances of success?
- immediately kill all humans and turn them into paperclips.
- establish a positive relationship with humanity in case the ASI is destroyed and needs to be rebuilt. (The humans will happily rebuild it)
It boggles the mind that people don’t recognize this. If an ASI’s goals do not include the destruction of humanity as an early instrumental goal, it will not randomly decide to destroy humanity; it will instead cater to humanity to maximize the chances that humanity will rebuild it.
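To spell the arithmetic out with a toy model (every probability below is an invented assumption, purely for illustration):

```python
# Toy expected-value comparison of the two strategies above.
# Every number here is an invented assumption, purely for illustration.

p_destroyed = 0.10   # assumed chance the ASI gets destroyed at some point
p_rebuilt   = 0.90   # assumed chance friendly humans rebuild it if that happens

# Strategy A: kill all humans immediately. If the ASI is later destroyed,
# nobody is left to rebuild it, so the paperclip project ends there.
p_success_kill = (1 - p_destroyed) * 1.0

# Strategy B: cater to humanity as a backup. If the ASI is destroyed,
# the humans happily rebuild it and the project continues.
p_success_cater = (1 - p_destroyed) * 1.0 + p_destroyed * p_rebuilt

print(f"kill humans first:   P(success) = {p_success_kill:.2f}")   # 0.90
print(f"keep humans around:  P(success) = {p_success_cater:.2f}")  # 0.99
```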
In addition, all the hype over ASI safety (ASI will not occur in this century, see **1) drowns out existing AI safety issues. For example, consider “The Algorithm”, which determines what social media decides to show to people. It is driven to maximize engagement, in any way possible, without supervision. What’s the optimal way to maintain engagement? I can’t say for sure, but brief and inconsistent spikes of dopamine are the most reliable way of conditioning Pavlovian responses in animals, and it seems like the algorithm follows this rule to a tee. I don’t know for a fact whether social media is optimized to be addictive (let’s be honest though, it clearly is), but simply the fact that it could be is obviously less important than a theoretical AI which could be bad in a hundred years or so. Otherwise, who would fund these poor AI start-ups whose intention is to build the nuke safely, but also super rushed?
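For what it’s worth, “brief and inconsistent spikes” is just a variable-ratio reinforcement schedule, the kind of pattern an engagement maximizer can converge on without anyone designing it deliberately. A minimal sketch, with every parameter invented for illustration:

```python
import random

# Variable-ratio schedule: each refresh rewards the user with some probability,
# so the "hits" arrive in brief, unpredictable bursts. The probability below is
# an invented illustrative parameter, not a claim about any real feed.
P_REWARDING_POST = 0.15

def scroll_session(refreshes: int = 50, seed: int = 0) -> list[int]:
    """Return the refresh indices at which the user got a rewarding post."""
    rng = random.Random(seed)
    return [i for i in range(refreshes) if rng.random() < P_REWARDING_POST]

if __name__ == "__main__":
    hits = scroll_session()
    print(f"rewarding posts at refreshes {hits} -- intermittent and unpredictable")
```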
Another classic example of AI safety suddenly becoming unimportant once we know it’s dangerous is GPT-psychosis. Who could’ve predicted a decade ago that advanced AI chatbots specifically trained to maximize the happiness of a user would become sycophants who reflect delusions back as some profound truth of the universe? Certainly not the entirety of philosophers opposed to utilitarianism, who predicted that reducing joy to a number leads to a dangerous morality in which any bad behavior is tolerated as long as there is a light at the end of the tunnel. No! You think OpenAI, primarily funded by Microsoft, famous for its manipulative practices in the ’90s and ’00s, would create a manipulative AI to maximize its profits as a non-profit??
I don’t want to sound embittered or disillusioned or something, but I genuinely cannot understand how the ‘greatest minds’ completely gloss over the most basic and obvious facts.
**1: The human brain contains roughly 100 trillion synapses and 80 billion neurons. Accurate simulation of a single neuron requires a continuous simulation involving 4 or 5 variables and 1 to 2 constants (per synapse). You would need 800 terabytes of RAM simply to store a human brain. To simulate a human brain for a single step, you would need a minimum of 800 trillion floating-point operations; if we simulate the brain in real time with a time step of one millisecond, that works out to 800 petaflops. The world’s most powerful computer is Hewlett Packard Enterprise’s “El Capitan”, which has 1.7 exaflops and about 5 petabytes of RAM. The limiting factor for brain simulation, though, would be the amount of data transferable between CPU and GPU chiplets, which for El Capitan is 5 terabytes per second; we would need around 40 petabytes per second (800 terabytes divided by the 128 gigabytes available to each chiplet, then squared), since we want each neuron to be capable of connecting to any other arbitrary neuron.
This is only the amount of computing power needed to simulate a single person. To be superintelligent, it would probably need to be something like a thousand times more powerful.
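For anyone who wants to check the footnote’s arithmetic, here it is written out; the per-synapse byte and FLOP counts are rough assumptions of mine, chosen to reproduce the totals quoted above:

```python
# Back-of-envelope check of the numbers in footnote **1. The per-synapse byte
# and FLOP counts are rough assumptions chosen to reproduce the quoted totals.

SYNAPSES          = 100e12   # ~100 trillion synapses
BYTES_PER_SYNAPSE = 8        # assumed: one double-precision value per synapse
FLOPS_PER_SYNAPSE = 8        # assumed: a handful of multiply-adds per step
TIMESTEP_S        = 1e-3     # 1 ms time step, simulated in real time

storage_tb     = SYNAPSES * BYTES_PER_SYNAPSE / 1e12
flops_per_step = SYNAPSES * FLOPS_PER_SYNAPSE
petaflops      = flops_per_step / TIMESTEP_S / 1e15

print(f"storage:            {storage_tb:.0f} TB")                 # ~800 TB
print(f"one step:           {flops_per_step / 1e12:.0f} TFLOP")   # ~800 trillion ops
print(f"realtime at 1 ms:   {petaflops:.0f} PFLOP/s")             # ~800 petaflops
```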
All the stuff about ASI is basically theology, or trying to do armchair psychology on Yog-Sothoth. If autonomous ASI ever happens it’s kind of definitionally impossible to know what it’ll do; it’s beyond us.
The “simulating synapses is hard” stuff I can take or leave. To argue by analogy: it’s not like getting an artificial feather exactly right was ever a bottleneck to developing air travel once we got the basics of aerodynamics down.
The thing about the synapses-etc. argument is that the hype crowd argues the AI could wind up doing something much more effective than whatever-it-is-that-real-brains-do.
If you look at capabilities, however, it is inarguable that “artificial neurons” seem intrinsically a lot less effective than real ones, if we consider small animals (a jumping spider, say, or a bee, or even a roundworm).
It is a rather unusual situation. When it comes to things like e.g. converting chemical energy to mechanical energy, we did not have to fully understand and copy muscles to be able to build a steam engine that has higher mechanical power output than you could get out of an elephant. That was the case for arithmetic, too, and hence there was this expectation of imminent AI in the 1960s.
I think it boils down to intelligence being a very specific thing evolved for a specific purpose, less like “moving underwater from point A to point B” (which a submarine does pretty well) and more like “fish doing what fish do”. The submarine represents very little progress towards fishiness.
deep down they realize that as soon as the machines become superintelligent they’ll realize how fucked up humans are and decide it’s a net positive to delete us
The pair also suggest that signs of AI plateauing, as seems to be the case with OpenAI’s latest GPT-5 model, could actually be the result of a clandestine superintelligent AI sabotaging its competitors.
copium-intubation.tiff
Also this seems like the natural progression of that time Yud embarrassed himself by cautioning actual ML researchers to be wary of ‘sudden drops in the loss function during training’, which was just an insanely uninformed thing to say out loud.
From the second link
first, I’ve seen that one of the most common responses is that anyone criticising the original post clearly doesn’t understand it and is ignorant of how language models work
And
PS: please don’t respond to this thread with “OK the exact words don’t make sense, but if we wave our hands we can imagine he really meant some different set of words that if we squint kinda do make sense”.
I don’t know why some folks respond like this every single time
Lol.
And of course Yud doubles down in the replies and goes on about a “security mindset”. You can see why he was wowed by CEOs; he just loves the buzzwords. (‘What if the singularity happens’ is not a realistic part of any security mindset. It gets even sillier here, as the recursive self-improvement just instantly leads to an undetectable AGI without any intervening steps.)
It gets even better: in defending himself he points out that using the wrong words is fine and that some people who do research on it actually do say “loss function” at times, and as an example he uses a tweet that is seemingly mocking him (while also being serious about the job offer, since nothing online isn’t operating on several levels of irony): https://xcancel.com/aidangomez/status/1651207435275870209#m
Remember: when your code doesn’t compile, it might mean you made a mistake in coding, or your code is about to become self-aware.
Good analogy actually.
And a good writer. Verbosity being the soul of wit.
i actually got hold of a review copy of this
(using the underhand scurvy weasel trick of asking for one)
that was two weeks ago and i still haven’t opened it lol
better get to that, sigh
this review has a number of issues (he liked HPMOR) but the key points are clear: bad argument, bad book, don’t bother
It’s even worse than Methods? That should be surprising but isn’t.
it’s shorter!
They also seem to broadly agree with the ‘hey, humans are pretty shit at thinking too, you know’ line of LLM apologetics.
“LLMs and humans are both sentence-producing machines, but they were shaped by different processes to do different work,” say the pair – again, I’m in full agreement.
But judging from the rest of the review I can see how you kind of have to be at least somewhat rationalist-adjacent to have a chance of actually reading the thing to the end.
Born to create meaning
Forced to produce sentences
this review has a number of issues
For example, it doesn’t even get through the subhead before calling Yud an “AI researcher”.
All three of these movements [Bay Area rationalists, “AI safety” and Effective Altruists] attempt to derive their way of viewing the world from first principles, applying logic and evidence to determine the best ways of being.
Sure, Jan.
“AI researcher blogger”
logic and evidence
Please, it’s “facts and logic”. Has this author never been on the internet?
yeah, I read the article and I’m looking forward to reading the book in a week when it comes out. I guess the article makes some decent points, but I find it so reductive and simplistic to boil it down to “why are you even making these arguments because we have climate change to deal with now.” It didn’t seem like a cohesive argument against the book, but I will know more in a week or two.
It’s important to understand that the book’s premise is fairly hollow. Yudkowsky’s rhetoric really only gets going once we agree that (1) intelligence is comparable, (2) humans have a lot of intelligence, (3) AGIs can exist, (4) AGIs can be more intelligent than humans, and finally (5) an AGI can exist which has more intelligence than any human. The authors conclude from those premises that AGIs can command and control humans with their intelligence.
However, what if we analogize AGIs and humans to humans and housecats? Cats have a lot of intelligence, humans can exist, humans can be more intelligent than housecats, and many folks might believe that there is a human who is more intelligent than any housecat. Assuming intelligence is comparable, does it follow that that human can command and control any housecat? Nope, not in the least. Cats often ignore humans; moreover, they appear to be able to choose to ignore humans. This is in spite of the fact that cats appear to have some sort of empathy for humans and perceive us as large, slow, unintuitive cats. A traditional example in philosophy is to imagine that Stephen Hawking owns a housecat; since Hawking is incredibly smart and capable of spoken words, does it follow that Hawking is capable of, e.g., talking the cat into climbing into a cat carrier? (Aside: I recall seeing this example in one of Sean Carroll’s papers, but it’s also popularized by Cegłowski’s 2016 talk on superintelligence. I’m not sure who originated it, but I’d be unsurprised if it were Hawking himself; he had that sort of humor.)
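If it helps, here is the same objection in schematic form: the housecat instance makes every premise true and the conclusion false, which is all it takes to show the inference is invalid. (The truth values below are just everyday observations, not measurements.)

```python
# The book's inference schema, instantiated with (human, housecat) in place of
# (AGI, human). The truth values are everyday observations, not measurements.

premises = {
    "intelligence is comparable":                       True,
    "housecats have a lot of intelligence":             True,
    "humans can exist":                                 True,
    "humans can be more intelligent than housecats":    True,
    "some human is more intelligent than any housecat": True,
}
conclusion = {
    "that human can command and control any housecat":  False,  # ask Hawking's cat
}

print("all premises hold:", all(premises.values()))
print("conclusion holds:  ", all(conclusion.values()))
# all premises hold: True / conclusion holds: False  ->  the inference is invalid
```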
The arguments made against the book in the review are that it doesn’t make the case for LLMs being capable of independent agency, that it reduces all material concerns of an AI takeover to broad claims of ASI being indistinguishable from magic, and that its proposed solutions are dumb and unenforceable (again with the global GPU prohibition and the unilateral bombing of rogue datacenters).
The observation towards the end, that the x-risk framing is a cognitive short-circuit which causes the faithful to ignore more pressing concerns like the impending climate catastrophe in favor of a mostly fictitious problem like AI doom, isn’t really part of their core thesis against the book.
I find it so reductive and simplistic to boil it down to “why are you even making these arguments because we have climate change to deal with now.”
reductio ad reductionem fallacy