import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
In texts influenced by those who fear runaway AGI and/or by Ray Kurzweil, we often read about the amazing consequences of exponential growth, recursive self-improvement, etc. when it comes to the intelligence explosion.
In part I of this post I would like to explore what happens when you apply the same math to problem complexity, and make an argument for why human-level AI might not be able to recursively self-improve in the way those who fear runaway AGI often sketch.
In part II I will try to draw some consequences from this, but I'm not really happy with that part yet. I'll share it anyway and iterate on it as feedback comes in.
I present one of my arguments for why I think AGI risk is a red herring for smart people wanting to save the world. Big-$\mathcal{O}$ eats exponential computing-power growth for lunch, so even assuming that growth magically continues to the theoretical limit, an AGI might not be the superintelligent god it is sometimes sketched as. At the very least, I think this warrants a very skeptical approach to the very idea of superintelligence.
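To make the Big-$\mathcal{O}$ point concrete before we get to the real numbers, here is a toy calculation (my own illustration, with an assumed starting budget and an assumed doubling rate, not data): give yourself 1e13 operations per second, double it every two years, and ask how large an $O(n!)$ problem instance fits in one second.

```python
import math

def max_n_factorial(budget):
    """Largest n with n! <= budget (brute force; n stays tiny anyway)."""
    n = 1
    while math.factorial(n + 1) <= budget:
        n += 1
    return n

budget = 1e13  # assumed starting budget, operations per second
for years in (0, 20, 40):
    grown = budget * 2 ** (years / 2)  # assumed doubling every 2 years
    print(years, "years:", max_n_factorial(grown))
# 40 years of exponential hardware growth only moves n from 15 to 20
```

Forty years of relentless exponential growth buys five extra cities on the naive algorithm. That asymmetry is the whole argument below.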
My assumptions/simplifications:
So let's say estimates of the hardware behind current (human) intelligence lie somewhere between 1.5e13 and 3.4e16 FLOPS.
How far does this get us in one of the most famous NP-hard problems, the traveling salesman problem (TSP)? (Assuming 1 FLOP is what we need to solve a problem of size 1; constant factors vanish under $\mathcal{O}$ notation, and we will get back to this.)
Complexities for TSP range from $O(n!)$ (exact naive solution), over $O(n^2 2^n)$ (Held-Karp algorithm, exact solution), to $O(n^3)$ (Christofides algorithm), $O(n^2 \log_2 n)$ (greedy search), and $O(n^2)$ (insertion heuristic) [1] [2]. Let's see how many data-processing events could happen there depending on different assumptions, already granting that the cerebellum does nothing at all for intelligence, i.e. that all embodied-computing theories are false:
# Cortex-only numbers: neuron count, total synapses, synapses per neuron
cortex_neurons_min, cortex_neurons_max = 21e9, 26e9
cortex_synapses = 1.5e14
connectivity_min = cortex_synapses / cortex_neurons_min
connectivity_max = cortex_synapses / cortex_neurons_max
spike_rate_min = 0.1  # Hz
spike_rate_max = 2    # Hz
# one FLOP per synaptic event: neurons * spike rate * synapses per neuron
flop_opt_min = cortex_neurons_min * spike_rate_min * connectivity_min
print("Optimistic estimate for the FLOPS of the brain is: {:.2}".format(flop_opt_min))
Optimistic estimate for the FLOPS of the brain is: 1.5e+13
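As a hedged companion estimate: under the same one-FLOP-per-synaptic-event simplification, the neuron count cancels out (neurons × rate × synapses-per-neuron = synapses × rate), so the same variables also give an upper end of the range. Note that the 3.4e16 Kurzweil ballpark used later comes from a different model, not from these numbers.

```python
# Same simplification as above: one FLOP per synaptic event.
cortex_synapses = 1.5e14                 # total cortical synapses, as above
spike_rate_min, spike_rate_max = 0.1, 2  # Hz, as above

# neurons * rate * (synapses / neurons) == synapses * rate, so:
flop_min = cortex_synapses * spike_rate_min  # ~1.5e13, matches the estimate above
flop_max = cortex_synapses * spike_rate_max  # ~3e14
print("Model range: {:.2} to {:.2} FLOPS".format(flop_min, flop_max))
```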
from scipy.special import factorial

# TSP cost models: operations as a function of the number of cities n.
# These all work elementwise on NumPy arrays, so no np.vectorize is needed.
def naive(n):
    return factorial(n)

def held_karp(n):
    return n**2 * 2**n

def greedy(n):
    return n**2 * np.log2(n)

def christofides(n):
    return n**3

def insertion_heuristic(n):
    return n**2
n = np.arange(2, 1e3, 1)  # from 2 up to a thousand cities
c_naive=naive(n)
c_hk=held_karp(n)
c_christ=christofides(n)
c_greedy=greedy(n)
c_ins=insertion_heuristic(n)
plt.plot(n,c_naive,label="Naive")
plt.plot(n,c_hk,label="Held Karp")
plt.plot(n,c_greedy,label="Greedy")
plt.plot(n,c_ins,label="Insertion")
plt.plot(n,c_christ,label="Christofides")
plt.plot(n,np.ones_like(n)*1.5e13,label="Opt naive brain estimate")
plt.plot(n,np.ones_like(n)*3.4e16,label="Kurzweil brain estimate ballpark/tihue2")
plt.legend()
Okay, the plot doesn't show a lot. Normally you would use a log scale, but the scales involved overflow floating point and actually break plt.loglog... so yeah.
What we can see is that the naive algorithm explodes very early, the Held-Karp algorithm around n = 1000, and the rest is just mushed together near zero on a y-axis that reaches ~1e306. We can at least see the watershed between exact and approximate solutions.
So let's look at the exact algorithms first.
n = np.arange(2, 1e2, 1)  # from 2 up to a hundred cities
c_naive=naive(n)
c_hk=held_karp(n)
plt.plot(n,c_naive,label="Naive")
plt.plot(n,c_hk,label="Held Karp")
plt.plot(n,np.ones_like(n)*1.5e13,label="Opt naive brain estimate")
plt.plot(n,np.ones_like(n)*3.4e16,label="Kurzweil brain estimate ballpark/tihue2")
plt.yscale("log")
plt.legend()
So, in this very limited region, a log scale works again and we can actually see something: mainly that superpolynomial and factorial growth kill any gains in computation pretty quickly. The Kurzweil estimate is over 2000 times ours, but it only shifts us from ~33 to ~43 cities. Can the approximate solutions do better?
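Rather than reading these numbers off the plot, we can compute them directly. A small sanity check (my addition, same one-operation-per-FLOP assumption as before): the largest n for which the Held-Karp cost fits into each budget.

```python
def max_n_held_karp(budget):
    """Largest n with n**2 * 2**n <= budget (Held-Karp cost model)."""
    n = 2
    while (n + 1) ** 2 * 2 ** (n + 1) <= budget:
        n += 1
    return n

print(max_n_held_karp(1.5e13))  # 33 cities on the optimistic brain estimate
print(max_n_held_karp(3.4e16))  # 43 cities on the Kurzweil ballpark
```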
n = np.arange(2, 1e7, 1)  # from 2 up to 10 million cities
#c_naive=naive(n)
#c_hk=held_karp(n)
c_christ=christofides(n)
c_greedy=greedy(n)
c_ins=insertion_heuristic(n)
plt.plot(n,c_greedy,label="Greedy")
plt.plot(n,c_ins,label="Insertion")
plt.plot(n,c_christ,label="Christofides")
plt.plot(n,np.ones_like(n)*1.5e13,label="Opt naive brain estimate")
plt.yscale("log")
plt.xscale("log")
plt.legend()
Why yes they CAN! A lot better, even. (In the future you might find a link here to a post where I show how much better we can do when we solve things approximately, i.e. satisfice instead of solving things perfectly.)
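How much better, exactly? A back-of-envelope addition of mine, under the same one-operation-per-FLOP assumption: the largest n each approximate algorithm can afford within one second of the optimistic 1.5e13 FLOPS budget.

```python
import math

budget = 1.5e13                          # optimistic brain estimate, ops per second
n_insertion = int(budget ** 0.5)         # n**2 <= budget   -> millions of cities
n_christofides = int(budget ** (1 / 3))  # n**3 <= budget   -> tens of thousands

def max_n_greedy(budget):
    """Largest n with n**2 * log2(n) <= budget, via doubling plus bisection."""
    lo, hi = 2, 2
    while hi ** 2 * math.log2(hi) <= budget:
        hi *= 2
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if mid ** 2 * math.log2(mid) <= budget:
            lo = mid
        else:
            hi = mid
    return lo

print(n_insertion, max_n_greedy(budget), n_christofides)
```

So the simplest heuristic reaches millions of cities per second, the greedy search hundreds of thousands, and even Christofides, which comes with a quality guarantee, tens of thousands.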
As we can see, in our simplified version, a hypothetical superintelligence spending one second's worth of the 1.5e13 "FLOPS" we loosely attributed to the brain could solve a problem with millions of cities (the insertion heuristic needs n² ≈ 1.5e13 operations at n ≈ 3.9 million). Which is pretty good, actually. The trade-off? The solution is no longer the guaranteed optimum, though probably pretty close. So by becoming "more intelligent" in the sense of being able to handle more cities, the intelligence has also become fallible. In this case, because the algorithm is very specific, it is possible to bound the error, but this might not always be possible. And this brings me to my conjecture:
I.e., a variant of the no-free-lunch theorem shows up again, and we can have either
but not all at the same time
This is the part I struggled with writing fairly and without any accidental ad hominems, because this is where my aversion to prophets and futurism comes into play. Please don't hesitate to call me out if you think I am not making arguments in good faith or am being unfair to people.
So what do I want to say with this? Am I against AGI-risk research? NO. More research > less research. Always. But, in a sense, the main thing that is bothering me is that
In my opinion, some arguments used when talking about superintelligence now are, in the worst case, actively harmful to the stated goals they are trying to achieve (tackling the most important potential risks of the future).
At best, they are intellectually dissatisfying. They make wild assumptions and simplifications without (in my eyes) sufficient justification, creating an (in my opinion) artificial climate of urgency based on pseudo-quantitative arguments. As an example, take this quote from Wikipedia, referencing Bostrom:
Just as the fate of the mountain gorilla depends on human goodwill, so might the fate of humanity depend on the actions of a future machine superintelligence.
or this article, also linked on Wikipedia, which states:
Our intelligence is ultimately a mechanistic process that happens in the brain, but there is no reason to assume that human intelligence is the only possible form of intelligence. And while the brain is complex, this is partly an artifact of the blind, incremental progress that shaped it - natural selection. This suggests that developing machine intelligence may turn out to be a simpler task than reverse-engineering the entire brain. The brain sets an upper bound on the difficulty of building machine intelligence; work to date in the field of artificial intelligence sets a lower bound; and within that range, it’s highly uncertain exactly how difficult the problem is. We could be 15 years away from the conceptual breakthroughs required, or 50 years away, or more.
So the fact that the brain is an upper bound on the complexity of human intelligence is used to justify the belief that it might also place an upper bound on superhuman intelligence, while the complexity of the brain, which might be necessary for even our limited general intelligence, is handwaved away as an artifact of evolution.
Arguments like this hype up our imagination (which is good) with unrealistically optimistic, extrapolating and reductionist science-ish stories (which is bad). If you read Superintelligence, or any of a large subset of MIRI, FRI etc. publications, there is usually (at best) a handwavy presumption that
If you keep hammering those points, which are (in my eyes) not really based on anything substantial, a superintelligence becomes somewhat plausible.
This leads to 80,000 Hours rating AI risk higher than climate change and the... less consumer-friendly sides of capitalism, even though
And if you think I am being melodramatic here, might I remind you of China's social credit system, governments' predator drones, or this example of a robot being used to drive away homeless people, today, even?
I have an aversion to focusing too much on x-risks, for reasons I will explain in another post. However, to me the overwhelming risk of things that are already happening by default outweighs the risk of what might happen with a (probably tiny) unknown probability. The framework of
is quite popular in the EA/rationalist/cause-prioritization crowd (which happens to also have a higher-than-average number of people interested in AGI risk). In the case of AGI, I think the silent presumption of AGI power and impact leads to twisted reasoning, as in the case of the Open Philanthropy Project as well as Giving What We Can and 80,000 Hours. They all use this framework to classify political lobbying, technical innovation and direct carbon reduction as worthwhile goals with uncertain returns. This is motivated, amongst other things, by the fact that "a lot" of people are already working on climate change, political breakthroughs seem to have low probability, and any additional activism might have diminishing returns. Fair enough.
But AGI is deemed to be receiving little attention from researchers relative to the risk, with 80,000 Hours giving it a neglectedness score of 14 vs. climate change's 2, and an AGI "scale" of 15 vs. climate change's 14!
Given the point made about underestimating climate change, and that we are far away from a stable global commitment to working against it, this is preposterous to me.
Relative to the proven risk, not that much effort gets spent on climate change vs. AGI. And the small-probability/high-impact argument works for political solutions to climate change just as well as for AGI.
For organisations like 80,000 Hours, which have significant influence over people's decisions and purport to offer rational, evidence-based guidance, not taking a conservative position and at least ranking climate change higher than AGI until we know more seems a bit irresponsible. It is a bit like advising people to buy lottery tickets when they ask you how to invest.
And in fact, you can argue it is irresponsible no matter what bias you bring (alarmism vs. skepticism), because if one believes that AGI will actually become a problem, acting like it could happen really soon, year after year, without any evidence, will backfire after a while. Unless the worst-case scenario (for AGI-risk pessimists) happens and we have an unexpected runaway-AI incident in the next decade or so, people will move on. The next hype cycle will start and we will have to re-fight the battle for public interest, not unlike climate change had to do even with overwhelming evidence. So let's be better about the relative urgency and impact that things will have.
Alternatively, if we really want to focus on the black swans, then I think we should not separate technological AGI from what I have started to call "corporate intelligence". The basic idea was touched on in Meditations on Moloch, but taken much further in the 34C3 keynote "Dude, you broke the future". The idea is to treat capitalism/governance as an existing AGI: essentially a distributed reinforcement-learning algorithm currently optimising for "cash" or "growth" as a proxy for human welfare.
I will write another blog post to elaborate, but if and when I see this idea taken seriously as an incarnation of superintelligence, I will immediately cease my criticism. Because this stuff exists, and doesn't need any handwaving or extrapolating from past progress.
This part is purely emotional, so please, if I'm wrong, reassure me. I want to be wrong on this.
If someone still wants to do research on AGI, cool, as long as the arguments that lead them (and their funding) there are either personal or intellectually honest and rigorous (heck, I am working on AI instead of taking direct action against climate change, but then I also don't purport to dedicate my existence to making sure efforts are directed 100% efficiently). I believe AI, and the way it will influence our society, is cool and important.
But I also think AGI right now is kind of a red herring: a shiny project smart people can solve amongst themselves by being smart, instead of having to deal with activism. The artificially inflated risk of AGI and the bias towards the perfect AI overlord distort the discourse, give justification for pursuing our interests (which we shouldn't need) and distract from other, maybe slightly less cool but also important issues:
Those are the threats I see. So instead of acting like runaway AGI is on the horizon, let's start the research knowing that we might never need it, just because it is interesting. Let's stop using the "runaway AI might be closer than you think and will be devastating" argument for fundraising, though, and leave the doomsday money to stuff which might actually kill us in the next 50 years. And let's start including ANI risk in every text we write to raise awareness about AI risk, so we avoid THAT dystopia.
For comments, corrections or angry counter rants, please do reach out via email or mastodon.
# When does compute grow from 7.5e10 to 6e19 FLOPS at 1.6x per year?
# 7.5e10 * 1.6**x == 6e19  =>  x = log(6e19 / 7.5e10) / log(1.6)
x = np.log(6e19 / 7.5e10) / np.log(1.6)
x
43.61694465749319
Huh. ~44 years. Better hope we get to make efficient use of all that compute, then. At least I'll have a job until retirement age if I manage to contribute to making this possible :-)
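For what it's worth, that answer is quite sensitive to the assumed growth rate. A small sketch (the 7.5e10 and 6e19 endpoints are the ones from the cell above; the slower 1.4x/year rate is an alternative assumption of mine, not a forecast):

```python
import math

def years_to_close(start, target, annual_growth):
    """Years x such that start * annual_growth**x == target."""
    return math.log(target / start) / math.log(annual_growth)

for g in (1.6, 1.4):
    print("{}x/year: {:.1f} years".format(g, years_to_close(7.5e10, 6e19, g)))
# 1.6x/year gives ~44 years; 1.4x/year already pushes it past 60
```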