
Outline: I argue that interventions which affect the relative probabilities of humanity's long-term scenarios have much higher impact than all other interventions. I discuss some possible long-term scenarios and give a high-level classification of interventions.

Background

It is common knowledge in the Effective Altruism movement that different interventions often have vastly different marginal utility (per dollar or per some other unit of invested effort). Therefore, one of the most important challenges in maximizing impact is identifying interventions with marginal utility as high as possible. In the current post, I attack this challenge in the broadest possible scope: taking into account impact along the entire timeline of the future.

One of the first questions that arise in this problem is the relative importance of short-term versus long-term impact. A detailed analysis of this question is outside the scope of the current post. I have argued elsewhere that updateless decision theory and Tegmark's mathematical universe hypothesis imply a time discount falling much slower than exponentially and only slightly faster than [time since Big Bang]^(-1). This means that the timescale on which time discount becomes significant (at least about 14 billion years from today's standpoint) is much larger than the age of the human species, favoring interventions focused on the far long-term.
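
To make the shape of this claim concrete, here is one illustrative way of writing down such a discount function. This is my own sketch rather than part of the argument referenced above; the parameter ε is an assumed placeholder standing in for "slightly faster than [time since Big Bang]^(-1)".

```latex
% Illustrative sketch only (not taken from the referenced argument).
% t is time since the Big Bang; \epsilon > 0 is a small assumed parameter.
\[
  d(t) \;\propto\; t^{-(1+\epsilon)}, \qquad 0 < \epsilon \ll 1 .
\]
% Compare with an exponential discount d(t) \propto e^{-t/\tau}: under the power law,
% the total weight beyond a horizon T scales as T^{-\epsilon}, so times far beyond the
% current age of the universe (t_0 \approx 1.4 \times 10^{10} years) still carry
% non-negligible weight, whereas the exponential form suppresses them almost entirely.
```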

The most evident factor affecting humanity's welfare in the long term is scientific and technological progress. Progress has drastically transformed human society, increasing life expectancy, quality of life and the total human population. The industrial revolution in particular has created a world in which the majority of people in developed countries enjoy a lifestyle of incredible bounty and luxury compared to the centuries which came before. Progress continues to advance in enormous steps, with the total eradication of disease and death and the full automation of the labor required for a comfortable lifestyle being realistic prospects for the coming centuries.

It might appear reasonable to conclude that the focus of long-term interventions has to be advancing progress as fast as possible. Such a conclusion would be warranted if progress were entirely one-dimensional, or at least possessed only one possible asymptotic trajectory in the far future. However, this is almost certainly not the case. Instead, there are a number of conceivable asymptotic trajectories (henceforth called "future scenarios") with vastly different utility. Hence, interventions aiming to speed up progress appear much less valuable than interventions aiming to modify the relative probabilities of different scenarios. For example, it is very difficult to imagine even a lifetime of effort by the most suitably skilled person speeding up progress by more than 100 years. On the other hand, it is conceivable that a comparable effort could change scenario probabilities by 1%. The value of the former intervention can be roughly quantified as 10^2 late-humanity-years, whereas the value of the latter intervention is at least of the order of magnitude of (14 billion × 1% = 1.4 × 10^8) late-humanity-years.
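
The back-of-the-envelope arithmetic behind this comparison, under the assumptions stated above (a 100-year speed-up, a 1% shift in scenario probabilities, and a relevant timescale of 14 billion years), is simply:

```latex
% Rough comparison under the stated assumptions.
\[
  \text{speed-up value} \;\approx\; 10^{2}\ \text{late-humanity-years},
\]
\[
  \text{probability-shift value} \;\approx\; 1\% \times 1.4 \times 10^{10}\ \text{years}
  \;=\; 1.4 \times 10^{8}\ \text{late-humanity-years},
\]
% i.e. the probability-shifting intervention comes out roughly six orders of magnitude
% larger, before accounting for any difference in tractability between the two.
```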

Future Scenarios

A precise description of scenario space is probably impossible at the current level of knowledge. Indeed, improving our understanding of this space is one type of intervention I will discuss below. In this post I don't even pretend to give a full classification of the scenarios that are possible as far as we can know today. Instead, I only list the examples that currently seem to me to be the most important, in order to give some idea of what scenario space might look like.

Some of the scenarios I discuss cannot coexist as real physical possibilities since they rely on mutually contradictory assumptions about the feasibility of artificially creating and/or manipulating intelligence. Nevertheless, as I see it, they all seem to be valid possibilities given our current state of knowledge (other people are more confident than I am regarding the aforementioned assumptions). Also, there seems to be no set of reasonable assumptions under which only one scenario is physically possible.

I call "dystopia" those scenarios in which I'm not sure I would want to wake up from cryonic suspension and "utopia" the other scenarios (the future is likely to be so different from the present that it will appear to be either horrible or amazing in comparison). This distinction is not of fundamental importance: instead, our decisions should be guided by the relative value of different scenarios. Also, some scenarios contain residual free parameters (scenario space moduli, so to speak) which affect their relative value with respect to other scenarios.

Dystopian Scenarios

Total Extinction

No intelligent entities remain which are descended from humanity in any sense. Possible causes include global thermonuclear war, a bioengineered pandemic, uncontrollable nanotechnology and natural disasters such as an asteroid impact. The last cause, however, seems unlikely since the frequency of such events is low and defenses will probably be ready long before one occurs.

Unfriendly Artificial Intelligence

According to a hypothesis known as "AI foom", self-improving artificial intelligence will reach a critical point in its development (somewhere below human intelligence) at which its intelligence growth will become so rapid that it quickly crosses into superintelligence and becomes smarter than all other coexisting intelligent entities put together. The fate of the future will thus hinge on the goal system programmed into this "singleton". If the goal system was not designed with safety in mind (a highly non-trivial challenge known as friendly AI), the resulting AI is likely to wipe out the human race. The AI itself is likely to proceed with colonizing the universe, creating a future possibly more valuable than inanimate nature but still highly dystopian[1].

Superdictatorship

A single person or a small group of people gains absolute power over the rest of humanity and proceeds to abuse this power. This may come about in a number of ways, for example: 

  • Dictators enhance their own intelligence. The risks of this process may lead to extremely immoral posthumans even if the initial persons were moral.
  • Creation of superintelligences that are completely loyal to the dictators. These superintelligences can be AI or enhanced humans. This scenario requires the malevolent group to solve the "friendliness" problem (maintaining a stable goal system through a process of extreme intelligence growth).
  • Use of nanotechnology to forcibly upload the rest of humanity into a computer simulation where they are at the mercy of the dictators.
  • Some sort of technology for complete mind control, e.g. involving genetically reprogramming humanity using a retrovirus.

The risk of these scenarios is elevated by concentration of resources and technological capacity in the hands of authoritarian governments.

Unhumanity

A large number of people undergo a sequence of mind modifications that make them more intelligent and economically competitive but cause them to lose important human qualities (e.g. love, compassion, curiosity, humor). The gradual nature of the process gives it an unalarming appearance, since the participants consider only the next step at any given moment rather than the ultimate result. The resulting posthumans use their superior intelligence and economic power to wipe out the remaining unmodified or weakly modified people. The value of this scenario can be as low as that of the UFAI scenario or somewhat higher, depending on the specifics of the mind modifications.

Utopian Scenarios

Friendly Artificial Intelligence

The AI foom singleton is imbued with a goal system very close to human values, possibly along the lines of Coherent Extrapolated Volition or the values of the specific person or group of persons from whose point of view we examine the desirability of scenarios. This is probably the most utopian scenario since it involves an immensely powerful superintelligence working towards creating the best universe possible. It is difficult to know the details of the resulting future (although there have been some speculations) but it is guaranteed to be highly valuable.

Emulation Republic

All people exist as whole brain emulations or modified versions thereof. Each star system has a single government based on some form of popular sovereignty.

Non-consensual physical violence doesn't exist, since it is impossible to invade someone's virtual space without her permission and shared virtual spaces follow guaranteed rules of interaction. A fully automated infrastructure in which everyone is a shareholder allows people to live comfortably without the necessity of labor. Disease is irrelevant and immortality is a given (in the sense of extremely long life; the heat death of the universe might still pose a problem). People choose their own pseudo-physical form in virtual spaces, so physical appearance is not a factor in social rank, gender assignment at birth causes few problems and racism in the modern sense is a non-issue.

People are free to create whatever virtual spaces they want within the (weak) resource constraints, as long as these spaces don't contain morally significant entities. Brain emulations without full citizen status are forbidden, except for allowances made for raising children. Cloning oneself is allowed, but creating children is subject to regulation for the child's benefit.

Access to the physical layer of reality is strictly regulated. It is allowed only for pragmatic reasons, such as scientific research with the goal of extending the civilization's lifespan even further. All requests for access are reviewed by many people, only the minimal necessary access is approved, and the process is monitored in real time. By these means, the threat of malevolent groups breaking the system through the physical layer is neutralized.

Superrational Posthumanity

Human intelligence is modified to be much more superrational. This effectively solves all coordination problems, removing the need for government as we understand it today. This scenario assumes that strong modification of human intelligence is feasible, which is a weaker assumption than the ability to create de novo AI but a stronger one than the ability to create whole brain emulations.

Other Scenarios

There are scenarios which are difficult to classify as "dystopian" or "utopian" due to the strong effect of certain parameters and the different imaginable "cryonic wake-up" situations. Such scenarios can be constructed by mixing dystopian and utopian scenarios. They include scenarios with several classes of people (e.g. free citizens and slaves, the latter existing as emulations for the entertainment of their masters) and scenarios with several "species" of people (people with differently modified minds).

Intervention Types

I distinguish between four types of interventions with long-term impact. The types of intervention available in practice depend on the point at which you are located on the progress timeline, with type I interventions available virtually always and type IV interventions available only close to a progress branching point. I give some examples of existing programmes within these categories, but the list is far from exhaustive. In fact, I would be glad if readers suggested more examples and discussed their relative marginal utility.

Type I: Sociocultural Intervention

These are interventions that aim to "raise the sanity waterline" (improve the average rationality of mankind, with higher weight on more influential people) and/or improve the morality of human cultures. The latter is to be regarded from the point of view of the person or group doing the intervention. These interventions don't assume a specific model of long-term scenarios, instead striving to maximize the chance that humanity chooses the right path when it reaches the crossroads.

Examples of type I interventions include CFAR and the EA movement. Other examples might include educational programmes, atheist movements and human rights movements.

Type II: Futures Research

These are interventions that aim to improve our understanding of the possible future scenarios, their relative value and the factors influencing their relative probability. They assume the current state of progress is sufficiently advanced to make discussion of future scenarios relevant. For example, in 1915 nobody would have been able to envision whole brain emulation, AI or nanotechnology.

Examples include FHI, CSER, FLI and GCRI.

Type III: Selective Progress

These are interventions that try to accelerate progress in selected areas with the aim of increasing probability of desirable scenarios. They assume the current understanding of future scenarios is sufficiently advanced to know the dependence of the relative probabilities of future scenarios on progress in different areas.

One example is MIRI, which tries to accelerate progress in AGI safety relative to progress in AGI in general. Other possible examples would be research programmes studying defense against bioengineered pandemics or nanoreplicators.

Type IV: Scenario Execution

These are interventions that aim at the direct realization of a specific scenario. They assume the relevant technology already exists.

As far as I know, such interventions are still impossible today. Theoretical examples include an FAI construction project or a defense system against bioengineered pandemics.

Summary

Long-term thinking leads to somewhat counterintuitive conclusions regarding the most effective interventions. Interventions aiming to promote scientific and technological progress are not necessarily beneficial and can even be harmful. Effective interventions are focused on changing culture, improving our understanding of the future and accelerating progress in highly selected areas.

Many questions remain, for example:

  • What is the importance of cultural interventions in first world versus third world countries?
  • Progress in which areas is beneficial / harmful, to the extent of our current ability to predict?
  • What are the relative marginal utilities of existing programmes in the 4 categories above?

 

[1] There is a possibility that the UFAI will bargain acausally with a FAI in a different Everett branch, resulting in a Utopia. However, there is still an enormous incentive to increase the probability of the FAI scenario with respect to the UFAI scenario.

Comments

Great thought provoking post, which raises many questions.

My main concern is perhaps due to the limitations of my personal psychology: I cannot help but heavily prioritize present suffering over future suffering. I have heard many arguments why this is wrong, and use very similar arguments when faced with those who claim that "charity begins at home". Nevertheless, the compassion I have for people and animals in great suffering overrides my fear of a dystopian future. Rational risk / reward assessments leave me unconvinced (oh, why am I not a superintelligent droid). Your post does offer me some comfort, despite my (possible) limitation. Cultivating generosity and compassion within me, and within my society, could be classified as "cultural change" and so might be a highly effective intervention. However, then the question becomes whether the most effective ways to achieve this "cultural change" have anything to do with helping those in dire need today. Many attest that meditation and prayer improve their ability to be kind and loving, and I am one of those who are skeptical as to the effects of that on the life expectancy of infants in Africa.

My second concern is that you may be putting too much emphasis on the "human race". In the long-run, why is it bad if our race is superseded by more advanced life forms? Some of your scenarios do envision a human existence that can arguably be classified as "the next evolutionary step" (i.e. whole brain emulations), but their lives and interests still seem closely aligned to those of human beings. Significantly, if the transition from the current world to "Friendly Artificial Intelligence" or to "Unfriendly Artificial Intelligence" involves an equal amount of suffering, the end result seems equally good to me. After all, who is to say that our AI God doesn't wipe out the human race to make room for a universe full of sentient beings that are thousands of times more well off than we could ever be?

Hi Uri, thanks for the thoughtful reply!

It is not necessarily bad for future sentients to be different. However, it is bad for them to be devoid of properties that make humans morally valuable (love, friendship, compassion, humor, curiosity, appreciation of beauty...). The only definition of "good" that makes sense to me is "things I want to happen" and I definitely don't want a universe empty of love. A random UFAI is likely to have none of the above properties.

For the sake of argument I will start with your definition of good and add that what I want to happen is for all sentient beings to be free from suffering, or for all sentient beings to be happy (personally I don't see a distinction between these two propositions, but that is a topic for another discussion).

Being general in this way allows me to let go of my attachment to specific human qualities I think are valuable. Considering how different most people's values are from my own, and how different my needs are from Julie's (my canine companion), I think our rationality and imagination are too limited for us to know what will be good for more evolved beings in the far future.

A slightly better, though still far from complete, definition of "good" (in my opinion) would run along the lines of: "what is happening is what those beings it is happening to want to happen". A future world may be one that is completely devoid of all human value and still be better (morally and in many other ways) than the current world. At least better for the beings living in it. In this way even happiness, or lack of suffering, can be tossed aside as mere human endeavors. John Stuart Mill famously wrote:

"It is better to be a human being dissatisfied than a pig satisfied; better to be Socrates dissatisfied than a fool satisfied. And if the fool, or the pig, is of a different opinion, it is only because they only know their own side of the question."

And compared with the Super-Droids of tomorrow, we are the pigs...

If your only requirement is for all sentient beings to be happy, you should be satisfied with a universe completely devoid of sentient beings. However, I suspect you wouldn't be (?)

Regarding the definition of good: it's pointless to argue about definitions. We should only make sure both of us know what each word we use means. So, let's define "koodness(X)" to mean "the extent to which things X wants to happen actually happen" and "gudness" to mean "the extent to which what is happening to all beings is what they want to happen" (although the latter notion requires clarification: how do we average between the beings? do we take non-existing beings into account? how do we define "happening to X"?).
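
Purely as an illustration of where those open questions enter, one could formalize "gudness" roughly as follows; the set B, the weights w_b and the satisfaction measure sat(b) are my own assumed placeholders, not part of the definition above.

```latex
% Illustrative formalization only; B, w_b and sat(b) are assumed placeholders.
\[
  \text{gudness} \;=\; \sum_{b \in B} w_b \,\mathrm{sat}(b),
  \qquad \sum_{b \in B} w_b = 1, \quad \mathrm{sat}(b) \in [0,1],
\]
% where B is the set of beings counted (existing beings only, or possible beings too?),
% w_b is the weight given to being b (the averaging question), and sat(b) measures the
% extent to which what is happening to b is what b wants to happen.
```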

So, by definition of kood, I want the future world to be kood(Squark). I also want the future world to be gud among other things (that is, gudness is a component of koodness(Squark)).

I disagree with Mill. It is probably better for a human being not to become a pig, in the sense that a human being prefers not becoming a pig. However, I'm not at all convinced a pig prefers to become a human being. Certainly, I wouldn't want to become a "Super-Droid" if it comes at the cost of losing my essential human qualities.

Thanks, this is a really important topic and it's a nice overview. Exploring classifications of the kinds of intervention available to us is great.

The two pieces I know of which are closest to this are Nick Bostrom's paper Existential Risk Prevention as Global Priority, and Toby Ord's article The timing of labour aimed at existential risk. Ord's "course setting" is a slightly broader bucket than your "futures research". I wonder if it's a more natural one? If not, where would you put the other course-setting activities (which could include either of those pieces, your article, or my paper on allocating risk mitigation for risks at different times)?

I think that the relative value of the different types of intervention changes according to when the risk is coming: the longer we have before 'crunch time', the better the start of the list looks, and the worse the end looks. This is complicated by uncertainty over how long we may have.

Thx for the feedback and the references!

I think Ord's "course setting" is very close to my type II. The activities you mentioned belong to type II inasmuch as they consider specific scenarios, or to type I inasmuch as they raise general awareness of the subject.

Regarding relative value vs. time: I absolutely agree! This is part of the point I was trying to make.

Btw, I was somewhat surprised by Ord's assessment of the value of current type III interventions in AI. I have a very different view. In particular, the 25-35 year time window he mentions strikes me as very short due to what Ord calls "serial depth effects". He mentions examples from the business literature on the time scale of several years, but I think that the time scale for this type of research is larger by orders of magnitude. AI safety research seems to me similar to fundamental research in science and mathematics: driven mostly by a small pool of extremely skilled individuals, with a lot of dependent steps, and thus very difficult to scale up.

I agree that AI safety has some similarities to those fields, but:

  • I guess you may be overestimating the effect of serial depth in those fields. While there is quite a lot of material that builds on other material, there are a lot of different directions that get pushed on simultaneously, too.
  • AI safety as a field is currently tiny. It could absorb many more (extremely skilled) researchers before they started seriously treading on each others' toes by researching the same stuff at the same time.

I think some type III interventions are valuable now, but mostly for their instrumental effects in helping type I and type II, or for helping with scenarios where AI comes surprisingly soon.

I think the distance between our current understanding of AI safety and the required one is of a similar order of magnitude to the distance between the invention of the Dirac sea in 1930 and the discovery of asymptotic freedom in non-Abelian gauge theory in 1973. That is 43 years of well-funded research by the top minds of mankind. And that is without taking into account the engineering part of the project.

If the remaining time frame for solving FAI is 25 years, then:

  1. We're probably screwed anyway
  2. We need to invest all possible effort into FAI, since the tail of the probability distribution is probably fast-falling

On the other hand, my personal estimate regarding time to human level AI is about 80 years. This is still not that long.

Could you say something about why your subjective probability distribution for the difficulty is so tight? I think it is very hard to predict in advance how difficult these problems are; witness the distribution of solution times for Hilbert's problems.

Even if you're right, I think that says we should try to quickly get to the point of having a serious large programme. It's not clear that the route to that means focusing on direct work at the margin now. It will involve some, but mostly because of the instrumental benefits in helping increase the growth of people working on it, and because it's hard to scale up overnight later.

My distribution isn't tight, I'm just saying there is a significant probability of large serial depth. You are right that much of the benefit of current work is "instrumental": interesting results will convince other people to join the effort.

Right now my guess is that a combination of course-setting (Type II?) and some relatively targeted sociocultural intervention (things like movement growth -- perhaps this is better classed with course-setting) are the best activities. But I think Type I and Type III are both at least plausible.

Thanks for the post! It's key that we're talking about the far future and developing this debate in a way that is easier to access for the skeptical majority ;-)

I fear that the 'utopian scenarios' are just so far from what most people would consider places with moral value. I fear that we aren't making the effort to properly understand people's value objections to these scenarios, and are writing them off as irrational commitment to what people are familiar with, when actually there are some common-sense feelings about moral value that are worth digging into more deeply.

I'm new to these types of online discussions and would bet that some of you know where these fears / common sense moral objections to the types of scenario under your 'utopia' heading are addressed - can you point me in the right direction please?

Side question - there are many more ways super-dictatorships could arise with existing technology. Although it might not be absolute power, it's probably an equilibrium that's hard to recover from (although a unified global state would be interesting in terms of managing wealth inequality and global movement etc.). Do you restrict yourself to looking at absolute power because then we're pretty sure it won't change again?

Hi Tom, thx for commenting!

For me, the meta-point that we should focus on steering into better scenarios was a more important goal of the post than explaining the actual scenarios. The latter serve more as examples / food for thought.

Regarding objections to Utopian scenarios, I can try to address them if you state the objections you have in mind. :)

Regarding dictatorships, I indeed focused on situations that are long-term stable since I'm discussing long-term scenarios. A global dictatorship with existing technology might be possible but I find it hard to believe it can survive for more than a couple of thousand years.

Good points. Thanks. I was actually looking for objections rather than having them. Will illustrate my personal responses if I get time.