
Summary

The Long Reflection is a period in which humanity tries to minimize the number of irreversible actions and focuses on moral reflection and working out a macrostrategy for its future. 

This post discusses the concept in more detail than has been done before and collects arguments for and against the Long Reflection.

Claims

  • beyond a time for philosophy, the Long Reflection is also a period in which we need to negotiate a framework for space expansion that minimizes its risks
  • there is a fundamental paradox at the heart of the Long Reflection: in order to preserve option-value for the future, humanity might have to take measures that themselves risk value lock-in 
  • one of the central challenges will be how to reconcile an atmosphere of open debate and open-mindedness with the enforcement mechanisms that make sure no one defects
  • the Long Reflection is not likely to happen but it might nevertheless be valuable to investigate it further

 

Introduction

For a few years now, the most prominent answer to ‘What if we successfully avoid existential risk?’ has been Will MacAskill’s concept of the Long Reflection, which is also restated in Toby Ord’s book The Precipice.[4,5]

The concept has been discussed in several places, but it has not yet been investigated in depth. That is the role this post seeks to fulfil.


Structure of the Long Reflection

Definition

We shall define the Long Reflection here as a period in which humanity tries to minimize the number of irreversible actions and focuses on moral reflection and working out a macrostrategy for its future.

Having reached the state of existential security, humanity will almost certainly be in possession of Transformative Artificial Intelligence (TAI), since it’s hard to imagine reaching that state if TAI is still in the future. This means that solving empirical and scientific questions will be comparatively easy. We can expect that the key obstacle to a coherent macrostrategy will be the negotiation between different interest groups.

When to start

The right point to start the Long Reflection is when humanity has reached existential security. Here we define existential security as the state in which existential risk across all time is very low. 

Given that the universe will probably be habitable for a very long time, this might necessitate reducing existential risk to an extremely low level. If we conservatively assume that humanity could survive only until the end of the last stars in approximately 10^14 years, this would require us to reduce existential risk to 10^(-17) annually to reach a total risk of 0.1%.[1]
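To make the arithmetic explicit, here is a minimal sketch of the calculation, assuming a constant annual risk and independence between years; the numbers are the ones used above, not new estimates.

```python
import math

# Minimal sketch of the arithmetic above (assumed model: constant annual risk,
# independence between years; the figures come from the text, not new estimates).
annual_risk = 1e-17      # target annual existential risk
horizon_years = 1e14     # roughly until the last stars burn out [1]

# Cumulative risk 1 - (1 - r)^T, computed in a numerically stable way.
total_risk = -math.expm1(horizon_years * math.log1p(-annual_risk))
print(f"{total_risk:.4%}")          # ~0.0999%, matching the 0.1% figure above

# For risks this small, the linear approximation r * T is essentially identical.
print(annual_risk * horizon_years)  # 0.001
```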

A more realistic goal may be to get to a state in which we can be sure existential risk will decline with time in the future and total existential risk will converge to a finite value. This way we can establish an upper bound for total existential risk going forward.
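As an illustration of how a declining risk bounds total risk, here is a toy model in which the annual risk falls by a constant factor each year; the starting risk and decay rate are hypothetical round numbers, not predictions.

```python
# Toy model (illustrative only): annual existential risk declines geometrically.
initial_risk = 1e-4   # hypothetical annual risk at the start of the period
decay = 0.99          # hypothetical: each year's risk is 99% of the previous year's

# By the union bound, total risk over all time is at most the sum of annual risks,
# which for a geometric decline converges to initial_risk / (1 - decay).
total_risk_bound = initial_risk / (1 - decay)
print(total_risk_bound)  # 0.01, i.e. total risk bounded by about 1% in this example
```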

An open question is how high the existential risks stemming from unknown unknowns are. It is of course very hard to give examples of risks from unknown unknowns, but an interesting analogy is how the risk of power outages caused by solar flares would look to someone from 1500 who doesn’t understand the concept of electricity. Another place where risks from unknown unknowns might be situated is in the realm of complex risks stemming from interactions of many separate agents.

The best option may be to first minimize known existential risks and then start a process to explore fundamental science and social science in a way that can identify risks from unknown unknowns. Humanity would likely have to conduct this process while taking appropriate care to handle the information hazards that might be encountered.

Another question is how confident we are in our estimate. As estimating total existential risk - especially going forward - will be a challenging task even with advanced AI, there should be continuing investigation of the issue with plans to quickly refocus on x-risk reduction if it is deemed necessary again.


Hard to reverse changes

There are some changes that seem to be truly irreversible, but there is a much broader category of changes that seem reversible in theory, but only at significant cost.

One clear source of irreversibility is the laws of physics. Catastrophes stemming from their accidental or deliberate manipulation should already be accounted for by our mechanism for ensuring a continued state of existential security.

The most relevant potentially irreversible changes stemming directly from the laws of physics are those associated with the cosmic speed limit set by the speed of light, which is contained in the theory of special relativity and has been confirmed experimentally ever since its discovery.

Due to the cosmic speed limit combined with the expansion of the universe, it is possible to send objects at sufficient speed to make them unreachable from Earth after a certain period of time.[2] Using self-replicating probes, this could give certain actors access to large amounts of matter, space and energy without any possibility of influence by other human-descended actors.

Another thing that is truly irreversible is active Messaging Extraterrestrial Intelligence (METI). This threatens to take away option value in potential future encounters if previous messages have already revealed crucial information or entrenched a certain view of humanity. 

Beyond that, powerful agentic AGIs might also be a source of irreversibility. Whether deployed by choice or by accident, we must assume that an incorrigible superintelligence would be able to maintain its internal goals and power against all future attempts to modify them. It is possible to imagine scenarios in which humanity might want to choose this path despite its risks, for example if AI sovereigns prove necessary to ensure the absence of military confrontation after space expansion, but this is a choice that should only be made after the Long Reflection.

Another place we might expect irreversibility is consciousness, although this doesn’t rise to the same level of physical impossibility as the physics-based examples above. It’s likely that at a certain level of technological progress humanity will have both the ability and the desire to directly modify its consciousness. There may be certain places in mind space that are powerful attractor states that actors will find impossible to leave. This is related to concerns about wireheading in the literature.[8] In general this area is more speculative, as the laws of consciousness are barely known at all compared to our current knowledge of physics.

The danger of self-replicating probes is that even a small object escaping the solar system could harness the matter and energy of a star system or galaxy it reaches at an exponential rate. If the target system then optimizes for defense, the actor that initiated the process might be very hard to dislodge. Wars utilizing self-replicating probes on such a scale run the risk of converting a significant amount of the universe into more war machines.[6] If this process is initiated by a malevolent actor, this could result in significant amounts of suffering.
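As a rough illustration of the exponential dynamic described above, the following toy calculation assumes a hypothetical replication time of 100 years and a Milky-Way-scale number of star systems; both numbers are placeholders, not estimates.

```python
# Toy illustration of exponential probe replication (all numbers hypothetical).
replication_time_years = 100        # assumed time for one probe to build a copy
star_systems_in_galaxy = 1e11       # order of magnitude for a Milky-Way-like galaxy

years, probes = 0, 1.0
while probes < star_systems_in_galaxy:
    probes *= 2                     # each cycle doubles the number of probes
    years += replication_time_years

# Roughly 37 doublings suffice, i.e. a few thousand years of replication time
# (interstellar travel time would add substantially to this).
print(years, int(probes))
```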

Goals

Figure out what is of value: If moral realism is true, this may have a single correct answer that we could hope to confirm with a reasonable level of confidence. If morality turns out to be more subjective, we could still imagine humanity gradually converging on a narrower and narrower set of ethical beliefs and reaching a reflective equilibrium.

Negotiate a framework to minimize the risks of space expansion: We are currently unsure how high the risks of space expansion and other wide-ranging, irreversible changes are. An obvious goal for the Long Reflection would be to answer these questions and design a framework to minimize these risks going forward.

Learn as much as needed about basic science and technological possibilities: To figure out its moral priorities and design a framework for its future, humanity will probably need an almost complete understanding of basic science and all the technological possibilities.

Important Questions

In this section I want to give a few examples of the questions that we might want to resolve during the Long Reflection. This list is obviously not comprehensive, and some of the questions may turn out to resolve themselves even before the period begins.

Philosophical Questions

  • What things are of intrinsic value? Is value in the universe bounded or unbounded?
  • What are our moral obligations? Do we have an obligation to maximize value in the universe?
  • If digital consciousness is possible, is there a reason to not prefer this to maintaining human consciousness in a biological substrate?
  • What does the space of possible minds and consciousness look like and which points are most preferable?
  • If coordination on crucial issues is not possible, when and by whom should violence be used?
  • What are the ethics of interacting with Extraterrestrial Intelligence?

Political Questions

  • What framework can the relevant actors agree on for
    • space expansion and use of the cosmic commons
    • self-replicating probes
    • control of digital consciousness
    • potentially dangerous physics experiments
  • What are desirable cosmic projects and how can we best coordinate to achieve them?
  • What should our response to contact with Extraterrestrial Intelligence look like?


When to stop reflecting

This is a hard point to establish in advance because future knowledge is hard to predict and we don’t know what further problems we will encounter as we learn more. 

The clearest endpoint is when humanity has discovered a certain moral view or a set of compatible moral views and has attained a high level of confidence in their truth and in how to implement them. This may be because moral realism is true and we are confident we have figured out all knowable moral facts. Another option is that, while not ontologically “true”, certain views are so intuitive to most humans after a lot of reflection that there is very strong convergence.

If there is no set of compatible views that we converge on, another natural endpoint is when we reach a reflective equilibrium, meaning individuals or coalitions do not converge on a single moral view, but the different views are compatible. The most obvious next step then is to negotiate a common framework in which different conceptions of the good can be pursued. It is likely that towards the end of the Long Reflection humanity will have a better idea of how this could be achieved, but in case there is no agreement, there should be a fallback option that all agents have agreed to in advance.

It is of course possible (perhaps even the most likely outcome) that we neither converge on a clear moral truth nor reach reflective equilibrium. This outcome is the one we should prepare for most. Given that for certain value systems delaying space expansion has a very high cost, we should ensure in advance that the Long Reflection is temporary.

This could of course be done by simply setting a certain time limit in advance, but it seems almost impossible to foresee how long a reflection period is actually appropriate. A superior option could be to supplement this with a metric that measures how much people are updating on new information and the arguments of others: as long as people are still updating, it makes sense to continue the reflection period. Another measure could be the amount and type of expected future technological progress. We should also make sure some of the key questions (e.g. regarding cosmology and consciousness) are answered.

If the reflection has to end without convergence or agreement, we could agree in advance to compromise within a certain framework. This may either mean dividing the resources between groups according to a certain mechanism or giving control to the coalition that comes out as the winner of a previously agreed voting system.


Who will oppose the Long Reflection?

It is likely that some actors or groups will oppose a period of reflection. This may include egoistic actors that only care about their own preferences, or fanatical actors that assign a very small probability to changing their values upon reflection. Some actors may also aim to violently enforce their values, even against the wishes of most others.

There will also likely be opportunistic actors that try to defect before or during the reflection period in the hope of gaining access to more resources than they could obtain in direct competition or after the reflection period. Others may defect at the point where they realize that the process isn’t going the way they had hoped.

Some people may also assign intrinsic value to war and competition and believe that it’s better to decide the future of the universe in that way rather than by reflection and cooperation. 

 

Coordinating the Long Reflection

Given the high probability of opposition and defection, one of the central challenges will be how to reconcile an atmosphere of open debate and open-mindedness with the enforcement mechanisms that make sure no one defects.

An intensive exploration of fundamental science might also bring to light knowledge that could constitute an existential risk if it were to get into the wrong hands. Here we encounter another trade-off between control and freedom. 

Actors that believe we will err too far towards enforcement are likely to defect, while actors that think we will lean too far towards freedom may seek to unilaterally disempower possibly harmful groups.


Arguments for the Long Reflection

Some argue that our moral values have improved significantly over the past centuries and that we have reason to expect that trend to continue. Many actions we take today may therefore later be viewed as morally wrong or even catastrophic. This becomes more important the more impactful and harder to reverse our actions get, so it is necessary to engage in a long period of moral reflection if one wants to minimize that risk.

But even if we expand beyond Earth in a manner that is reversible at a limited cost and start to modify our biology on a fundamental level, we may lose the capacity for humanity-wide reflection and bargaining. If a lot of cooperation is necessary to avoid bad outcomes of space expansion, it will be important to secure that cooperation before it is too late.

From this angle we can add a practical argument for a Long Reflection: space expansion on a large scale entails risks that could be avoided if we deliberately steer the process in an advantageous direction. But space expansion might no longer be steerable beyond a certain point due to the long distances involved. So we want to make sure we have reflected on the issue sufficiently and ideally have reached some sort of technological maturity, so we know against which technical background we should design our space governance framework.

Surveying s-risks and the risks from bad space governance, it seems like the risks of not engaging in a Long Reflection are significantly bigger than the risks of doing so.
 

Arguments against the Long Reflection

By blocking all changes that might be irreversible or very hard to reverse, we might severely limit the fundamental goals of some individuals. Stopping such changes would potentially require a very intrusive ‘world government’ that might not be desirable and might itself carry very large risks.[3] The higher we believe the incentives for defection to be, the more severe the enforcement mechanisms have to be and the higher the risks.

The society engaging in a Long Reflection might have to adopt significantly different values, and it may come to see this new state as ideal. If a strong majority is required to leave it, the society may remain in that state forever, which for certain value systems constitutes a huge risk.[3]

Others argue that it might be unethical to require individuals to live during the Long Reflection. Even if it makes the total future better, it might mean people during the Long Reflection are not allowed to do certain things, for instance modify their consciousness in certain ways.[7]

Perhaps the simplest argument against the Long Reflection is that it has an opportunity cost. If you think that the most likely outcome after existential security is reached is not very far from the ideal outcome reached after a longer period of reflection, holding off on important actions may simply not be worth the energy and matter that is lost during the period.


Is the Long Reflection likely to happen?

The Long Reflection is a coordination problem on a level that has not been seen in any other situation. As humanity is currently struggling with coordination problems even on a much smaller scale, our prior should probably be that something like the Long Reflection is unlikely to happen.

However, there are reasons to think that the probability is not extremely low. Human history until now has been an uneven but steady process of increasing coordination. Compared to 1000 years ago, modern states allow for cooperation on a level that is extremely impressive. If this process continues, the level of global cooperation required for something like the Long Reflection is at least in the realm of possibility.

The world on the verge of humanity reaching existential security would also be radically different from what we have today. Since reaching existential security will very likely entail the possession of TAI that is aligned with human preferences, we might expect that this time will also mark the beginning of an era of material abundance. Such a world has been described as “Paretotopia”. All agents suddenly have a lot more to gain, since there are so many resources to divide up.

While those arguments don’t strike me as strong enough to establish that the Long Reflection is a likely outcome, they do seem strong enough not to dismiss the possibility of the scenario completely.


Why it’s worth thinking about this now

It may seem to some like the ability to make irreversible changes to the universe is a long way off, but there is a significant chance that this will become possible in the next 100 years.

In a recent survey of ML researchers conducted by AI Impacts, the average result was a 50% chance of Human-Level AI in 37 years. Additionally, "The median respondent thinks there is an “about even chance” that an argument given for an intelligence explosion is broadly correct." The forecasting website Metaculus predicts a median occurrence of Artificial General Intelligence in the year 2034. This leads to the conclusion that a significant speedup of technological progress in the next few decades is a realistic possibility, which would allow for some of the irreversible changes listed in the relevant section above.

If we want policymakers to be sensitive to the issues addressed here, we might want to start advocacy and research in this field as early as possible. Making a well-thought-out case for the Long Reflection or similar proposals can help contribute to increased awareness of this kind of macrostrategic reasoning. The Long Reflection is a good example of a concrete proposal in this field that can be argued over, which is likely to contribute to the field’s growth.

 

Conclusion

There is a fundamental paradox at the heart of the Long Reflection: in order to preserve option-value for the future, humanity might have to take measures that themselves risk value lock-in. This means the whole concept is fraught with risk.

However, uncontrolled space expansion itself comes with risks, and it seems likely that something like a Long Reflection could help humanity go about space expansion in a coordinated and controlled way.

 

Bibliography

[1] Fred Adams and Greg Laughlin. 2000. The five ages of the universe: inside the physics of eternity. Touchstone, London.

[2] Stuart Armstrong, Anders Sandberg, and Seán ÓhÉigeartaigh. 2015. Outrunning the Law: Extraterrestrial Liberty and Universal Colonisation. In The Meaning of Liberty Beyond Earth, Charles S. Cockell (ed.). Springer International Publishing, Cham, 165–186. DOI: https://doi.org/10.1007/978-3-319-09567-7_11

[3] Robin Hanson. 2021. ‘Long Reflection’ Is Crazy Bad Idea. Overcoming Bias. Retrieved August 24, 2022 from https://www.overcomingbias.com/2021/10/long-reflection-is-crazy-bad-idea.html

[4] William MacAskill. 2022. What We Owe the Future: The Million-Year View. Oneworld Publications.

[5] Toby Ord. 2020. The Precipice: Existential Risk and the Future of Humanity. Bloomsbury Academic, London/New York.

[6] Anders Sandberg and Stuart Armstrong. 2013. Hunters in the Dark: Game Theory Analysis of the Deadly Probes Scenario. Poster presented at the National Astronomy Meeting of the Royal Astronomical Society (NAM2013).

[7] Felix Stocker. 2021. Reflecting on the Long Reflection. felixstocker.com. Retrieved August 24, 2022 from https://www.felixstocker.com/blog/reflecting-on-the-long-reflection

[8] Alexey Turchin. 2018. Wireheading as a Possible Contributor to Civilizational Decline.
 


This is one of the pieces written as part of the CHERI Summer Research Program 2022. I whole-heartedly thank the organisers for giving me this opportunity and Joseph Levin for being my mentor. Joseph was especially helpful in working on this piece, supplying a draft of his own.

A significant proportion of work during that time went into the Space Futures Initiative research agenda. I am planning to publish the rest of my work going forward. 

Originally this was meant to be part of a larger sequence I planned to write. Since then I have decided against continuing that work and am publishing the drafts as they are.  I might write about my view on Space Governance as a cause area in the future.


 

