
As far as I can tell, there seems to be a strong tendency for those who are worried about AI risk to also be longtermists.

However, many elements of the philosophical case for longtermism are independent of contingent facts about what is going to happen with AI in the coming decades.

If we have good community epistemic health, we should expect there to be people who object to longtermism on grounds like:

  • person-affecting views
  • supporting a non-zero pure discount rate

but who still are just as worried about AI as those with P(doom) > 90%.

Indeed, the proportion of "doomers" with those philosophical objections to longtermism should be just as high as the rate of such philosophical objections among those typically considered neartermist.

I'm interested in answers either of the form:

  • "hello, I'm both neartermist and have high P(doom) from AI risk..."; or
  • "here's some relevant data, from, say, the EA survey, or whatever"


4 Answers

However, many elements of the philosophical case for longtermism are independent of contingent facts about what is going to happen with AI in the coming decades.

Agree, though there are arguments from one to the other! In particular:

  1. As I understand it, longtermism requires it to be tractable to, in expectation, affect the long-term future ("ltf").[1]
  2. Some people might think that the only or most tractable way of affecting the ltf is to reduce extinction[2] risk in the coming decades or century (as you might think we can have no idea about the expected effects of basically anything else on the ltf because effects other than "causes ltf to exist or not" are too complicated to predict).
  3. If extinction risk is high, especially from a single source in the near future, it's plausibly easier to reduce. (this seems questionable but far from crazy)
  4. So thinking extinction risk is high especially from a single source in the near future might reasonably increase someone's belief in longtermism.
  5. Thinking AI risk is high in the near future is a way of thinking extinction risk is high from a ~single source in the near future.
  6. So thinking AI risk is high in the near future is a reason to believe longtermism.

[1] basically because you can't have reasons to do things that are impossible.

[2] Since "existential risk" on the Toby Ord definition is, by definition, anything that reduces humanity's potential (and therefore affects the ltf in expectation), I think it'd be confusing to use that term in this context, so I'm going to talk about extinction even though people think there are non-extinction existential catastrophe scenarios from AI as well.

I'm a neartermist with 0.01<P(doom from AI)<0.05 on a 30-year horizon. I don't consider myself a doomer, but I think this qualifies as taking AI risk seriously (or at least not dismissing it entirely).

I think of my neartermism as a result of 3 questions:

  1. how much x-risk is there from AI?  
    As I said above, I think there's between a 1% and 5% chance of extinction from AI in the next 30 years. In my mind, this is high. If I were a longtermist, this would be sufficient to motivate me to work on AI safety.
     
  2. how bad is x-risk?
    I am sympathetic to person-affecting views, which to me means thinking of x-risk as primarily impacting people (& animals) alive today. I'm also sympathetic to the idea that it's somewhat good to create a positive life. However, I'd really rather not create negative lives, and I think there is uncertainty on the sign of all not-yet-existent lives. As an example of this uncertainty, consider that many people raised in excellent conditions (loving family, great education, good healthcare, good friends) still struggle with depression. Because of this uncertainty and risk-aversion, the non-person-affecting views part of me is roughly neutral on creating lives as an altruistic act.
     
  3. how much can I lower x-risk?
    I have a technical skillset and could directly do AI safety work. However, I think most technical AI safety work still accelerates AI and therefore may accelerate extinction. As an example, I believe (weakly! convince me otherwise please!) that RLHF and instruction-tuning led to the current LLM gold rush and that if LLMs were more toxic (aka less safe?) there would be less investment in them right now. Along these lines, I'm not sure that any technical AI safety work done thus far has decreased AI x-risk.
    I think the best mechanism for lowering AI x-risk is to slow down AI development, both to give us more time in the current safe-ish technological world and perhaps time to shift into a paradigm where we can develop clearly beneficial technical safety tools. I imagine this deceleration happening primarily through policy. Policy is outside my skillset, but I'd happily write a letter to my congressperson.
    If I could lower AI x-risk by 0.0001% (I think of this as lowering P(doom) from 0.020000 to 0.019999, or 1 part in 20,000), I'd consider this worth 8 billion people * 1e-6 probability = 8e3 = 8000 deaths averted in expectation. I think I have better options to add this many QALYs over the course of my life - without the downside risk of potentially accelerating extinction!
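
The back-of-envelope arithmetic above can be checked with a minimal Python sketch. The variable names are illustrative assumptions; the 8 billion population and the two P(doom) values are the ones used in the text, and the result is an expected value, not a forecast.

```python
# Back-of-envelope check of the figures quoted in the text.
WORLD_POPULATION = 8_000_000_000  # "8 billion people", as stated above

p_doom_before = 0.020000  # P(doom) before the hypothetical intervention
p_doom_after = 0.019999   # P(doom) after the hypothetical intervention

# Absolute reduction in probability: about 1e-6, i.e. 0.0001%.
absolute_reduction = p_doom_before - p_doom_after

# Relative reduction: the fraction of the original risk that was removed.
relative_reduction = absolute_reduction / p_doom_before

# Expected deaths averted = population * absolute reduction in probability.
expected_deaths_averted = WORLD_POPULATION * absolute_reduction

print(f"absolute reduction: {absolute_reduction:.1e}")
print(f"relative reduction: 1 part in {1 / relative_reduction:,.0f}")
print(f"expected deaths averted: {expected_deaths_averted:,.0f}")
```

Note that the expected-deaths figure depends only on the absolute reduction (1e-6), not on the starting P(doom); the "1 part in 20,000" framing is the same change expressed relative to a 2% baseline.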

 

Other reasons I'm not a longtermist / I don't do technical AI safety work:

  • I aspire to serve the poor and to serve animals rather than neglecting them or being served by them. I'm interested in working on problems that disproportionately impact the poor (eg pandemics) and not problems that would primarily impact the rich or even impact everyone equally (eg AI) in order to provide a preferential option for the poor. I'd like a world where more people live to 60 rather than one where some people live forever.
  • I'm risk-averse with my life's work. If I spent my life working on something that seemed like it might be good and ended up being totally useless, I'd consider that a wasted life.
  • I'm not impressed by things like the 80K problem profiles page putting "space governance" above "factory farming" and "easily preventable or treatable illness", or the Wytham Abbey purchase, or FTX, or the trend of spending money on elite students in rich countries without evidence rather than on people in poor countries with great evidence of the good that could be done. This is not the sort of altruism I want to be associated with.

I know some people in this category, mostly because they are extremely uncertain over what the best work is on AI risk.

I feel like I am a neartermist mostly because of my studies and my comparative advantages; neartermism also seems more likely among people who lean empathetic (not sure how to phrase this). However, my tech background and my interaction with applied AI and geoscience have also allowed me to recognise the danger of longtermist risks, which lets me approach the research and discussion with open-mindedness despite my comparative advantage in neartermist causes. One of the main attractions of EA for me originally was that the movement addresses both of my concerns.

9 Comments
  • person-affecting views
  • supporting a non-zero pure discount rate

I think non-longtermists don't hold these premises; rather, they object to longtermism on tractability grounds.
 

supporting a non-zero pure discount rate

My subjective impression was this actually applied to a reasonable number of people in the longtermist social cluster. I think Will's technical definition is just not that close a fit for the community.

Indeed, the proportion of "doomers" with those philosophical objections to longtermism should be just as high as the rate of such philosophical objections among those typically considered neartermist.

 

I don't think we'll see this, largely because I expect having high AI x-risk estimates correlates with taking abstract arguments like longtermism seriously.

I'd also be pretty interested to get to know someone who thinks AI doom is inevitable and works to reduce suffering while we still have some power. I feel like some people who find AI alignment almost impossibly intractable should work on neartermist causes but I've never seen that happen.

I guess I am SORT OF one of those people you want to know.

I think AI non-alignment with humanity's interests is often very closely related to the non-alignment of capital investment and large corporate interests with humanity's interests.

That is because I think it is most likely that large capital investments will be creating the most powerful AGI in the future, and that investment is controlled by what the old Marxists call the class interest of international capital.

Thus, I think that it is inevitable that profit-maximizing AGI will have interests that diverge from humanity's interests to a greater and greater degree over time. This is similar to how the interests of the profit-maximizing wealthy humans diverge from the interests of the low-income global majority over time.

Thus the highly likely x-risk is that humans have little or no influence over society 100 to 1000 years from now. I think there is pretty close to zero chance that humans will become physically extinct (but this could be debated).

Thus I consider any neartermist work that strives to better align the interests of profit-maximizing wealthy humans with the interests of the current global low-income majority to be contributing strongly to the longtermist effort to align the interests of a future profit-maximizing AGI (which is likely to be the most powerful future AGI) with the interests of the majority of a future humanity.

Extrapolating wealth inequality trends, the vast majority of a future humanity is likely to be hundreds or thousands of times less wealthy than the most powerful AGIs.

Thus, my personal feeling is that neartermist "wealth inequality alignment" work is more likely to be effective than much longtermist work, because I think it is very likely that the vast majority of current longtermists have little or no verifiable knowledge about how they can actually influence the nature or character of the future most powerful AGIs that are likely to dominate our future solar system. My personal "guess" is that it is those future AGIs that are likely to create the greatest risks for humanity.

Who these AGIs are or what they will be like, I don't think anyone really knows at the present time. This isn't because the problem is intractable in the classic sense, but because the chaotic system that will create the AGIs very likely has a very high internal dynamic instability that creates large sensitivity to initial conditions (which makes the result predictably unpredictable via the "butterfly effect").

I think that by working on neartermist problems, it is possible for us to change the societal conditions that are the initial conditions for the creation of future AGIs. I would argue that if we make our current human society such that humans are nicer to humans in the near term, then this will create a set of initial conditions where it is also more likely that future human-initiated AGI is nicer to humans in the future.

Thus, I think there is lots of neartermist work on wealth inequality that can potentially be very effective at preventing the future AGI x-risk of a completely marginalized humanity. I think that work is more tractable, has far more certain results AND addresses some of our most probable longtermist x-risks.

Much of my past technical research was in the field of "Chaotic Dynamics" which I don't think has been applied to EA longtermist philosophy yet. My experience with the dynamics of complex systems makes me very skeptical about any forecasts of the particulars regarding the future of AGI and the agency that individuals are likely to have in creating that future.

That background is the technical context for my views.

Is that what you wanted to get to know?

I think a lot of people who are aware of AI risk for some time but nevertheless choose to work on some other causes, such as climate change, may implicitly hold this view.

Isn't the opposite end of the p(doom)–longtermism quadrant also relevant? E.g. my p(doom) is 2%, but I take the arguments for longtermism seriously and think that's high enough of a chance to justify working on the alignment problem.

Interesting, I would instinctively still consider 2% to be a high p(doom) within the next 100 years. In the AI field, what is generally seen as a "high" or "low" p(doom)?

However, many elements of the philosophical case for longtermism are independent of contingent facts about what is going to happen with AI in the coming decades.

It could be that both acceptance of longtermism and ability to forecast AI accurately are caused by some shared underlying factor, e.g. ability to reason and think systematically correctly (or incorrectly).

Or, put another way: in general, for any two questions where there is an objectively correct answer, giving correct answers should be pretty correlated, even if the questions themselves are completely unrelated. Forecasting definitely has an objectively correct answer; not sure about longtermism vs. neartermism, but I think it's plausible that it will one day look settled, or at least come down to mostly epistemic uncertainty.

So I don't see why views on these topics should be uncorrelated, unless you think philosophical questions about longtermism vs. neartermism are simple questions of opinion or values differences with no epistemic uncertainty left, and that people's answers to them are unlikely to change under reflection.
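
The shared-factor point above can be illustrated with a toy simulation. This is only a sketch under made-up assumptions: each simulated person has a hypothetical `skill` value drawn uniformly from [0, 1), and answers each of two unrelated questions correctly with probability equal to that skill. None of these names or distributions come from real data.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(shared_factor, n_people=10_000, seed=0):
    """Correlation between being right on question A and on question B.

    If shared_factor is True, each person has their own skill level;
    otherwise everyone answers correctly with the same probability 0.5.
    """
    rng = random.Random(seed)
    right_a, right_b = [], []
    for _ in range(n_people):
        skill = rng.random() if shared_factor else 0.5
        # The two answers are drawn independently given the person's skill.
        right_a.append(1 if rng.random() < skill else 0)
        right_b.append(1 if rng.random() < skill else 0)
    return pearson(right_a, right_b)


with_factor = simulate(shared_factor=True)
without_factor = simulate(shared_factor=False)
print(f"correlation with a shared factor:    {with_factor:.2f}")
print(f"correlation without a shared factor: {without_factor:.2f}")
```

Under these assumptions the correlation with a shared factor is about 1/3 in expectation, and about zero without one, even though the two questions never influence each other. How strong the effect is in reality depends on how much of each answer is actually driven by a common factor, which this sketch does not address.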



 

More from Sanjay