Conversation with Holden Karnofsky, Nick Beckstead, and Eliezer Yudkowsky on the “long-run” perspective on effective altruism


Earlier this year, I had an email conversation with Holden Karnofsky, Eliezer Yudkowsky, and Luke Muehlhauser about future-oriented effective altruism, as a follow-up to an earlier conversation Holden had with Luke and Eliezer.

The conversation is now available here. My highlights from the conversation:

NICK: I think the case for “do the most good” coinciding with “do what is best in terms of very long-term considerations” rests on weaker normative premises than your conversation suggests it does. For example, I don’t believe you need the assumption that creating a life is as good as saving a life, or a constant fraction as good as that. I have discussed a more general kind of argument—as well as some of the most natural and common alternative moral frameworks I could think of—in my dissertation (especially ch. 3 and ch. 5). It may seem like a small point, but I think you can introduce a considerable amount of complicated holistic evaluation into the framework without undermining the argument for focusing primarily on long-term considerations.

For another point, you can have trajectory changes or more severe “flawed realizations” that don’t involve extinction. E.g., you could imagine a version of climate change where bad management of the problem results in the future being 1% worse forever or you could have a somewhat suboptimal AI that makes the future 1% worse than it could have been (just treat these as toy examples that illustrate a point rather than empirical claims). If you’ve got a big enough future civilization, these changes could plausibly outweigh short-term considerations (apart from their long-term consequences) even if you don’t think that creating a life is within some constant fraction of saving a life.

HOLDEN: On your first point – I think you’re right about the *far future* but I have more trouble seeing the connection to *x-risk* (even broadly defined). Placing a great deal of value on a 1% improvement seems to point more in the direction of working toward broad empowerment/improvement, and weighs in favor of e.g. AMF. I think I need to accept the creating/saving multiplier to believe that “all the value comes from whether or not we colonize the stars.”

NICK: The claim was explicitly meant to be about “very long-term considerations.” I just mean to be speaking to your hesitations about the moral framework (rather than your hesitations about what the moral framework implies).

I agree that an increased emphasis on trajectory changes/flawed realizations (in comparison with creating extra people) supports putting more emphasis on factors like broad human empowerment relative to avoiding doomsday scenarios and other major global disruptions.

ELIEZER: How does AMF get us to a 1% better *long-term* future? Are you envisioning something along the lines of “Starting with a 1% more prosperous Earth results in 1% more colonization and hence 1% more utility by the time the stars finally burn out”?

HOLDEN: I guess so. A 1% better earth does a 1% better job in the SWH transition? I haven’t thought about this much and don’t feel strongly about what I said.


HOLDEN: Something Weird Happens – Eliezer’s term for what I think he originally intended Singularity to mean (or how I interpret Singularity).

(will write more later)

NICK: I feel that the space between your take on astronomical waste and Bostrom’s take is smaller than you recognize in this discussion and in discussions we’ve had previously. In the grand scheme of things, it seems the position you articulated (under the assumption that future generations matter in the appropriate way) puts you closer to Bostrom than it does to (say) 99.9% of the population. I think most outsiders would see this dispute as analogous to a dispute between two highly specific factions of Marxism or something. As Eliezer said, I think your disagreement is more about how to apply maxipok than whether maxipok is right (in the abstract).[…]

I think there’s an interesting analogy with the animal rights people. Suppose you hadn’t considered the long-run consequences of helping people so much and you become convinced that animal suffering on factory farms is of comparable importance to billions of humans being tortured and killed each year, and that getting one person to be a vegetarian is like preventing many humans from being tortured and killed. Given that you accept this conclusion, I think it wouldn’t be unreasonable for you to update strongly in favor of factory farming being one of the highest-priority areas for doing good in the world, even if you didn’t know a great deal about RFMF and so on. Anyway, it does seem pretty analogous in some important ways. This looks to me like a case where some animal rights people did something analogous to the process you critiqued and thereby identified factory farming.

HOLDEN: Re: Bostrom’s essay – I see things differently. I see “the far future is extremely important” as a reasonably mainstream position. There are a lot of mainstream people who place substantial value on funding and promoting science, for that exact reason. Certainly there are a lot of people who don’t feel this way, and I have arguments with them, but I don’t feel Bostrom’s essay tells us nearly as much if it is read as simply agreeing with me. I’d say it gives us a framework that may or may not turn out to be useful.

So far I haven’t found it to be particularly useful. I think valuing extinction prevention as equivalent to saving something like 5*N lives (N=current global population) leads to most of the same conclusions. Most of my experience with Bostrom’s essay has been people pointing to it as a convincing defense of a much more substantive position.

I think non-climate-change x-risks are neglected because of how diffuse their constituencies are (the classic issue), not so much because of apathy toward the far future, particularly not from failure to value the far future at [huge number] instead of 5*N.

NICK: […] Though I’m not particularly excited about refuges, they might be a good test case. I think that if you had this 5N view, refuges would be obviously dumb but if you had the view that I defended in my dissertation then refuges would be interesting from a conceptual perspective.
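The size of the gap between these two valuations can be made concrete with a toy calculation (every number below is an illustrative assumption, not a figure from the conversation):

```python
# Toy comparison of the "5*N lives" valuation of extinction prevention
# with an astronomical-waste-style valuation. All numbers are
# illustrative assumptions.

N = 7e9                       # rough current global population
value_5n = 5 * N              # extinction prevention ~ saving 5*N lives
value_astro = 1e16            # a deliberately conservative astronomical figure

# How much extinction-risk reduction makes an intervention competitive
# with saving 1,000 lives for certain, under each view?
threshold_5n = 1000 / value_5n
threshold_astro = 1000 / value_astro

# Both views say tiny risk reductions matter, but they disagree by a
# factor of value_astro / value_5n (roughly 3e5 here) about how tiny.
print(threshold_5n / threshold_astro)
```

On these numbers the two views point the same way for most interventions, which fits Holden’s claim; they come apart only for options, like refuges, whose case turns on exactly how much a marginal reduction in risk is worth.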

HOLDEN: One of the things I’m hoping to clarify with my upcoming posts is that my comfort with a framework is not independent of what the framework implies. Many of the ways in which you try to break down arguments do not map well onto my actual process for generating conclusions.

NICK: I’m aware that this isn’t how you operate. But doesn’t this seem like an “in the trenches” case where we’re trying to learn and clarify our reasoning, and therefore your post would suggest that now is a good time to engage in sequence thinking?

HOLDEN: Really good question that made me think and is going to make me edit my post. I concede that sequence thinking has important superiorities for communication; I also think that it COULD be used to build a model of cluster thinking (this is basically what I tried to do in my post – define cluster thinking as a vaguely specified “formula”). One of the main goals of my post is to help sequence thinkers do a better job modeling and explicitly discussing what cluster thinking is doing.

What’s frustrating to me is getting accused of being evasive, inconsistent, or indifferent about questions like this far future thing; I’d rather be accused of using a process that is hard to understand by its nature (and shouldn’t be assumed to be either rational or irrational; it could be either or a mix).

Anyway, what I’d say in this case is:

  • I think we’ve hit diminishing returns on examining this particular model of the far future. I’ve named all the problems I see with it; I have no more. I concede that this model doesn’t have other holes that I’ve identified, for the moment. I’ve been wrong before re: thinking we’ve hit diminishing returns before we have, so I’m open to more questions.
  • In terms of how I integrate the model into my decisions, I cap its signal and give it moderate weight. “Action X would be robustly better if I accepted this model of the far future” is an argument in favor of action X but not a decisive one. This is the bit that I’ve previously had trouble defending as a principled action, and hopefully I’ve made some progress on that front. I don’t intend this statement to cut off discussion on the sequence thinking bit, because more argument along those lines could strengthen the robustness of the argument for me and increase its weight.

HOLDEN: Say that you buy Apple stock because “there’s a 10% chance that they develop a wearable computer over the next 2 years and this sells over 10x as well as the iPad has.” I short Apple stock because “I think their new CEO sucks.” IMO, it is the case that you made a wild guess about the probability of the wearable computer thing, and it is not the case that I did.

NICK: I think I’ve understood your perspective for a while, I’m mainly talking about how to explain it to people.

I think this example clarifies the situation. If your P(Apple develops a wearable computer over the next 2 years and this sells over 10x as well as the iPad has) = 10%, then you’d want to buy Apple stock. So if you short Apple stock, you’re committed to P(Apple develops a wearable computer over the next 2 years and this sells over 10x as well as the iPad has) < 10%. In this sense, you often can’t get out of being committed to ranges of subjective probabilities.

The way you think about it, the cognitive procedure is more like: ask a bunch of questions, give answers to the questions, give weights to your question/answer pairs, make a decision as a result. You’re “relying on an assumption” only if that assumption is your answer to one of the questions and you put a lot of weight on that question/answer pair. Since you just relied on the pair (How good is the CEO?, The CEO sucks), you didn’t rely on a wild guess about P(Apple develops a wearable computer over the next 2 years and this sells over 10x as well as the iPad has). And, in this sense, you can often avoid being committed to subjective probabilities.

When I first heard you say, “You’re relying on a wild guess,” my initial reaction was something like, “Holden is making the mistake of thinking that his actions don’t commit him to ranges of subjective probabilities (in the first sense). It looks like he hasn’t thought through the Bayesian perspective on this.” I do think this is a real mistake that people make, though they may (often?) be operating more on the kind of basis you have described. I started thinking you had a more interesting perspective when, while I was pressing you on this point, you said something like, “I’m committed to whatever subjective probability I’m committed to on the basis of the decision that’s an outcome of this cognitive procedure.”
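Nick’s first sense of being “committed to a probability range” can be sketched with a toy expected-value model (the payoffs and the market-implied probability below are invented for illustration):

```python
# Toy model: suppose the current price is fair given probability MARKET_P
# of the wearable scenario, under which the stock doubles; otherwise it
# loses 10%. All numbers are invented for illustration.

UP, DOWN = 2.0, 0.9          # payoff multipliers in each scenario
MARKET_P = 0.10              # probability implied by the current price

price = MARKET_P * UP + (1 - MARKET_P) * DOWN   # fair price given MARKET_P

def expected_gain_long(p):
    """Expected value of buying at `price`, given your own probability p."""
    return p * UP + (1 - p) * DOWN - price

# Buying is +EV exactly when your probability exceeds the market's, and
# shorting is +EV exactly when it falls below -- so the short seller is
# committed to a probability range even without ever naming a number.
print(expected_gain_long(0.20) > 0)   # going long looks good above 10%
print(expected_gain_long(0.05) < 0)   # shorting looks good below 10%
```

The design point of the sketch is that the commitment is implicit: nothing in the short seller’s reasoning mentions a probability, yet the trade only makes sense within a range of them.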

Strategic considerations about different speeds of AI takeoff


Crossposted from the Global Priorities Project

Co-written by Owen Cotton-Barratt and Toby Ord

There are several different kinds of artificial general intelligence (AGI) which might be developed, and there are different scenarios which could play out after one of them reaches a roughly human level of ability across a wide range of tasks. We shall discuss some of the implications we can see for these different scenarios, and what that might tell us about how we should act today.

A key difference between different types of post-AGI scenario is the ‘speed of takeoff’. This could be thought of as the time between first reaching a near human-level artificial intelligence and reaching one that far exceeds our capacities in almost all areas (or reaching a world where almost all economically productive work is done by artificial intelligences). In fast takeoff scenarios, this might happen over a scale of months, weeks, or days. In slow takeoff scenarios, it might take years or decades. There has been considerable discussion about which speed of takeoff is more likely, but less discussion about which is more desirable and what that implies.

Are slow takeoffs more desirable?

There are a few reasons to think that we’re more likely to get a good outcome in a slow takeoff scenario.

First, safety work today has an issue of nearsightedness. Since we don’t know quite what form artificial intelligence will eventually take, specific work today may end up being of no help on the problem we eventually face. If we had a slow takeoff scenario, there would be a period of time in which AGI safety researchers had a much better idea of the nature of the threat, and were able to optimise their work accordingly. This could make their work several times more valuable.

Second, and perhaps more crucially, in a slow takeoff the concerns about AGI safety are likely to spread much more widely through society. It is easy to imagine this producing widespread societal support of a level at or exceeding that for work on climate change, because the issue would be seen to be imminent. This could translate to much more work on securing a good outcome — perhaps hundreds of times the total which had previously been done. Although there are some benefits to having work done serially rather than in parallel, these are likely to be overwhelmed by the sheer quantity of extra high-quality work which would attack the problem. Furthermore, the slower the takeoff, the more this additional work can also be done serially.

A third key factor is that a slow takeoff seems more likely to lead to a highly multipolar scenario. If AGI has been developed commercially, the creators are likely to licence out copies for various applications. Moreover, a slow takeoff could give enough time for competitors to bring alternatives up to speed.

We don’t think it’s clear whether multipolar outcomes are overall a good thing, but we note that they have some advantages. In the short term they are likely to preserve something closer to the existing balance of power, which gives more time for work to ensure a safe future. They are additionally less sensitive to the prospect of a treacherous turn or of any single-point failure mode in an AGI.

Strategic implications

If we think that there will be much more time for safety work in slow takeoff scenarios, there seem to be two main implications:

First, when there is any chance to influence matters, we should generally push towards slow takeoff scenarios. They are likely to have much more safety work done, and this is a large factor which could easily outweigh our other information about the relative desirability of the scenarios.

Second, we should generally focus safety research today on fast takeoff scenarios. Since there will be much less safety work in total in these scenarios, extra work is likely to have a much larger marginal effect. This can be seen as hedging against a fast takeoff even if we think it is undesirable.

Overall it seems to us that the AGI safety community has internalised the second point, and sensibly focused on work addressing fast takeoff scenarios. It is less clear that we have appropriately weighed the first point. Either of these points could be strengthened or outweighed by a better understanding of the relevant scenarios.

For example, it seems that neuromorphic AGI would be much harder to understand and control than an AGI with a much clearer internal architecture. So conditional on a fast takeoff, it would be bad if the AGI were neuromorphic. People concerned with AGI safety have argued against a neuromorphic approach on these grounds. However, precisely because it is opaque, neuromorphic AGI may be less able to perform fast recursive self-improvement, and this would decrease the chance of a fast takeoff. Given how much better a slow takeoff appears, we should perhaps prefer neuromorphic approaches.

In general, the AGI safety community focuses much of its attention on recursive self-improvement approaches to designing a highly intelligent system. We think that this makes sense in as much as it draws attention to the dangers of fast takeoff scenarios and hedges against being in one, but we would want to take care not to promote the approach for those considering designing an AGI. Drawing attention to the power of recursive self improvement could end up being self-defeating if it encourages people to design such systems, producing a faster takeoff.

In conclusion, it seems that when doing direct technical safety work, it may be reasonable to condition on a fast takeoff, as that is the scenario where our early work matters most. When choosing strategic direction, however, it is a mistake to condition on a fast takeoff, precisely because our decisions may affect the probability of a fast takeoff.

Thanks to Daniel Dewey for conversations and comments.

A relatively atheoretical perspective on astronomical waste


Crossposted from the Global Priorities Project


It is commonly objected that the “long-run” perspective on effective altruism rests on esoteric assumptions from moral philosophy that are highly debatable. Yes, the long-term future may dominate calculations of aggregate welfare, but does it follow that the long-term future is overwhelmingly important? Do I really want my plan for helping the world to rest on the assumption that the benefit from allowing extra people to exist scales linearly with population when large numbers of extra people are allowed to exist?

In my dissertation on this topic, I tried to defend the conclusion that the distant future is overwhelmingly important without committing to a highly specific view about population ethics (such as total utilitarianism). I did this by appealing to more general principles, but I did end up delving pretty deeply into some standard philosophical issues related to population ethics. And I don’t see how to avoid that if you want to independently evaluate whether it’s overwhelmingly important for humanity to survive in the long-term future (rather than, say, just deferring to common sense).

In this post, I outline a relatively atheoretical argument that affecting long-run outcomes for civilization is overwhelmingly important, and attempt to side-step some of the deeper philosophical disagreements. It won’t be an argument that preventing extinction would be overwhelmingly important, but it will be an argument that other changes to humanity’s long-term trajectory overwhelm short-term considerations. And I’m just going to stick to the moral philosophy here. I will not discuss important issues related to how to handle Knightian uncertainty, “robust” probability estimates, or the long-term consequences of accomplishing good in the short run. I think those issues are more important, but I’m just taking on one piece of the puzzle that has to do with moral philosophy, where I thought I could quickly explain something that may help people think through the issues.

In outline form, my argument is as follows:

  1. In very ordinary resource conservation cases that are easy to think about, it is clearly important to ensure that the lives of future generations go well, and it’s natural to think that the importance scales linearly with the number of future people whose lives will be affected by the conservation work.
  2. By analogy, it is important to ensure that, if humanity does survive into the distant future, its trajectory is as good as possible, and the importance of shaping the long-term future scales roughly linearly with the expected number of people in the future.
  3. Premise (2), when combined with the standard set of (admittedly debatable) empirical and decision-theoretic assumptions of the astronomical waste argument, yields the standard conclusion of that argument: shaping the long-term future is overwhelmingly important.

As in other discussions of this issue (such as Nick Bostrom’s papers “Astronomical Waste” and “Existential Risk Prevention as Global Priority,” and my dissertation), this discussion will generally assume that we’re talking about good accomplished from an impartial perspective, and will not attend to deontological, virtue-theoretic, or justice-related considerations.

A review of the astronomical waste argument and an adjustment to it

The standard version of the astronomical waste argument runs as follows:

  1. The expected size of humanity’s future influence is astronomically great.
  2. If the expected size of humanity’s future influence is astronomically great, then the expected value of the future is astronomically great.
  3. If the expected value of the future is astronomically great, then what matters most is that we maximize humanity’s long-term potential.
  4. Some of our actions are expected to reduce existential risk in not-ridiculously-small ways.
  5. If what matters most is that we maximize humanity’s future potential and some of our actions are expected to reduce existential risk in not-ridiculously-small ways, what it is best to do is primarily determined by how our actions are expected to reduce existential risk.
  6. Therefore, what it is best to do is primarily determined by how our actions are expected to reduce existential risk.

I’ve argued for adjusting the last three steps of this argument in the following way:

4’.   Some of our actions are expected to change our development trajectory in not-ridiculously-small ways.

5’.   If what matters most is that we maximize humanity’s future potential and some of our actions are expected to change our development trajectory in not-ridiculously-small ways, what it is best to do is primarily determined by how our actions are expected to change our development trajectory.

6’.   Therefore, what it is best to do is primarily determined by how our actions are expected to change our development trajectory.

The basic thought here is that what the astronomical waste argument really shows is that future welfare considerations swamp short-term considerations, so that long-term consequences for the distant future are overwhelmingly important in comparison with purely short-term considerations (apart from long-term consequences that short-term consequences may produce).

Astronomical waste may involve changes in quality of life, rather than size of population

Often, the astronomical waste argument is combined with the idea that the best way to minimize astronomical waste is to minimize the probability of premature human extinction. How important it is to prevent premature human extinction is a subject of philosophical debate, and the debate largely rests on whether it is important to allow large numbers of people to exist in the future. So when someone complains that the astronomical waste argument rests on esoteric assumptions about moral philosophy, they are implicitly objecting to premise (2) or (3). They are saying that even if human influence on the future is astronomically great, maybe changing how well humanity exercises its long-term potential isn’t very important because maybe it isn’t important to ensure that there are a large number of people living in the future.

However, the concept of existential risk is wide enough to include any drastic curtailment to humanity’s long-term potential, and the concept of a “trajectory change” is wide enough to include any small but important change in humanity’s long-term development. And the value of these existential risks or trajectory changes need not depend on changes in the population. For example,

  • In “The Future of Human Evolution,” Nick Bostrom discusses a scenario in which evolutionary dynamics result in substantial decreases in quality of life for all future generations, and the main problem is not a population deficit.
  • Paul Christiano outlined long-term resource inequality as a possible consequence of developing advanced machine intelligence.
  • I discussed various specific trajectory changes in a comment on an essay mentioned above.

There is limited philosophical debate about the importance of changes in the quality of life of future generations

The main group of people who deny that it is important that future people exist have “person-affecting views.” These people claim that if I must choose between outcome A and outcome B, and person X exists in outcome A but not outcome B, it’s not possible to affect person X by choosing outcome A rather than B. Because of this, they claim that causing people to exist can’t benefit them and isn’t important. I think this view suffers from fatal objections which I have discussed in chapter 4 of my dissertation, and you can check that out if you want to learn more. But, for the sake of argument, let’s agree that creating “extra” people can’t help the people created and isn’t important.

A puzzle for people with person-affecting views goes as follows:

Suppose that agents as a community have chosen to deplete rather than conserve certain resources. The quality of life for the persons who exist now or will come into existence over the next two centuries will be “slightly higher” than under a conservation alternative (Parfit 1987, 362; see also Parfit 2011 (vol. 2), 218). Thereafter, however, for many centuries the quality of life would be much lower. “The great lowering of the quality of life must provide some moral reason not to choose Depletion” (Parfit 1987, 363). Surely agents ought to have chosen conservation in some form or another instead. But note that, at the same time, depletion seems to harm no one. While distant future persons, by hypothesis, will suffer as a result of depletion, it is also true that for each such person a conservation choice (very probably) would have changed the timing and manner of the relevant conception. That change, in turn, would have changed the identities of the people conceived and the identities of the people who eventually exist. Any suffering, then, that they endure under the depletion choice would seem to be unavoidable if those persons are ever to exist at all. Assuming (here and throughout) that that existence is worth having, we seem forced to conclude that depletion does not harm, or make things worse for, and is not otherwise “bad for,” anyone at all (Parfit 1987, 363). At least: depletion does not harm, or make things worse for, and is not “bad for,” anyone who does or will exist under the depletion choice.

The seemingly natural thing to say if you have a person-affecting view is that because conservation doesn’t benefit anyone, it isn’t important. But this is a very strange thing to say, and people having this conversation generally recognize that saying it involves biting a bullet. The general tenor of the conversation is that conservation is obviously important in this example, and people with person-affecting views need to provide an explanation consonant with that intuition.

Whatever the ultimate philosophical justification, I think we should say that choosing conservation in the above example is important, and this has something to do with the fact that choosing conservation has consequences that are relevant to the quality of life of many future people.

Intuitively, giving N times as many future people higher quality of life is N times as important

Suppose that conservation would have consequences relevant to 100 times as many people in case A than it would in case B. How much more important would conservation be in case A? Intuitively, it would be 100 times more important. This generally fits with Holden Karnofsky’s intuition that a 1/N probability of saving N lives is about as important as saving one life, for any N:

I wish to be the sort of person who would happily pay $1 for a robust (reliable, true, correct) 10/N probability of saving N lives, for astronomically huge N – while simultaneously refusing to pay $1 to a random person on the street claiming s/he will save N lives with it.

More generally, we could say:

Principle of Scale: Other things being equal, it is N times better (in itself) to ensure that N people in some position have higher quality of life than other people who would be in their position than it is to do this for one person.

I had to state the principle circuitously to avoid saying that things like conservation programs could “help” future generations, because according to people with person-affecting views, if our “helping” changes the identities of future people, then we aren’t “helping” anyone and that’s relevant. If I had said it in ordinary language, the principle would have said, “If you can help N people, that’s N times better than helping one person.” The principle could use some tinkering to deal with concerns about equality and so on, but it will serve well enough for our purposes.

The Principle of Scale may seem obvious, but even it is debatable. You wouldn’t find philosophical agreement about it. For example, some philosophers who claim that additional lives have diminishing marginal value would claim that in situations where many people already exist, it matters much less if a person is helped. I attack these perspectives in chapter 5 of my dissertation, and you can check that out if you want to learn more. But, in any case, the Principle of Scale does seem pretty compelling—especially if you’re the kind of person who doesn’t have time for esoteric debates about population ethics—so let’s run with it.

Now for the most questionable steps: Let’s assume with the astronomical waste argument that the expected number of future people is overwhelming, and that it is possible to improve the quality of life for an overwhelming number of future people through forward-thinking interventions. If we combine this with the principle from the last paragraph and wave our hands a bit, we get the conclusion that shifting quality of life for an overwhelming number of future people is overwhelmingly more important than any short-term consideration. And that is very close to what the long-run perspective says about helping future generations, though importantly different because this version of the argument might not put weight on preventing extinction. (I say “might not” rather than “would not” because if you disagree with the people with person-affecting views but accept the Principle of Scale outlined above, you might just accept the usual conclusion of the astronomical waste argument.)
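The linearity doing the work here can be written down directly. This is a minimal sketch; risk-neutral expected value is an assumption of the illustration, not something argued for above:

```python
from math import isclose

def expected_lives_saved(probability, lives):
    """Expected number of lives saved: linear in both arguments."""
    return probability * lives

# Holden's intuition: a robust 1/N chance of saving N lives is about as
# good as saving one life, for any N -- the expected value is constant.
for n in (10, 10**6, 10**15):
    assert isclose(expected_lives_saved(1 / n, n), 1.0)

# The Principle of Scale is the same linearity in the other argument:
# improving life for 100x as many future people is 100x as important.
assert expected_lives_saved(1.0, 100) == 100 * expected_lives_saved(1.0, 1)
```

The hand-waving in the argument is the claim that this linearity keeps holding when the number of lives is astronomically large, which is the question taken up next.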

Does the Principle of Scale break down when large numbers are at stake?

I have no argument that it doesn’t, but I note that (i) this wasn’t Holden Karnofsky’s intuition about saving N lives, (ii) it isn’t mine, and (iii) I don’t really see a compelling justification for it. The main reason I can think of for wanting it to break down is not liking the conclusion that affecting long-run outcomes for humanity is overwhelmingly important in comparison with short-term considerations. If you really want to avoid the conclusion that shaping the long-term future is overwhelmingly important, I believe it would be better to accommodate this idea by appealing to other perspectives and a framework for integrating the insights of different perspectives—such as the one that Holden has talked about—rather than altering this perspective. If you are such a person, my hope is that reading this post causes you to put more weight on the perspectives that place great importance on the future.


To wrap up, I’ve argued that:

  1. Reducing astronomical waste need not involve preventing human extinction—it can involve other changes in humanity’s long-term trajectory.
  2. While not widely discussed, the Principle of Scale is fairly attractive from an atheoretical standpoint.
  3. The Principle of Scale—when combined with other standard assumptions in the literature on astronomical waste—suggests that some trajectory changes would be overwhelmingly important in comparison with short-term considerations. It could be accepted by people who have person-affecting views or people who don’t want to get too bogged down in esoteric debates about moral philosophy.

The perspective I’ve outlined here is still philosophically controversial, but it is at least somewhat independent of the standard approach to astronomical waste. Ultimately, any take on astronomical waste—including ignoring it—will be committed to philosophical assumptions of some kind, but perhaps the perspective outlined would be accepted more widely, especially by people with temperaments consonant with effective altruism, than perspectives relying on more specific theories or a larger number of principles.

Agricultural research and development


Crossposted from the Giving What We Can blog

Foreword: The Copenhagen Consensus and other authors have highlighted the potential of agricultural R&D as a high-leverage opportunity. This was enough to get us interested in understanding the area better, so we asked David Goll, a Giving What We Can member and a professional economist, to investigate how it compares to our existing recommendations. – Owen Cotton-Barratt, Director of Research for Giving What We Can


Around one in every eight people suffers from chronic hunger, according to the Food and Agricultural Organisation’s most recent estimates (FAO, 2013). Two billion suffer from micronutrient deficiencies. One quarter of children are stunted. Increasing agricultural yields and therefore availability of food will be essential in tackling these problems, which are likely to get worse as population and income growth place ever greater pressure on supply. To some extent, yield growth can be achieved through improved use of existing technologies. Access to and use of irrigation, fertilizer and agricultural machinery remains limited in some developing countries. However, targeted research and development will also be required to generate new technologies (seeds, animal vaccines and so on) that allow burgeoning food demand to be met.

Agricultural research and development encompasses an extremely broad range of activities and potential innovations. A 2008 paper issued by the Consultative Group on International Agricultural Research (von Braun et al., 2008), an international organization that funds and coordinates agricultural research, identifies 14 ‘best bets’. These include developing hybrid and inbred seeds with improved yield potential, better resistance to wheat rust, increased drought tolerance and added nutritional value, but also encompass the development of new animal vaccines, better fertilizer use and improved processing and management techniques for fisheries.

Notable successes in seed development seem to have generated immense social benefit. The high-yielding varieties that spread through the ‘Green Revolution’ are often credited with driving a doubling of rice and wheat yields in Asia from the late 60s to the 90s, saving hundreds of millions of people from famine (see, for instance, Economist, 2014). Given the prevalence of hunger and the high proportion of the extremely poor that work as farmers, agricultural research and development seems to offer a potential opportunity for effective altruism.

Existing benefit-cost estimates are promising, though not spectacular. The Copenhagen Consensus project ranked R&D to enhance crop yields as the sixth most valuable social investment available, behind deworming and micronutrient interventions but ahead of popular programmes such as conditional cash transfers for education (Copenhagen Consensus, 2012).

The calculations that fed into this ranking were based on two main categories of benefit. First, higher yield seeds allow production of larger quantities of agricultural output at a lower cost, bolstering the income of farmers. Around 70 per cent of the African labour force works in agriculture, many in smallholdings that generate little income above subsistence (IFPRI, 2012). Boosting gains from agriculture could clearly provide large benefits for many of the worst off. Second, decreased costs of production lead to lower prices for food, allowing consumers to purchase more or freeing up their income to be spent elsewhere.

Projecting out to 2050, these two types of benefit alone are expected to outweigh the costs of increased R&D by 16 to 1 (Hoddinott et al., 2012). By comparison, the benefit-cost ratios estimated within the same project for salt iodization (a form of micronutrient supplement) range between 15 to 1 and 520 to 1, with the latest estimates finding a benefit-cost ratio of 81 to 1 (Hoddinott et al., 2012), and most of the estimates reported to the Copenhagen Consensus panel for the benefit-cost ratio of conditional cash transfers for education fall between 10 to 1 and 2 to 1 (Orazem, 2012). Using a very crude method, we can also convert the benefit-cost ratios into approximate QALY terms. Using a QALY value of three times annual income and taking the income of the beneficiaries to be $4.50 a day (around average income per capita in Sub-Saharan Africa), agricultural R&D is estimated to generate a benefit equivalent to one QALY for every $304.
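The crude QALY conversion described above can be sketched in a few lines, using only the inputs stated in the paragraph; it lands near the published $304 figure, with the small gap presumably down to rounding in the original inputs:

```python
daily_income = 4.50               # approx. average income per capita, Sub-Saharan Africa
annual_income = daily_income * 365
qaly_value = 3 * annual_income    # a QALY valued at three times annual income
benefit_cost_ratio = 16           # Hoddinott et al. (2012) projection to 2050

# Cost of generating one QALY's worth of benefit via agricultural R&D:
cost_per_qaly = qaly_value / benefit_cost_ratio   # roughly $308, close to the article's $304
```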

Other types of benefit were not tabulated in the Copenhagen Consensus study, but should also be high. Strains that are resistant to drought, for instance, could greatly reduce year-to-year variation in crop yields. More resilient seeds could mitigate the negative effects of climate change on agriculture. Lower food prices may lead to better child nutrition, with life-long improved health and productivity. Finally, higher yields may decrease the potential for conflict due to the pressure on limited land, food and water resources resulting from climate change and population growth. Each of these benefits alone may justify the costs of research and development but, with our limited knowledge, they are not easily quantified.

The high benefit-cost ratio found by the Copenhagen Consensus team is broadly consistent with other literature. A meta-analysis of 292 academic studies on this topic found that the median rate of return of agricultural R&D is around 44% (Alston et al., 2000). A rate of return, in this sense, indicates the discount rate at which the costs of an investment are equal to the benefits – rather like the interest rate on a bank account. More recent studies, focusing on research in Sub-Saharan Africa, have found aggregate returns of 55% (Alene, 2008).

Unfortunately, the rate of return on investment is not directly comparable to a benefit-cost ratio; the methodology applied often deviates from the welfare-based approach applied by the Copenhagen Consensus team and the two numbers cannot be accurately converted into similar terms. Nonetheless, a crude conversion method can be applied to reach a ballpark estimate of the benefit-cost ratio implied by these studies. Assuming a marginal increase in spending on research is borne upfront and that research generates a constant stream of equal benefits each year from then on, the benefit-cost ratio for an investment with a 44% rate of return at a 5% discount rate is 9 to 1.
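The conversion sketched in this paragraph amounts to treating the research benefit as a perpetuity: an internal rate of return r on an upfront cost C implies a constant annual benefit of r·C, whose present value at discount rate d is r·C/d, so the benefit-cost ratio is simply r/d:

```python
rate_of_return = 0.44    # median IRR from Alston et al. (2000)
discount_rate = 0.05

# Upfront cost C, perpetual annual benefit B; an IRR of r means C = B / r, so B = r * C.
# Present value of the benefit stream at discount rate d is B / d = r * C / d,
# hence a benefit-cost ratio of r / d.
benefit_cost_ratio = rate_of_return / discount_rate   # 8.8, i.e. roughly 9 to 1
```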

There are, however, at least two reasons to treat these high benefit-cost estimates with skepticism.

First, estimating the effect of research and development is difficult. One problem is attribution. Growth in yields can be observed, as can spending on research and development, but it is much more difficult to observe which spending on research led to which increase in yields. If yields grew last year in Ethiopia, was this the result of research that occurred two years ago or ten years ago? Were the improved yields driven by spending on research within Ethiopia, or was it a spillover from research conducted elsewhere in the region or, even, research conducted on another continent? Estimating the effect of R&D spending requires researchers to adopt a specific temporal and spatial model dictating which expenditures can affect which yields in which countries. Teasing out causality can therefore be tricky, and some studies have suggested that inappropriate attribution may have led to systematic bias in the available estimates (e.g. Alston et al., 2009).

Another problem is cherry picking. Estimates garnered from meta-analysis are likely to be upwardly biased because studies are much more likely to be conducted on R&D programmes that are perceived to be successful. Failed programmes, on the other hand, are likely to be ignored and, as a result, the research may paint an overly optimistic picture of the potential impact of R&D.

Second, for new technologies to have an impact on the poor, they need to be widely adopted. This step should not be taken for granted. Adoption rates for improved varieties of crops remain low throughout Africa; farmer-saved seeds, which are unlikely to be improved, account for around 80 per cent of planted seeds in Africa compared to a global average of 35 per cent (AGRA, 2013). To some extent, this is because previous research has been poorly targeted at regional needs. The high-yield varieties developed during the Green Revolution require irrigation or very high levels of rainfall. New seed development was focused on wheat and rice, rather than alternative crops such as sorghum, cassava and millet. High yielding varieties required extensive fertilizer use. All of these features rendered them unsuitable for the African context, and explain why it was not easy to replicate the Asian success story elsewhere (Elliot, 2010).

However, there are more structural features of many developing countries that will limit adoption. Lack of available markets for surplus production can mean that smallholders see limited benefit from larger harvests, especially when new seeds are costly and require additional labour and expensive fertilizer. Weak property rights undermine incentives to invest, given that farmers may be unable to hold on to their surplus crop or sell it at a fair price. Unavailability of credit means that, even when it makes good economic sense for farmers to invest in improved seeds, they may not be able to raise the initial capital required. The benefit-cost estimates discussed above, based on a synthesis of evidence from a diverse set of contexts, may underestimate the difficulties with adoption in more challenging countries.

Even in Asia during the Green Revolution, high-yield varieties were adopted first and foremost by large agricultural interests rather than smallholders (Wiggins et al., 2013). If this was the case for newly developed seeds, the impact on the poorest would be more limited than suggested in the Copenhagen Consensus study. They could still benefit from lower food prices and increased employment in the agricultural sector, but in extreme scenarios smallholders may even lose out due to low cost competition from larger farms that adopt new seeds.

In combination, the difficulties with estimating the effects of R&D and the potential barriers to adoption suggest that the estimated benefit-cost ratios reported earlier are likely to be upwardly biased. The benefit-cost ratios estimated are also lower than those associated with Giving What We Can’s currently recommended charities. For instance, the $304 per QALY estimate based on the Copenhagen Consensus benefit-cost ratio, which appears to be at the higher end of the literature, compares unfavourably to GiveWell’s baseline estimate of $45 to $115 per DALY for insecticide treated bednets (GiveWell, 2013). The benefit-cost ratios also appear to be lower than those associated with micronutrient supplements, as discussed earlier. While there are significant benefits that remain unquantified within agricultural R&D, the same is also true for interventions based on bednet distribution, deworming and micronutrient supplements. As a result, while this area could yield individual high impact opportunities, the literature as it stands does not seem to support the claim that agricultural R&D is likely to be more effective than the best other interventions.


  • Food and Agricultural Organisation, ‘The State of Food and Agriculture 2013’ (2013)
  • von Braun, J., Fan, S., Meinzen-Dick, R., Rosegrant, M. and Nin Pratt, A., ‘What to Expect from Scaling Up CGIAR Investments and ‘Best Bet’ Programs’ (2008)
  • Copenhagen Consensus, ‘Expert Panel Findings’ (2012)
  • Hoddinott, J., Rosegrant, M. and Torero, M. ‘Investments to reduce hunger and undernutrition’ (2012)
  • Orazem, P. ‘The Case for Improving School Quality and Student Health as a Development Strategy’ (2012)
  • Alliance for Green Revolution in Africa, ‘Africa Agriculture Status Report 2013: Focus on Staple Crops’, (2013)
  • International Food Policy Research Institute, ‘2012 Global Food Policy Report’, (2012)
  • Elliot, K., ‘Pulling Agricultural Innovation and the Market Together’, (2010)
  • Wiggins, S., Farrington, J., Henley, G., Grist, N. and Locke, A. ‘Agricultural development policy: a contemporary agenda’ (2013)
  • GiveWell, ‘Mass distribution of long-lasting insecticide-treated nets (LLINs)’ (2013), retrieved July 10th 2014
  • The Economist, ‘A bigger rice bowl’, May 10th 2014

How to treat problems of unknown difficulty


Crossposted from the Global Priorities Project

This is the first in a series of posts which take aim at the question: how should we prioritise work on problems where we have very little idea of our chances of success? In this post we’ll see some simple models-from-ignorance which allow us to produce some estimates of the chances of success from extra work. In later posts we’ll examine the counterfactuals to estimate the value of the work. For those who prefer a different medium, I gave a talk on this topic at the Good Done Right conference in Oxford this July.


How hard is it to build an economically efficient fusion reactor? How hard is it to prove or disprove the Goldbach conjecture? How hard is it to produce a machine superintelligence? How hard is it to write down a concrete description of our values?

These are all hard problems, but we don’t even have a good idea of just how hard they are, even to an order of magnitude. This is in contrast to a problem like giving a laptop to every child, where we know that it’s hard but we could produce a fairly good estimate of how many resources it would take.

Since we need to make choices about how to prioritise between work on different problems, this is clearly an important issue. We can prioritise using benefit-cost analysis, choosing the projects with the highest ratio of future benefits to present costs. When we don’t know how hard a problem is, though, our ignorance makes the size of the costs unclear, and so the analysis is harder to perform. Since we make decisions anyway, we are implicitly making some judgements about when work on these projects is worthwhile, but we may be making mistakes.

In this article, we’ll explore practical epistemology for dealing with these problems of unknown difficulty.


We will use a simplifying model for problems: that they have a critical threshold D such that the problem will be completely solved when D resources are expended, and not at all before that. We refer to this as the difficulty of the problem. After the fact the graph of success with resources will look something like this:

Of course the assumption is that we don’t know D. So our uncertainty about where the threshold is will smooth out the curve in expectation. Our expectation beforehand for success with resources will end up looking something like this:

Assuming a fixed difficulty is a simplification, since of course resources are not all homogeneous, and we may get lucky or unlucky. I believe that this is a reasonable simplification, and that taking these considerations into account would not change our expectations by much, but I plan to explore this more carefully in a future post.

What kind of problems are we looking at?

We’re interested in one-off problems where we have a lot of uncertainty about the difficulty. That is, the kind of problem we only need to solve once (answering a question a first time can be Herculean; answering it a second time is trivial), and which may not easily be placed in a reference class with other tasks of similar difficulty. Knowledge problems, as in research, are a central example: they boil down to finding the answer to a question. The category might also include trying to effect some systemic change (for example by political lobbying).

This is in contrast to engineering problems which can be reduced down, roughly, to performing a known task many times. Then we get a fairly good picture of how the problem scales. Note that this includes some knowledge work: the “known task” may actually be different each time. For example, no two pages of text are quite the same to proofread, but we have a fairly good reference class, so we can estimate moderately well the difficulty of proofreading a page of text, and quite well the difficulty of proofreading a 100,000-word book (where the length helps to smooth out the variance in estimates of individual pages).

Some knowledge questions can naturally be broken up into smaller sub-questions. However these typically won’t be a tight enough class that we can use this to estimate the difficulty of the overall problem from the difficulty of the first few sub-questions. It may well be that one of the sub-questions carries essentially all of the difficulty, so making progress on the others is only a very small help.

Model from extreme ignorance

One approach to estimating the difficulty of a problem is to assume that we understand essentially nothing about it. If we are completely ignorant, we have no information about the scale of the difficulty, so we want a scale-free prior. This determines that the prior obeys a power law. Then, we update on the amount of resources we have already expended on the problem without success. Our posterior probability distribution for how many resources are required to solve the problem will then be a Pareto distribution. (Fallenstein and Mennen proposed this model for the difficulty of the problem of making a general-purpose artificial intelligence.)

There is still a question about the shape parameter of the Pareto distribution, which governs how thick the tail is. It is hard to see how to infer this from a priori reasons, but we might hope to estimate it by generalising from a very broad class of problems people have successfully solved in the past.

This idealised case is a good starting point, but in actual cases, our estimate may be wider or narrower than this. Narrower if either we have some idea of a reasonable (if very approximate) reference class for the problem, or we have some idea of the rate of progress made towards the solution. For example, assuming a Pareto distribution implies that there’s always a nontrivial chance of solving the problem at any minute, and we may be confident that we are not that close to solving it. Broader because a Pareto distribution implies that the problem is certainly solvable, and some problems will turn out to be impossible.
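To make the update concrete, here is a minimal sketch (the scale and shape parameters are illustrative, not estimates) of the key property of the Pareto model: conditioning on having spent z resources without success simply rescales the distribution, leaving its shape unchanged:

```python
def pareto_survival(t, scale, alpha):
    """P(D > t) for a Pareto distribution with minimum `scale` and shape `alpha`."""
    return 1.0 if t < scale else (scale / t) ** alpha

scale, alpha = 1.0, 0.5   # illustrative values only
z = 100.0                 # resources already spent without success
t = 500.0

# Conditioning on D > z gives another Pareto, with its scale moved up to z:
#   P(D > t | D > z) = P(D > t) / P(D > z) = (z / t) ** alpha   for t >= z
conditional = pareto_survival(t, scale, alpha) / pareto_survival(z, scale, alpha)
rescaled = pareto_survival(t, z, alpha)
# conditional equals rescaled: the posterior has the same shape as the prior
```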

This might lead people to criticise the idea of using a Pareto distribution. If they have enough extra information that they don’t think their beliefs represent a Pareto distribution, can we still say anything sensible?

Reasoning about broader classes of model

In the previous section, we looked at a very specific and explicit model. Now we take a step back. We assume that people will have complicated enough priors and enough minor sources of evidence that it will in practice be impossible to write down a true distribution for their beliefs. Instead we will reason about some properties that this true distribution should have.

The cases we are interested in are cases where we do not have a good idea of the order of magnitude of the difficulty of a task. This is an imprecise condition, but we might think of it as meaning something like:

There is no difficulty X such that we believe the probability of D lying between X and 10X is more than 30%.

Here the “30%” figure can be adjusted up for a less stringent requirement of uncertainty, or down for a more stringent one.
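As an illustrative check (the lognormal here is my own example, not one from the post), we can compute the largest probability a distribution assigns to any single order of magnitude [X, 10X], and see how wide a lognormal has to be before it satisfies the 30% condition:

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def max_decade_mass(sigma_decades):
    """Largest probability mass a lognormal (with standard deviation
    sigma_decades on the log10 scale) places on any interval [X, 10X].
    On the log10 scale this is the mass of a normal in a width-1 window,
    which is maximised by centring the window on the mean."""
    return normal_cdf(0.5, 0.0, sigma_decades) - normal_cdf(-0.5, 0.0, sigma_decades)

# A lognormal needs a log10 spread of well over one order of magnitude
# before no single decade holds more than 30% of the probability:
# max_decade_mass(1.0) is about 0.38 (fails the condition),
# max_decade_mass(1.5) is about 0.26 (satisfies it).
```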

Now consider what our subjective probability distribution might look like, where difficulty lies on a logarithmic scale. Our high level of uncertainty will smooth things out, so it is likely to be a reasonably smooth curve. Unless we have specific distinct ideas for how the task is likely to be completed, this curve will probably be unimodal. Finally, since we are unsure even of the order of magnitude, the curve cannot be too tight on the log scale.

Note that this should be our prior subjective probability distribution: we are gauging how hard we would have thought it was before embarking on the project. We’ll discuss below how to update this in the light of information gained by working on it.

The distribution might look something like this:

In some cases it is probably worth trying to construct an explicit approximation of this curve. However, this could be quite labour-intensive, and we usually have uncertainty even about our uncertainty, so we will not be entirely confident with what we end up with.

Instead, we could ask what properties tend to hold for this kind of probability distribution. For example, one well-known phenomenon which is roughly true of these distributions but not all probability distributions is Benford’s law.

Approximating as locally log-uniform

It would sometimes be useful to be able to make a simple analytically tractable approximation to the curve. This could be faster to produce, and easily used in a wider range of further analyses than an explicit attempt to model the curve exactly.

As a candidate for this role, we propose working with the assumption that the distribution is locally flat. This corresponds to being log-uniform. The smoothness assumptions we made should mean that our curve is nowhere too far from flat. Moreover, it is a very easy assumption to work with, since it means that the expected returns scale logarithmically with the resources put in: in expectation, a doubling of the resources is equally good regardless of the starting point.

It is, unfortunately, never exactly true. Although our curves may be approximately flat, they cannot be everywhere flat — this can’t even give a probability distribution! But it may work reasonably as a model of local behaviour. If we want to turn it into a probability distribution, we can do this by estimating the plausible ranges of D and assuming it is uniform across this scale. In our example we would be approximating the blue curve by something like this red box:

Obviously in the example the red box is not a fantastic approximation. But nor is it a terrible one. Over the central range, it is never out from the true value by much more than a factor of 2. While crude, this could still represent a substantial improvement on the current state of some of our estimates. A big advantage is that it is easily analytically tractable, so it will be quick to work with. In the rest of this post we’ll explore the consequences of this assumption.

Places this might fail

In some circumstances, we might expect high uncertainty over difficulty without everywhere having local log-returns. A key example is if we have bounds on the difficulty at one or both ends.

For example, if we are interested in X, which comprises a task of radically unknown difficulty plus a repetitive and predictable part of difficulty 1000, then our distribution of beliefs about the difficulty of X will only include values above 1000, and may be quite clustered there (so not even approximately logarithmic returns). The behaviour in the positive tail might still be roughly logarithmic.

In the other direction, we may know that there is a slow and repetitive way to achieve X, with difficulty 100,000. We are unsure whether there could be a quicker way. In this case our distribution will be uncertain over difficulties up to around 100,000, then have a spike. This will give the reverse behaviour, with roughly logarithmic expected returns in the negative tail, and a different behaviour around the spike at the upper end of the distribution.

In some sense each of these is diverging from the idea that we are very ignorant about the difficulty of the problem, but it may be useful to see how the conclusions vary with the assumptions.

Implications for expected returns

What does this model tell us about the expected returns from putting resources into trying to solve the problem?

Under the assumption that the prior is locally log-uniform, the full value is realised over the width of the box in the diagram. This is w = log(y) – log(x), where x is the value at the start of the box (where the problem could first be plausibly solved), y is the value at the end of the box, and our logarithms are natural. Since it’s a probability distribution, the height of the box is 1/w.

For any z between x and y, the modelled chance of success from investing z resources is equal to the fraction of the box which has been covered by that point. That is:

(1) Chance of success before reaching z resources = log(z/x)/log(y/x).

So while we are in the relevant range, the chance of success is equal for any doubling of the total resources. We could say that we expect logarithmic returns on investing resources.
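A minimal sketch of equation (1) (the interval endpoints are arbitrary illustrations), confirming that every doubling of resources inside the box adds the same chance of success:

```python
import math

def p_success_by(z, x, y):
    """Chance of success before reaching z resources, under the
    log-uniform ('red box') model on [x, y] -- equation (1)."""
    if z <= x:
        return 0.0
    if z >= y:
        return 1.0
    return math.log(z / x) / math.log(y / x)

x, y = 1.0, 1024.0   # a box spanning ten doublings
# Each doubling of total resources adds the same increment, here 1/10:
step = p_success_by(4.0, x, y) - p_success_by(2.0, x, y)
```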

Marginal returns

Sometimes of greater relevance to our decisions is the marginal chance of success from adding an extra unit of resources at z. This is given by the derivative of Equation (1):

(2) Chance of success from a marginal unit of resources at z = 1/zw.

So far, we’ve just been looking at estimating the prior probabilities — before we start work on the problem. Of course when we start work we generally get more information. In particular, if we would have been able to recognise success, and we have invested z resources without observing success, then we learn that the difficulty is at least z. We must update our probability distribution to account for this. In some cases we will have relatively little information beyond the fact that we haven’t succeeded yet. In that case the update will just be to curtail the distribution to the left of z and renormalise, looking roughly like this:

Again the blue curve represents our true subjective probability distribution, and the red box represents a simple model approximating this. Now the simple model gives slightly higher estimated chance of success from an extra marginal unit of resources:

(3) Chance of success from an extra unit of resources after z = 1/(z*(ln(y)-ln(z))).
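Continuing the sketch with arbitrary endpoints, equation (2) can be checked as the numerical derivative of equation (1), and equation (3) shows how the marginal chance rises once z resources have been spent without success:

```python
import math

x, y = 1.0, 10_000.0
w = math.log(y / x)

def p_success_by(z):
    return math.log(z / x) / w                       # equation (1)

def marginal(z):
    return 1.0 / (z * w)                             # equation (2)

def marginal_after_update(z):
    return 1.0 / (z * (math.log(y) - math.log(z)))   # equation (3)

z, h = 50.0, 1e-7
numeric = (p_success_by(z + h) - p_success_by(z)) / h
# numeric agrees with marginal(z), and marginal_after_update(z) > marginal(z):
# ruling out difficulties below z renormalises the remaining mass upward.
```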

Of course in practice we often will update more. Even if we don’t have a good idea of how hard fusion is, we can reasonably assign close to zero probability that an extra $100 today will solve the problem today, because we can see enough to know that the solution won’t be found imminently. This looks like it might present problems for this approach. However, the truly decision-relevant question is about the counterfactual impact of extra resource investment. The region where we can see little chance of success has a much smaller effect on that calculation, which we discuss below.

Comparison with returns from a Pareto distribution

We mentioned that one natural model of such a process is as a Pareto distribution. If we have a Pareto distribution with shape parameter α, and we have so far invested z resources without success, then we get:

(4) Chance of success from an extra unit of resources = α/z.

This is broadly in line with equation (3). In both cases the key term is a factor of 1/z. In each case there is also an additional factor, representing roughly how hard the problem is. In the case of the log-linear box, this depends on estimating an upper bound for the difficulty of the problem; in the case of the Pareto distribution it is handled by the shape parameter. It may be easier to introspect and extract a sensible estimate for the width of the box than for the shape parameter, since it is couched more in terms that we naturally understand.
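Numerically, the two models track each other closely whenever we are far from the upper edge of the box (the values below are arbitrary illustrations):

```python
import math

def logbox_marginal(z, y):
    """Marginal chance per unit resource under the log-uniform box -- equation (3)."""
    return 1.0 / (z * math.log(y / z))

def pareto_marginal(z, alpha):
    """Marginal chance per unit resource under the Pareto model -- equation (4)."""
    return alpha / z

# Both fall off as 1/z: doubling the resources spent halves the marginal
# chance of success (exactly for the Pareto, approximately for the box).
ratio_pareto = pareto_marginal(200.0, 0.5) / pareto_marginal(100.0, 0.5)   # exactly 0.5
ratio_box = logbox_marginal(200.0, 1e9) / logbox_marginal(100.0, 1e9)      # close to 0.5
```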

Further work

In this post, we’ve just explored a simple model for the basic question of how likely success is at various stages. Of course it should not be used blindly, as you may often have more information than is incorporated into the model, but it represents a starting point if you don’t know where to begin, and it gives us something explicit which we can discuss, critique, and refine.

In future posts, I plan to:

  • Explore what happens in a field of related problems (such as a research field), and explain why we might expect to see logarithmic returns ex post as well as ex ante.
    • Look at some examples of this behaviour in the real world.
  • Examine the counterfactual impact of investing resources working on these problems, since this is the standard we should be using to prioritise.
  • Apply the framework to some questions of interest, with worked proof-of-concept calculations.
  • Consider what happens if we relax some of the assumptions or take different models.

Ben Kuhn on the effective altruist movement


Ben Kuhn is a data scientist and engineer at a small financial technology firm. He previously studied mathematics and computer science at Harvard, where he was also co-president of Harvard College Effective Altruism. He writes on effective altruism and other topics at his website.

Pablo: How did you become involved in the EA movement?

Ben: When I was a sophomore in high school (that’s age 15 for non-Americans), Peter Singer gave his The Life You Can Save talk at my high school. He went through his whole “child drowning in the pond” spiel and explained that we were morally obligated to give money to charities that helped those who were worse off than us. In particular, I think at that point he was recommending donating to Oxfam in a sort of Kantian way where you gave an amount of money such that if everyone gave the same percentage it would eliminate world poverty. My friends and I realized that there was no utilitarian reason to stop at that amount of money–you should just donate everything that you didn’t need to survive.

So, being not only sophomores but also sophomoric, we decided that since Prof. Singer didn’t live in a cardboard box and wear only burlap sacks, he must be a hypocrite and therefore not worth paying attention to.

Sometime in the intervening two years I ran across Yvain’s essay Efficient Charity: Do Unto Others and through it GiveWell. I think that was the point where I started to realize Singer might have been onto something. By my senior year (ages 17-18) I at least professed to believe pretty strongly in some version of effective altruism, although I think I hadn’t heard of the term yet. I wrote an essay on the subject in a publication that my writing class put together. It was anonymous (under the brilliant nom de plume of “Jenny Ross”) but somehow my classmates all figured out it was me.

The next big update happened during the spring of my first year of Harvard, when I started going to the Cambridge Less Wrong meetups and met Jeff and Julia. Through some chain of events they set me up with the folks who were then running Harvard High-Impact Philanthropy (which later became Harvard Effective Altruism). After that spring, almost everyone else involved in HHIP left and I ended up becoming president. At that point I guess I counted as “involved in the EA movement”, although things were still touch-and-go for a while until John Sturm came onto the scene and made HHIP get its act together and actually do things.

Pablo: In spite of being generally sympathetic to EA ideas, you have recently written a thorough critique of effective altruism.  I’d like to ask you a few questions about some of the objections you raise in that critical essay.  First, you have drawn a distinction between pretending to try and actually trying.  Can you tell us what you mean by this, and why do you claim that a lot of effective altruism can be summarized as “pretending to actually try”?

Ben: I’m not sure I can explain better than what I wrote in that post, but I’ll try to expand on it. For reference, here’s the excerpt that you referred to:

By way of clarification, consider a distinction between two senses of the word “trying”…. Let’s call them “actually trying” and “pretending to try”. Pretending to try to improve the world is something like responding to social pressure to improve the world by querying your brain for a thing which improves the world, taking the first search result and rolling with it. For example, for a while I thought that I would try to improve the world by developing computerized methods of checking informally-written proofs, thus allowing more scalable teaching of higher math, democratizing education, etc. Coincidentally, computer programming and higher math happened to be the two things that I was best at. This is pretending to try. Actually trying is looking at the things that improve the world, figuring out which one maximizes utility, and then doing that thing. For instance, I now run an effective altruist student organization at Harvard because I realized that even though I’m a comparatively bad leader and don’t enjoy it very much, it’s still very high-impact if I work hard enough at it. This isn’t to say that I’m actually trying yet, but I’ve gotten closer.

Most people say they want to improve the world. Some of them say this because they actually want to improve the world, and some of them say this because they want to be perceived as the kind of person who wants to improve the world. Of course, in reality, everyone is motivated by other people’s perceptions to some extent–the only question is by how much, and how closely other people are watching. But to simplify things let’s divide the world up into those two categories, “altruists” and “signalers.”

If you’re a signaler, what are you going to do? If you don’t try to improve the world at all, people will notice that you’re a hypocrite. On the other hand, improving the world takes lots of resources that you’d prefer to spend on other goals if possible. But fortunately, looking like you’re improving the world is easier than actually improving the world. Since people usually don’t do a lot of due diligence, the kind of improvements that signalers make tend to be ones with very good appearances and surface characteristics–like PlayPumps, water-pumping merry-go-rounds which initially appeared to be a clever and elegant way to solve the problem of water shortage in developing countries. PlayPumps got tons of money and celebrity endorsements, and their creators got lots of social rewards, even though the pumps turned out to be hideously expensive, massively inefficient, prone to breaking down, and basically a disaster in every way.

So in this oversimplified world, the EA observation that “charities vary in effectiveness by orders of magnitude” is explained by “charities” actually being two different things: one group optimizing for looking cool, and one group optimizing for actually doing good. A large part of effective altruism is realizing that signaling-charities (“pretending to try”) often don’t do very much good compared to altruist-charities.

(In reality, of course, everyone is driven by some amount of signaling and some amount of altruism, so these groups overlap substantially. And there are other motivations for running a charity, like being able to convince yourself that you’re doing good. So it gets messier, but I think the vastly oversimplified model above is a good illustration of where my point is coming from.)

Okay, so let’s move to the second paragraph of the post you referenced:

Using this distinction between pretending and actually trying, I would summarize a lot of effective altruism as “pretending to actually try”. As a social group, effective altruists have successfully noticed the pretending/actually-trying distinction. But they seem to have stopped there, assuming that knowing the difference between fake trying and actually trying translates into ability to actually try. Empirically, it most certainly doesn’t. A lot of effective altruists still end up satisficing—finding actions that are on their face acceptable under core EA standards and then picking those which seem appealing because of other essentially random factors. This is more likely to converge on good actions than what society does by default, because the principles are better than society’s default principles. Nevertheless, it fails to make much progress over what is directly obvious from the core EA principles. As a result, although “doing effective altruism” feels like truth-seeking, it often ends up being just a more credible way to pretend to try.

The observation I’m making here is roughly that EA seems not to have switched entirely to doing good for altruistic rather than signaling reasons. It’s more like we’ve switched to signaling that we’re doing good for altruistic rather than signaling reasons. In other words, the motivation didn’t switch from “looking good to outsiders” to “actually being good”–it switched from “looking good to outsiders” to “looking good to the EA movement.”

Now, the EA movement is way better than random outsiders at distinguishing between things with good surface characteristics and things that are actually helpful, so the latter criterion is much stricter than the former, and probably leads to much more good being done per dollar. (For instance, I doubt the EA community would ever endorse something like PlayPumps.) But, at least at the time of writing that post, I saw a lot of behavior that seemed to be based on finding something pleasant and with good surface appearances rather than finding the thing that optimized utility–for instance, donating to causes without a particularly good case that they were better than saving the money, or picking career options that seemed decent-but-not-great from an EA perspective. That’s the source of the phrase “pretending to actually try”–the signaling isn’t going away, it’s just moving up a level in the hierarchy, to signaling that you don’t care about signaling.

Looking back on that piece, I think “pretending to actually try” is still a problem, but my intuition is now that it’s probably not huge in the scheme of things. I’m not quite sure why that is, but here are some arguments against it being very bad that have occurred to me:

  • It’s probably somewhat less prevalent than I initially thought, because the EAs making weird-seeming decisions may be making them for reasons that aren’t transparent to me and that get left out by the typical EA analysis. The typical EA analysis tends to be a 50,000-foot average-case argument that can easily be invalidated by particular personal factors.
  • As Katja Grace points out, encouraging pretending to really try might be optimal from a movement-building perspective, inasmuch as it’s somewhat inescapable and still leads to pretty good results.
  • I probably overestimated the extent to which motivated/socially-pressured life choices are bad, for a couple reasons. I discounted the benefit of having people do a diversity of things, even if the way they came to be doing those things wasn’t purely rational. I also discounted the cost of doing something EA tells you to do instead of something you also want to do.
  • For instance, suppose for the sake of argument that there’s a pretty strong EA case that politics isn’t very good (I know this isn’t actually true). It’s probably good for marginal EAs to be dissuaded from going into politics by this, but I think it would still be bad for every single EA to be dissuaded from going into politics, for two reasons. First, the arguments against politics might turn out to be wrong, and having a few people in politics hedges against that case. Second, it’s much easier to excel at something you’re motivated by, and the category of “people who are excellent at what they do” is probably as important to the EA movement as “people doing job X” for most X.

I also just haven’t noticed as much pretending-stuff going on in the last few months, so maybe we’re just getting better at avoiding it (or maybe I’m getting worse at noticing it). Anyway, I still definitely think there’s pretending-to-actually-try going on, but I don’t think it’s a huge problem.

Pablo: In another section of that critique, you express surprise at the fact that so many effective altruists donate to global health causes now. Why would you expect EAs to use their money in other ways–whether donating now to other causes, or donating later–and what explains, in your opinion, this focus on causes for which we have relatively good data?

Ben: I’m no longer sure enough of where people’s donations are going to say with certainty that too much is going to global health. My update here comes from a combination of being overconfident when I wrote the piece, and what looks like an increase in waiting to donate shortly after I wrote it. The latter was probably due in large part to AMF’s delisting, and perhaps the precedent set by GiveWell employees, many of whom waited last year (though others argued against it). (Incidentally, I’m excited about the projects going on to make this more transparent, e.g. the questions on the survey about giving!)

The giving now vs. later debate has been ably summarized by Julia Wise on the EA blog. My sense from reading various arguments for both sides is that I more often see bad arguments for giving now. There are definitely good arguments for giving at least some money now, but on balance I suspect I’d like to see more saving. Again, though, I don’t have a great idea of what people’s donation behavior actually is; my samples could easily be biased.

I think my strongest impression right now is that I suspect we should be exploring more different ways to use our donations. For instance, some people who are earning to give have experimented with funding people to do independent research, which was a pretty cool idea. Off the top of my head, some other things we could try include scholarships, essay contest prizes, career assistance for other EAs, etc. In general it seems like there are tons of ways to use money to improve the world, many of which haven’t been explored by GiveWell or other evaluators and many of which don’t even fall in the category of things they care about (because they’re too small or too early-stage or something), but we should still be able to do something about them.

Pablo: In the concluding section of your essay, you propose that self-awareness be added to the list of principles that define effective altruism. Any thoughts on how to make the EA movement more self-aware?

Ben: One thing that I like to do is think about what our blind spots are. I think it’s pretty easy to look at all the stuff that is obviously a bad idea from an EA point of view, and think that our main problem is getting people “on board” (or even “getting people to admit they’re wrong”) so that they stop acting on obviously bad ideas. And that’s certainly helpful, but we also have a ways to go just in terms of figuring things out.

For instance, here’s my current list of blind spots–areas where I wish there were a lot more thinking and idea-spreading going on than there currently is:

  • Being a good community. The EA community is already having occasional growing pains, and this is only going to get worse as we gain steam e.g. with Will MacAskill’s upcoming book. And beyond that, I think that ways of making groups more effective (as opposed to individuals) have a lot of promise for making the movement better at what we do. Many, many intellectual groups fail to accomplish their goals for basically silly reasons, while seemingly much worse groups do much better on this dimension. It seems like there’s no intrinsic reason we should be worse than, say, Mormons at building an effective community, but we’re clearly not there yet. I think there’s absolutely huge value in getting better at this, yet almost no one putting in a serious concerted effort.
  • Knowing history. Probably as a result of EA’s roots in math/philosophy, my impression is that our average level of historical informedness is pretty low, and that this makes us miss some important pattern-matches and cues. For instance, I think a better knowledge of history could help us think about capacity-building interventions, policy advocacy, and community building.
  • Fostering more intellectual diversity. Again because of the math/philosophy/utilitarianism thing, we have a massive problem with intellectual monoculture. Of my friends, the ones I now most enjoy talking with about altruism are largely those who associate least with the broader EA community, because they have more interesting and novel perspectives.
  • Finding individual effective opportunities. I suspect that there’s a lot of room for good EA opportunities that GiveWell hasn’t picked up on because they’re specific to a few people at a particular time. Some interesting stuff has been done in this vein in the past, like funding small EA-related experiments, funding people to do independent secondary research, or giving loans to other EAs investing in themselves (at least I believe this has been done). But I’m not sure if most people are adequately on the lookout for this kind of opportunity.

(Since it’s not fair to say “we need more X” without specifying how we get it, I should probably also include at least one anti-blind spot that I think we should be spending fewer resources on, on the margin: object-level donations to e.g. global health causes. I feel like we may be hitting diminishing returns here. Probably donating some is important for signaling reasons, but I think it doesn’t have a very high naive expected value right now.)

Pablo: Finally, what are your plans for the mid-term future?  What EA-relevant activities will you engage in over the next few years, and what sort of impact do you expect to have?

Ben: A while ago I did some reflecting and realized that most of the things I did that I was most happy about were pretty much unplanned–they happened not because I carefully thought things through and decided that they were the best way to achieve some goal, but because they intuitively seemed like a cool thing to do. (Things in this category include starting a blog, getting involved in the EA/rationality communities, running Harvard Effective Altruism, getting my current job, etc.) As a result, I don’t really have “plans for the mid-term future” per se. Instead, I typically make decisions based on intuitions/heuristics about what will lead to the best opportunities later on, without precisely knowing (or even knowing at all, often) what form those opportunities will take.

So I can’t tell you what I’ll be doing for the next few years–only that it will probably follow some of my general intuitions and heuristics:

  • Do lots of things. The more things I do, the more I increase my “luck surface area” to find awesome opportunities.
  • Do a few things really well. The point of this heuristic is hopefully obvious.
  • Do things that other people aren’t doing–or more accurately, things that not enough people are doing relative to how useful or important they are. My effort is most likely to make a difference in an area that is relatively under-resourced.

I’d like to take a moment here to plug the conference call on altruistic career choice that Holden Karnofsky of GiveWell had, which makes some great specific points along these lines.

Anyway, that’s my long-winded answer to the first part of this question. As far as EA-relevant activities and impacts, all the same caveats apply as above, but I can at least go over some things I’m currently interested in:

  • Now that I’m employed full-time, I need to start thinking much harder about where exactly I want to give: both what causes seem best, and which interventions within those causes. I actually currently don’t have much of a view on what I would do with more unrestricted funds.
  • Related to the point above about self-awareness, I’m interested in learning some more EA-relevant history–how previous social movements have worked out, how well various capacity-building interventions have worked, more about policy and the various systems that philanthropy comes into contact with, etc.
  • I’m interested to see to what extent the success of Harvard Effective Altruism can be sustained at Harvard and replicated at other universities.

I also have some more speculative/gestational interests–I’m keeping my eye on these, but don’t even have concrete next steps in mind:

  • I think there may be under-investment in healthy EA community dynamics, preventing common failure modes like unfriendliness, resistance to new ideas, groupthink, etc.–though I can’t say for sure because I don’t have a great big-picture perspective of the EA community.
  • I’m also interested in generally adding more intellectual/epistemic diversity to EA–we have something of a monoculture problem right now. Anecdotally, there are a number of people who I think would have a really awesome perspective on many problems that we face, but who get turned off of the community for one reason or another.

Crossposted from Pablo’s blog

Audio recordings from Good Done Right available online


This July saw the first academic conference on effective altruism. The three-day event took place at All Souls College, one of the constituent colleges of the University of Oxford. The conference featured a diverse range of speakers addressing issues related to effective altruism in a shared setting. It was a fantastic opportunity to share insights and ideas from some of the best minds working on these issues.

I’m very pleased to announce that audio recordings from most of the talks are now available on the conference website, alongside speakers’ slides (where applicable). I’m very grateful to all of the participants for their fantastic presentations, and to All Souls College and the Centre for Effective Altruism for supporting the conference.

Crossposted from the Giving What We Can blog

‘Special Projects’ at the Centre for Effective Altruism


This is a short overview of a talk that I gave alongside William MacAskill and Owen Cotton-Barratt at the Centre for Effective Altruism Weekend Away last weekend.  This post does not contain new information for people familiar with the Centre for Effective Altruism’s work.  

New projects at the Centre for Effective Altruism are incubated within the Special Projects team.  We carry out a number of activities before choosing which ones to scale up.  The projects that we are currently working on are listed below.

The Global Priorities Project is a joint research initiative between the Future of Humanity Institute at the University of Oxford and the Centre for Effective Altruism.  It attempts to prioritise between the pressing problems currently facing the world in order to establish in which areas we might have the most impact.  You can read more about the project here.

Through the Global Priorities Project we are also engaged in policy advising for the UK Government.  Our first report to be published under this initiative is on unprecedented technological risk.  Our team regularly visits Government departments and No. 10 Downing Street to discuss policy proposals that we are developing as part of this work.

We are also scaling up our effective altruism outreach.  As part of this work we are developing a landing page for people new to effective altruism.  We are also developing outreach activities to coincide with the release of multiple books on effective altruism in 2015, including one by our co-founder William MacAskill, which will be published by Penguin in the USA and by Guardian Faber (the publishing house of the national newspaper) in the UK.

We have also launched Effective Altruism Ventures, a commercial company that will hold the rights to William MacAskill’s upcoming book, which will also engage in outreach activities related to effective altruism.  This company is not part of the Centre for Effective Altruism.

If you have any questions about any of these projects, please do not hesitate to contact me or ask in the comments below.

The timing of labour aimed at reducing existential risk


Crossposted from the Global Priorities Project

Work towards reducing existential risk is likely to happen over a timescale of decades. For many parts of this work, the benefits of that labour are greatly affected by when it happens. This has a large effect when it comes to strategic thinking about what to do now in order to best help the overall existential risk reduction effort. I look at the effects of nearsightedness, course setting, self-improvement, growth, and serial depth, showing that there are competing considerations which make some parts of labour particularly valuable earlier, while others are more valuable later on. We can thus improve our overall efforts by encouraging more meta-level work on course setting, self-improvement, and growth over the next decade, with more of a focus on the object-level research on specific risks to come in decades beyond that.


Suppose someone considers AI to be the largest source of existential risk, and so spends a decade working on approaches to make self-improving AI safer. It might later become clear that AI was not the most critical area to worry about, or that this part of AI was not the most critical part, or that this work was going to get done anyway by mainstream AI research, or that working on policy to regulate research on AI was more important than working on AI. In any of these cases she wasted some of the value of her work by doing it now. She couldn’t be faulted for lack of omniscience, but she could be faulted for making herself unnecessarily at the mercy of bad luck. She could have achieved more by doing her work later, when she had a better idea of what was the most important thing to do.

We are nearsighted with respect to time. The further away in time something is, the harder it is to perceive its shape: its form, its likelihood, the best ways to get purchase on it. This means that work done now on avoiding threats in the far future can be considerably less valuable than the same amount of work done later on. The extra information we have when the threat is up close lets us more accurately tailor our efforts to overcome it.

Other things being equal, this suggests that a given unit of labour directed at reducing existential risk is worth more the later in time it comes.

Course setting, self-improvement & growth

As it happens, other things are not equal. There are at least three major effects which can make earlier labour matter more.

The first of these is if it helps to change course. If we are moving steadily in the wrong direction, we would do well to change our course, and this has a larger benefit the earlier we do so. For example, perhaps effective altruists are building up large resources in terms of specialist labour directed at combatting a particular existential risk, when they should be focusing on more general purpose labour. Switching to the superior course sooner matters more, so efforts to determine the better course and to switch onto it matter more the earlier they happen.

The second is if labour can be used for self-improvement. For example, if you are going to work to get a university degree, it makes sense to do this earlier in your career rather than later as there is more time to be using the additional skills. Education and training, both formal and informal, are major examples of self-improvement. Better time management is another, and so is gaining political or other influence. However this category only includes things that create a lasting improvement to your capacities and that require only a small upkeep. We can also think of self-improvement for an organisation. If there is benefit to be had from improved organisational efficiency, it is generally better to get this sooner. A particularly important form is lowering the risk of the organisation or movement collapsing, or cutting off its potential to grow.

The third is if the labour can be used to increase the amount of labour we have later. There are many ways this could happen, several of which give exponential growth. A simple example is investment. An early hour of labour could be used to gain funds which are then invested. If they are invested in a bank or the stock market, one could expect a few percent real return, letting you buy twice as much labour two or three decades later. If they are invested in raising funds through other means (such as a fundraising campaign) then you might be able to achieve a faster rate of growth, though probably only over a limited number of years until you are using a significant fraction of the easy opportunities.
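The doubling claim above is just compound interest: at a constant annual real return r, purchasing power doubles after ln 2 / ln(1 + r) years. A minimal sketch (my own toy numbers, not from the post) confirms that "a few percent real return" does indeed double the labour you can buy in roughly two to three decades:

```python
import math

def doubling_time(rate: float) -> float:
    """Years for invested funds to double at a constant annual real return."""
    return math.log(2) / math.log(1 + rate)

# A few percent real return doubles purchasing power in ~2-3 decades,
# matching the claim in the text.
for rate in (0.02, 0.03, 0.05):
    print(f"{rate:.0%} real return doubles funds in ~{doubling_time(rate):.0f} years")
```

At 2% the doubling time is about 35 years and at 5% about 14, which is why the faster (if shorter-lived) growth rates from fundraising or movement building can dominate ordinary investment returns.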

A very important example of growth is movement building: encouraging other people to dedicate part of their own labour or resources to the common cause, part of which will involve more movement building. This will typically have an exponential improvement with the potential for double digit percentage growth until the most easily reached or naturally interested people have become part of the movement, at which point it will start to plateau. An extra hour of labour spent on movement building early on could very well produce a hundred extra hours of labour to be spent later. Note that there might be strong reasons not to build a movement as quickly as possible: rapid growth could involve lowering the signal-to-noise ratio in the movement, or changing its core values, or making it more likely to collapse, and this would have to be balanced against the benefits of growth sooner.

If the growth is exponential for a while but will spend a lot of time stuck at a plateau, it might be better in the long term to think of it like self-improvement. An organisation might have been able to raise $10,000 of funds per year after costs before the improvement, and then gains the power to raise $1,000,000 of funds per year afterwards — only before it hits the plateau does it have the exponential structure characteristic of growth.

Finally, there is a matter of serial depth. Some things require a long succession of stages, each of which must be complete before the next begins. If you are building a skyscraper, you will need to build the structure for one story before you can build the structure for the next. You will therefore want to allow enough time for each of these stages to be completed, and might need to have some people start building soon. Similarly, if a lot of novel and deep research needs to be done to avoid a risk, this might involve such a long pipeline that it could be worth starting it sooner to avoid the diminishing marginal returns that might come from labour applied in parallel. This effect is fairly common in computation and labour dynamics (see The Mythical Man-Month), but it is the factor that I am least certain of here. We obviously shouldn’t hoard research labour (or other resources) until the last possible year, and so there is a reason based on serial depth to do some of that research earlier. But it isn’t clear how many years ahead of time it needs to start getting allocated (examples from the business literature seem to have a time scale of a couple of years at most) or how this compares to the downsides of accidentally working on the wrong problem.


We have seen that nearsightedness can provide a reason to delay labour, while course setting, self-improvement, growth, and serial depth provide reasons to use labour sooner. In different cases, the relative weights of these reasons will change. The creation of general purpose resources such as political influence, advocates for the cause, money, or earning potential is especially resistant to the nearsightedness problem, as these have more flexibility to be applied to whatever the most important final steps happen to be. Creating general purpose resources, or doing course setting, self-improvement, or growth, are thus comparatively better to do in the earlier times. Direct work on the cause is comparatively better to do later on (with a caveat about allowing enough time for the required serial depth).

In the case of existential risk, I think that many of the percentage points of total existential risk lie decades or more in the future. There is quite plausibly more existential risk in the 22nd century than in the 21st. For AI risk, in the recent FHI survey of 174 experts, the median estimate for when there would be a 50% chance of reaching roughly human-level AI was 2040. For the subgroup of those who are part of the ‘Top 100’ researchers in AI, it was 2050. This gives something like 25 to 35 years before we think most of this risk will occur. That is a long time, and will produce a large nearsightedness problem for conducting specific research now and a large potential benefit for course setting, self-improvement, and growth. Given a portfolio of labour to reduce risk over that time, it is particularly important to think about moving types of labour towards the times where they have a comparative advantage. If we are trying to convince others to use their careers to help reduce this risk, the best career advice might change over the coming decades from help with movement building or course setting, to accumulating more flexible resources, to doing specialist technical work.

The temporal location of a unit of labour can change its value by a great deal. It is quite plausible that due to nearsightedness, doing specific research now could have less than a tenth the expected value of doing it later, since it could so easily be on the wrong risk, or the wrong way of addressing the risk, or would have been done anyway, or could have been done more easily using tools people later build etc. It is also quite plausible that using labour to produce growth now, or to point us in a better direction, could produce ten times as much value. It is thus pivotal to think carefully about when we want to have different kinds of labour.

I think that this overall picture is right and important. However, I should add some caveats. We might need to do some specialist research early on in order to gain information about whether the risk is credible or which parts to focus on, to better help us with course setting. Or we might need to do research early in order to give research on risk reduction enough academic credibility to attract a wealth of mainstream academic attention, thereby achieving vast growth in terms of the labour that will be spent on the research in the future. Some early object level research will also help with early fundraising and movement building — if things remain too abstract for a long time, it would be extremely difficult to maintain a movement. But in these examples, the overall picture is the same. If we want to do early object-level research, it is because of its instrumental effects on course setting, self-improvement, and growth.

The writing of this document and the thought that preceded it are an example of course setting: trying to significantly improve the value of the long-term effort in existential risk reduction by changing the direction we head in. I think there are considerable gains here and as with other course setting work, it is typically good to do it sooner. I’ve tried to outline the major systematic effects that make the value of our labour vary greatly with time, and to present them qualitatively. But perhaps there is a major effect I’ve missed, or perhaps some big gains by using quantitative models. I think that more research on this would be very valuable.

On ’causes’


Crossposted from the Global Priorities Project

This post has two distinct parts. The first explores the meanings that have been attached to the term ‘cause’, and suggests my preferred usage. The second makes use of these distinctions to clarify the claims I made in a recent post on the long-term effects of animal welfare improvements.

On the meaning of ‘cause’

There are at least two distinct concepts which could reasonably be labelled a ‘cause’:

  1. An intervention area, i.e. a cluster of interventions which are related and share some characteristics. It is often the case that improving our understanding of some intervention in this area will improve our understanding of the whole area. We can view different-sized clusters as broader or narrower causes in this sense. GiveWell has promoted this meaning. Examples might include: interventions to improve health in developing countries; interventions giving out leaflets to change behaviour.
  2. A goal, something we might devote resources towards optimising. Some causes in this sense might be useful instrumental sub-goals for other causes. For example, “minimise existential risk” may be a useful instrumental goal for the cause “make the long-term future flourish”. When 80,000 Hours discussed reasons to select a cause, they didn’t explicitly use this meaning, but many of their arguments relate to it. A cause of this type may be very close to one of the first type, but defined by its goal rather than its methods: for example, maximising the number of quality-adjusted life-years lived in developing countries. Similarly, one could think of a cause as a problem one can work towards solving.

These two characteristics often appear together, so we don’t always need to distinguish. But they can come apart: we can have a goal without a good idea of what intervention area will best support that goal. On the other hand, one intervention area could be worthwhile for multiple different goals, and it may not be apparent what goal an intervention is supposed to be targeting. Below I explain how these concepts can diverge substantially.

Which is the better usage? Or should we use the word for both meanings? (Indeed there may be other possible meanings, such as defining a cause by its beneficiaries, but I think these are the two most natural.) I am not sure about this and would be interested in comments from others towards finding the most natural community norm. The key questions are whether we need to distinguish the concepts, and if we do, which is more frequently the useful one to think of, and what other names fit them well.

My personal inclination is that when the meanings coincide we can, of course, use the one word, and that when they come apart it is better to use the second. This is because I think conversations about choosing a cause are generally concerned with the second, and because “intervention area” is a good alternative term for the first meaning, while we lack such a good alternative for the second.

Conclusions about animals

In a recent post I discussed why the long-term effects of animal welfare improvements in themselves are probably small. A question we danced around in the comments is whether this meant that animal welfare was not the best cause. Some felt it did not, because of various plausible routes to impact from animal welfare interventions. I was unsure because the argument did appear to show this, but the rebuttals were also compelling.

My confusion was stemming, at least in part, from the term ‘cause’ being overloaded.

Now that I see that more clearly I can explain exactly what I am and am not claiming.

In that post, I contrasted human welfare improvements, which have many significant indirect and long-run effects, with animal welfare improvements, which appear not to. That is not to say that interventions which improve animal welfare do not have these large long-run effects, but that the long-run effects of such interventions are enacted via shifts in the views of humans rather than directly via the welfare improvement.

I believe that the appropriate conclusion is that “improve animal welfare” is extremely unlikely to be the best simple proxy for the goal “make the long-term future flourish”. In particular, it is likely dominated by the proxy “increase empathy”. So we can say with confidence that improving animal welfare is not the best cause in the second sense (whereas it may still be a good intervention area). In contrast, we do not have similarly strong reasons to think “improve human welfare” is definitely not the best approach.

Two things I am not claiming:

  • That improving human welfare is a better instrumental sub-goal for improving the long-term future than improving animal welfare.
  • That interventions which improve animal welfare are not among the best available, if they also have other effects.

If you are not persuaded that it’s worth optimising for the long term rather than the short term, the argument won’t be convincing. If you are, though, I think you should not adopt animal welfare as a cause in the second sense. I am not arguing against ‘increase empathy’ as possibly the top goal we can target (although I plan to look more deeply into comparing it with other goals), and it may be that ‘increase vegetarianism’ is a useful way to increase empathy. But we should keep an open mind, and if we adopt ‘increase empathy’ as a goal we should look for the best ways to do so, whether or not they relate to animal welfare.