Agustín Covarrubias

AI Safety Group Support Lead @ Centre for Effective Altruism
1299 karma · Pursuing an undergraduate degree · Working (0-5 years) · Santiago, Santiago Metropolitan Region, Chile



I’m a generalist and open sourcerer who does a bit of everything, but perhaps nothing particularly well. I'm currently the AI Safety Group Support Lead at CEA.

I was previously a Software Engineer in the Worldview Investigations Team at Rethink Priorities.



I think this is fine: Epoch's work appeals to a broad audience, and Nat Friedman is a well-respected technologist.

I read your post while I was writing up the wiki article on Shapley values and thought it was really useful. Thanks for making that post!

Quick poll [✅ / ❌]: Do you feel like you don't have a good grasp of Shapley values, despite wanting to? 

(Context for after voting: I'm trying to figure out if more explainers of this would be helpful. I still feel confused about some of its implications, despite having spent significant time trying to understand it)

Can anyone who is more informed on NIST comment on whether high-quality comments tend to be taken into account? Are drafts open for comments often revised substantially in this way?

This is just a top-notch post. I love to see detailed analyses like this. Props.

At last, a biblically accurate Qualy The Lightbulb

[Opinion exclusively my own]

I think this framing has a lot of value. Funnily enough, I've heard tales of groups like this from the early days of EA groups, when people were just figuring things out, and this pattern would sometimes pop up.

I do want to push back a little bit on this:

But before deferring, I think it's important to realize that you're deferring, and to make sure that you understand and trust who or what you're deferring to (and perhaps to first have an independent impression). Many intro fellowship curricula (eg the EA handbook) come across more as manifestos than research agendas—and are often presented as an ideology, not as evidence that we can use to make progress on our research question.

The EA handbook (which nowadays is what the vast majority of groups use for their intro fellowships) includes three “principles-first” weeks (weeks 1, 2, and 7), which are meant to help participants develop their own opinions with the help of only the basic EA tools or concepts.

Furthermore, week 7 (“What do you think”) includes a reading on independent impressions, and learning that concept (and discussing where EA might be wrong) is one of the key objectives of the week:

A key concept for this session is the importance of forming independent impressions. In the long run, you’re likely to gain a deeper understanding of important issues if you think through the arguments for yourself. But (since you can’t reason through everything) it can still sometimes make sense to defer to others when you’re making decisions.

In the past, a lot of work has gone into calibrating how “principles-based” or “cause-oriented” intro fellowships should be, and I think the tradeoff can be steep for university groups, since people can quickly become disenchanted with abstract philosophical discussion about cause prioritization (as you mention). This can also lead to people treating EA as a purely intellectual exercise, instead of thinking of concrete ways in which EA ideas should (for example) change their career plans.

That said, I think there are other ways in which we could push groups further along in this direction, for example:

  • We could create programming (like fellowships or workshops) around cause prioritization, exploring different frameworks and tools in the field. Not just giving a taste of these frameworks (like the handbook does), but also teaching hands-on skills that participants can use for actual cause prioritization research.
  • We could push for more discussion centered around exploratory cause research, for example, by creating socials or events in which participants try to come up with cause candidates and do some preliminary research on how promising they are (i.e. a “cause exploration hackathon”).

I know there has been some previous work in this direction. For example, there's this workshop template, or this fellowship on global priorities research. But I think we don't yet have a killer demo, and I would be excited about groups experimenting with something like this.

X-Risk sentiment in the audience: at one point in the debate, one participant asked the audience who thought AI was an existential risk. From memory, around two-thirds of students put up their hands.

Do you have a rough sense of how many of these had interacted with AI Safety programming/content from your group? Like, was a substantial part of the audience just members from your group who had heard EA arguments about AIS?
