Zach Stein-Perlman

Looking for new projects
4777 karma · Working (0-5 years) · Berkeley, CA, USA

Bio


AI strategy & governance. ailabwatch.org.

Comments (439)


I claim that public information is very consistent with a world where the investors hold an axe over the Trust: maybe the Trust will cause the Board to be slightly better, or the investors will abrogate the Trust, or the Trustees will loudly resign at some point; regardless, the Trust is very subordinate to the investors and won't be able to do much.

And if so, I think it's reasonable to describe the Trust as "maybe powerless."

Maybe. Note that they sometimes brag about how independent the Trust is and how some investors dislike it, e.g. Dario:

Every traditional investor who invests in Anthropic looks at this. Some of them are just like, whatever, you run your company how you want. Some of them are like, oh my god, this body of random people could move Anthropic in a direction that's totally contrary to shareholder value.

And I've never heard someone from Anthropic suggest this.

I agree such commitments are worth noticing, and I hope OpenAI and other labs make such commitments in the future. But this commitment is not huge: it's just "20% of the compute we've secured to date" (as of July 2023), to be used "over the next four years." It's unclear how much compute this is, and with compute use increasing exponentially it may be quite little by 2027. Possibly you have private information, but based on public information, the minimum consistent with the commitment is quite little.
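To make "quite little by 2027" concrete, here is a rough back-of-the-envelope sketch in Python. The growth factor and baseline are hypothetical placeholders rather than OpenAI figures; the point is only that, under exponential growth, 20% of mid-2023 secured compute spread evenly over four years shrinks to a small share of a later year's total.

    # Back-of-the-envelope: how big is "20% of compute secured by July 2023,
    # used over the next four years" relative to each year's total compute?
    # All numbers are hypothetical placeholders, not OpenAI figures.
    secured_2023 = 1.0                 # normalize compute secured as of July 2023
    annual_growth = 3.0                # assumed yearly growth factor in compute use
    commitment = 0.2 * secured_2023    # total safety compute promised
    per_year = commitment / 4          # if spent evenly across 2023-2027

    for year in range(2023, 2028):
        total = secured_2023 * annual_growth ** (year - 2023)  # assumed total that year
        print(f"{year}: safety compute ~ {per_year / total:.1%} of that year's total")

With those placeholder numbers, the committed compute is about 5% of the 2023 total but well under 1% of the 2027 total.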

It would be great if OpenAI or others committed 20% of their compute to safety! Even 5% would be nice.

In November, leading AI labs committed to sharing their models before deployment to be tested by the UK AI Safety Institute.

I suspect Politico hallucinated this / there was a game-of-telephone phenomenon. I haven't seen a good source on this commitment. (But I also haven't heard people at labs say "there was no such commitment.")

The original goal involved getting attention. Weeks ago, I realized I was not on track to get it. I launched without a sharp object-level goal, largely to get feedback that would help me figure out whether to continue working on this project and what its goals should be.

I share this impression. Unfortunately it's hard to capture the quality of labs' security with objective criteria based on public information. (I have disclaimers about this in 4-6 different places, including the homepage.) I'm extremely interested in suggestions for criteria that would capture the ways Google's security is good.

Not necessarily. But:

  1. There are opportunity costs and other tradeoffs involved in making the project better along public-attention dimensions.
  2. The current version is bad at getting public attention; even making it 1000x better on that dimension would still leave it with little. It's likely better to wait for a different project that's better positioned and more focused on getting public attention. And as I said, I expect such a project to appear soon.

Yep. But in addition to being simpler, the version of this project optimized for getting attention has other differences:

  • Its criteria are better justified, command broader agreement, and are less focused on x-risk
  • It's done—or at least endorsed and promoted—by a credible org
  • The scoring is done by legible experts and ideally according to a specific process

Even if I could do this, it would be effortful and costly and imperfect and there would be tradeoffs. I expect someone else will soon fill this niche pretty well.

  1. Yep, that's related to my "Give some third parties access to models to do model evals for dangerous capabilities" criterion. See here and here.
  2. As I discuss here, it seems DeepMind shared super limited access with UKAISI (only access to a system with safety training + safety filters), so don't give them too much credit.
  3. I suspect Politico is wrong and the labs never committed to give early access to UKAISI. (I know you didn't assert that they committed that.)