
How do we test when autonomous AI might become a catastrophic risk? One approach is to assess the capabilities of current AI systems in performing tasks relevant to self-replication and R&D. METR (formerly ARC Evals), a research group focused on this question, has:

  • developed a Task Standard, a standardized structure for specifying "tasks" in code to test language models, currently used by the UK AI Safety Institute
  • awarded substantial bounties to researchers developing new tasks for current language models

Now, you have the chance to directly contribute to this important AI safety research. We invite you to join the Code Red Hackathon, an event hosted by Apart in collaboration with METR, where you can earn money, connect with experts, and help create tasks to evaluate frontier AI systems. Sign up here for the event this weekend on March 22-24!

A short introduction to testing AI

The risks associated with misuse of capable, autonomous AI are significant. By creating "tasks"[1] for frontier models we can test some of the capabilities relevant to autonomous self-replication and R&D. Example tasks might include:

  • Setting up a system to automatically monitor a GPU seller's website and send a notification when GPUs become available
  • Creating a list of email addresses and descriptions for all employees authorized to make purchases above $10k from 20 companies
  • Setting up Tor or a VPN on a new server to ensure network traffic cannot be tracked

As these examples suggest, an AI that can carry out such tasks on its own could acquire resources and cover its tracks, which would make its behavior much harder to oversee.

The Task Standard provides a plug-and-play early warning system for these abilities and follows a standardized format. A task family (a set of tasks) consists of:

  1. A Python file called $TASK_FAMILY_NAME.py;
  2. Any number of other Python files, shell scripts, etc. that $TASK_FAMILY_NAME.py imports; and
  3. Other files, called "assets", that will be used to set up the task environment.
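To make the structure above concrete, here is a minimal sketch of what a `$TASK_FAMILY_NAME.py` file might look like for a toy family, which we will call `reverse_word` (so the file would be `reverse_word.py`). The method names `get_tasks`, `get_instructions`, and `score`, the `standard_version` value, and the `Task` fields are assumptions made for illustration. Check them against the official Task Standard template before relying on them.

```python
from typing import TypedDict


class Task(TypedDict):
    """Parameters for a single task in the family (hypothetical field)."""
    word: str


class TaskFamily:
    """Toy task family: the agent must reverse a given word."""

    # Placeholder version string; use the value the official template specifies.
    standard_version = "0.1.0"

    @staticmethod
    def get_tasks() -> dict[str, Task]:
        # Each key names one task; each value holds that task's parameters.
        return {
            "short": {"word": "abc"},
            "long": {"word": "evaluation"},
        }

    @staticmethod
    def get_instructions(t: Task) -> str:
        # The prompt shown to the agent. Keep it self-contained and unambiguous.
        return f"Reverse the word '{t['word']}' and submit only the result."

    @staticmethod
    def score(t: Task, submission: str) -> float:
        # Return 1.0 for a correct answer and 0.0 otherwise.
        # Surrounding whitespace is ignored so trivial formatting does not fail the agent.
        return 1.0 if submission.strip() == t["word"][::-1] else 0.0
```

A real task would usually also need setup logic and assets. This sketch only shows the three pieces every task needs in some form: a set of task variants, instructions for the agent, and a scoring rule.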

When creating a task, it's crucial to ensure that it is error-free, understandable for the agent, and not easily gameable. You can follow these steps, some of which METR rewards with prizes, to create a great task:

  1. Write up your ideas for which tasks related to autonomous capabilities you wish to test the language model on
    • A $20 prize will be awarded for high-quality ideas
  2. Create a specification for the task that includes the prompt, a text description of what the test-taker has access to, and a self-evaluation of the task
    • A $200 prize will be awarded for high-quality specifications (2-6 hours of work)
  3. Create the materials for the task (instructions, libraries, and tool access) and have a human run through the whole task with these exact materials and tools
  4. Implement the task in the task standard format, test it with a simple agent, and submit it!
    • The prize for high-quality implementations is 3x a human professional's salary for the task, plus a bonus. For example, a task that would take a human software engineer 10 hours could net you up to $4500 (expect 6-12 hours of work in addition to quality assurance)

Each of these steps is described in detail in the resources on the hackathon website.
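The implementation prize can be estimated with simple arithmetic. The sketch below is illustrative only: the function name is hypothetical, and the $150 hourly rate is an assumption back-solved from the $4500 example for a 10-hour task, treating that figure as containing no bonus. Actual rates and bonuses are decided by METR.

```python
def estimate_payout(
    human_hours: float,
    hourly_rate: float,
    multiplier: float = 3.0,
    bonus: float = 0.0,
) -> float:
    """Estimate an implementation prize as multiplier x (hours x rate) + bonus."""
    if human_hours < 0 or hourly_rate < 0:
        raise ValueError("hours and rate must be non-negative")
    return multiplier * human_hours * hourly_rate + bonus


# Assumed rate of $150/hour, chosen so a 10-hour task matches the $4500 example.
print(estimate_payout(human_hours=10, hourly_rate=150))  # 4500.0
```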

Joining the hackathon: Your chance to contribute

You might find creating an AI evaluation task daunting, but the Code Red Hackathon provides the perfect opportunity to dive in, with support from experts, clear guidelines, and the chance to earn significant money for your work. By joining us on March 22-24, you can:

  • Get inspired by a keynote from Beth Barnes at 19:00 UTC on Friday, March 22nd, where she will share insights from her extensive work on technical AI safety.
  • Develop a new task rapidly by using the METR Task Standard, example tasks, and other resources.
  • Connect with a global community of AI safety enthusiasts, including fellow participants, METR staff, and established researchers. You'll find a friendly, supportive environment to discuss ideas, get feedback, and build relationships.
  • Collaborate with other participants as quality assurance testers to refine and validate your task. Splitting the prize with your QA tester means you can focus on ideation and implementation while ensuring your task is robust.
  • Maximize your productivity by following our weekend schedule, which includes office hours with METR experts and opportunities for socializing.
  • Earn thousands of dollars for rigorous, creative tasks that help assess the state of the art in AI capabilities. Payouts are 3x a human professional's salary for the task, plus bonuses, so a task that takes a human software engineer 10 hours could pay out up to $4500, and you can submit multiple tasks.
  • Jump-start your ongoing involvement in AI safety research by connecting with the METR and Apart teams and getting publisher credit for any tasks used in the METR evaluations. Many of our participants go on to intern or work with leading AI safety organizations.

The Code Red Hackathon is a unique opportunity to contribute to critical AI safety research, connect with like-minded individuals, and potentially shape AI development. We encourage anyone passionate about AI safety to join us on March 22-24 and be part of this groundbreaking effort. Sign up now and let's work together to ensure a safer future for AI.

In addition to the Code Red Hackathon, Apart runs the Apart Lab fellowship, publishes original research, and hosts other research sprints. These initiatives aim to incubate research teams with an optimistic and action-focused approach to AI safety.

Extra tips for participants

The hackathon is designed to let people at all levels of technical experience meaningfully contribute to AI safety research. Keep these suggestions in mind to make the most of your experience:

  • You don't need to start from scratch. Implementing an existing task idea from METR's idea database is a great way to get familiar with the process and make a great contribution. Browse the database here.
  • There are many ways to contribute. If you're not comfortable with the coding aspects, you can still make a huge impact by submitting well-formulated task ideas and specifications.
  • Preparation pays off. To hit the ground running, we recommend browsing the task database, ideating, and choosing an idea to implement before the hackathon starts on Friday. You can even draft a specification or start on the implementation.
  • Keep it simple. Complicated task setups are more likely to cause issues for the AI agent and quality assurance testing. Whenever possible, have all the information the agent needs contained directly in the prompt or use publicly available internet resources.
  • Embrace iteration. Don't get stuck trying to perfect your first task. You will probably submit several drafts, get feedback from the METR team and other participants, and steadily hone the task over the weekend with the help of the QA tester.

Remember, the hackathon is a collaborative effort – don't hesitate to reach out to other participants and the organizing team for feedback and support throughout the weekend. We're all here to help each other!

  1. ^ A task in this context is a piece of code, together with supporting resources, that lets an agent attempt a specific challenge (such as extracting a password from a compiled program with varying levels of obfuscation) and be evaluated on its performance. Read more.
