METR

Model Evaluation and Threat Research

Popular repositories

  1. eval-analysis-public (Public)

    Public repository containing METR's DVC pipeline for eval data analysis

    Python, 246 stars, 46 forks

  2. task-standard (Public)

    METR Task Standard

    TypeScript, 178 stars, 36 forks

  3. vivaria (Public)

    Vivaria is METR's tool for running evaluations and conducting agent elicitation research.

    TypeScript, 135 stars, 38 forks

  4. RE-Bench (Public)

    Python, 134 stars, 18 forks

  5. public-tasks (Public)

    HTML, 121 stars, 19 forks

  6. inspect-action (Public)

    Running UK AISI's Inspect in the Cloud

    Python, 23 stars, 10 forks

Repositories

Showing 10 of 57 repositories

People

This organization has no public members.
