Anthropic • The Midas Project

On October 15, Anthropic released an updated version of its Responsible Scaling Policy. From the announcement:

"This update introduces a more flexible and nuanced approach to assessing and managing AI risks while maintaining our commitment not to train or deploy models unless we have implemented adequate safeguards. Key improvements include new capability thresholds to indicate when we should upgrade our safeguards, refined processes for evaluating model capabilities and the adequacy of our safeguards (inspired by safety case methodologies), and new measures for internal governance and external input."

Announcement Blog Post

Changelog