“Red Teaming in Public” is a project originally started by Nathan Labenz and Pablo Eder in June 2024. The goal is to catalyze a shift toward higher standards for AI developers.
Labenz shared the following details in the project’s announcement on X:
For context, we are pro-technology “AI Scouts” who believe in the immense potential of AI. Our team consists of AI enthusiasts, app developers, chatbot jailbreakers, cybersecurity experts, LLM stress testers, one “gonzo journalist”, and at least one self-described “e/acc”.
Our goal is to protect the public from abuse & the AI app industry from public backlash and heavy-handed government regulation. We do NOT intend to police speech, nor to suggest that application developers should either. Our concern is criminal or other egregious abuse.
Our initial focus is on 3 classes of AI products:
– Calling Agents
– Coding Agents
– Deep Fake Generators
These products don’t simply generate content for users, but act upon or otherwise directly affect unsuspecting people.
The risk of serious harm is real today, and this problem will grow as more powerful models like Llama-3 come online. We hope to catalyze a shift toward higher standards for AI apps before a step change in model capability 10Xs both their utility and potential for abuse.
Our plan is to:
– Notify companies of our plans (see below)
– Test a wide range of apps
– Document failure modes & notable best practices
– Report issues to developers privately before disclosing publicly
– Raise awareness publicly
– Hopefully help catalyze a “race to the top”
Throughout, we pledge to:
– follow the law
– avoid harm to innocent people
– be transparent about our intentions and methods
– share potential solutions with developers
– collect and popularize notable best practices
This is an all-volunteer effort. That could change in the future, but for now we are not fundraising, and we won’t be asking any of the developers whose apps we test for anything. Any costs we incur (likely minimal) will be covered out-of-pocket by a few of the participants.
The Midas Project is excited about efforts to collectively stress-test leading AI systems (we have recommended the same thing ourselves) and is optimistic about this new project.
To join Red Teaming in Public, you can apply via the Google form below.