Here’s a sobering fact: artificial intelligence today is the worst it will ever be. From here on out, AI will only become more and more capable — and dangerous.
All the top AI developers know this. OpenAI’s charter explicitly says its goal is to develop “highly autonomous systems that outperform humans at most economically valuable work.” The CEO of Anthropic has said he expects AI models to be capable of autonomous replication, major cyberattacks, and the development of novel biological weapons within the decade. Google CEO Sundar Pichai has said that AI could be more profound than fire or the internet.
These claims make today’s AI technology — mainly capable of writing high school essays and generating uncanny artwork — feel like a mere dress rehearsal, with opening night just around the corner.
If a handful of tech companies want to prove to regulators, users, and the world that they can be trusted with the responsibility of developing incredibly powerful AI systems, now is their chance.
But so far, things aren’t looking good: Big Tech is already failing to live up to its own self-professed standards for responsible development.
In July of last year, seven leading tech companies — including OpenAI, Anthropic, Meta, Google, and Microsoft — made voluntary commitments to the White House that they would publish risk evaluation reports alongside all new public releases of major AI systems.
Fast-forward to this spring: OpenAI released a new, state-of-the-art AI model, GPT-4o. The promised risk evaluation report, however, was nowhere to be found. It was quietly added to the company’s website in August, three months behind schedule. Meanwhile, insiders claim that the internal risk evaluation behind the report was rushed through in a single week, over objections from staff, in order to meet a product deadline.
Those insiders spoke only on the condition of anonymity. One reason for this may have been that OpenAI, as Vox reported earlier this year, took an unusually aggressive approach to silencing ex-employees. Departing staff were presented with an offboarding agreement that threatened to revoke all of their vested equity unless they signed a lifetime non-disclosure and non-disparagement agreement. That agreement forbade them not only from ever speaking negatively about OpenAI, but even from acknowledging that the agreement existed.
To its credit, OpenAI walked back this policy after it was publicly revealed. But by now, nearly half of the safety staff who once worked at the company are reported to have left, many of them in dramatic fashion.
Microsoft, meanwhile, signed the same set of voluntary White House commitments, including the aforementioned promise to publish the results of dangerous capability evaluations alongside model releases. Yet in its biggest model release since then, the company appears to have tested only for undesirable model outputs.
Similarly, Microsoft tested an early version of GPT-4 without notifying its “deployment safety board,” a joint initiative between Microsoft and OpenAI meant to review and approve model releases of this sort. When asked about this, Microsoft denied it on the record, only to later reverse course and admit the mistake.
Google, Meta, and OpenAI also fought against the recent California bill SB-1047, a piece of proposed AI safety legislation that, among other things, would have required AI developers to release “safety plans” describing how they would mitigate the risks of future product releases. The opposition succeeded, despite OpenAI CEO Sam Altman claiming, under oath in congressional testimony, that he supports regulating the AI industry, and Google CEO Sundar Pichai claiming that AI regulation should come sooner rather than later.
Despite my better judgment, I’m somehow still open to the possibility that corporate self-regulation will be effective at mitigating the risks of advanced AI technology. But for the public to really have faith in this, tech companies need to prove their mettle.
Mitigating the risks of a superhuman AI model will be an extraordinary technical challenge. Publishing risk evaluation reports, refraining from restrictive NDAs, and supporting budding AI safety legislation are, in comparison, far simpler tasks. But leading AI developers have already failed to hit the mark.
We may have only a few years to prepare for AI that transforms society on a massive scale. If Big Tech wants to prove it is capable of shouldering that responsibility, it needs to be far more proactive about safety. It can’t fall into the same pattern it did in the social media age: consistently messing up and apologizing later.
AI companies will have many more opportunities to take significant, public, and costly actions that show they are seriously weighing the social impact of their products, not just their bottom line. With each step they take, the public will have more reason to trust them.
But the clock is ticking. If Big Tech wants to prove it’s up to the task, it had better start catching up.