OpenAI’s Trying (Really Hard) to Clean Up AI Testing.

Plus, master the art of editing and perfecting your content with AI!

Hello and welcome to The Automated, your AI tour guide.

You know how no one can agree on what makes one AI “better” than another?

Well, OpenAI’s had enough — they’re stepping in to fix it. They’ve even dropped a full-on plan to clean up AI testing once and for all.

Spoiler: It involves startups, real-world challenges, and yep, a bit of controversy.

Let’s break it down:

Here's what we have for you today

😲 OpenAI’s New Plan To Fix AI Testing Forever!

Okay, so real talk: OpenAI just came out and said what a lot of us have been thinking but no one wanted to say out loud — most AI benchmarks out there are basically trash. 

Not in those exact words, obviously, but the vibe was loud and clear.

They’re launching something new called the Pioneers Program because, apparently, nobody really knows how to properly measure what makes one AI model better than another anymore.

And honestly? Fair point. 

Right now, the popular benchmarks are either ridiculously niche (like, “Can your AI solve a PhD-level math problem in Latin?”) or just completely disconnected from what actual people care about.

Plus, a bunch of these tests can be gamed, which makes the results kinda... sus.

Really, it’s like grading a chef based on how fast they can solve a Rubik’s Cube: cool party trick, but completely useless when you’re starving.

So OpenAI’s solution? Team up with startups across industries like healthcare, law, insurance, finance — you know, the big leagues.

The goal is to build real-world evaluations that actually test how AI performs in high-stakes environments. Not just some theoretical IQ test for bots.

They’re kicking things off by partnering with a handful of startups — this first group will help set the new gold standard for what “good AI” should actually look like.

And bonus: these companies get to work directly with OpenAI’s team to fine-tune models for specific tasks (a.k.a. making AI not just smart, but actually useful).

But here’s the thing: not everyone’s gonna be clapping for this.

Some folks are already side-eyeing the fact that OpenAI — who, btw, would benefit a lot from these new benchmarks — is also the one creating them. It’s giving major “player and referee” energy, and people are wondering if that’s crossing an ethical line.

Like… are we really sure these benchmarks won’t be conveniently rigged to make OpenAI’s own models look like total rockstars?

Still, if it works, it could actually be a huge deal. AI desperately needs more practical, transparent standards — and OpenAI’s basically saying, “Fine, we’ll do it ourselves.”

So yeah. This either becomes the thing that finally makes AI testing make sense... or it blows up into another messy AI drama. 

Either way, pass the popcorn.

Our friends over at @MarketingAlec are building something pretty awesome — a 10k+ marketing community geeking out on AI tools and prompts that actually work (think 40%+ better performance).

They drop fresh tips every Wednesday and go deep every Friday. If you’re into working smarter, not harder, it’s definitely worth checking out.

👉 Take a look and join the crew at MarketingAlec.

Subscribe now and level up your marketing skills!

Find out why 1M+ professionals read Superhuman AI daily.

AI won't take over the world. People who know how to use AI will.

Here's how to stay ahead with AI:

  1. Sign up for Superhuman AI. The AI newsletter read by 1M+ pros.

  2. Master AI tools, tutorials, and news in just 3 minutes a day.

  3. Become 10X more productive using AI.

🥊 The Ultimate AI Showdown.

We’ve got our popcorn out, and you should too, because the drama between OpenAI and its estranged co-founder, Elon Musk, is far from over.

In a recent filing, OpenAI and CEO Sam Altman fired back at Musk with a countersuit, demanding he be stopped from causing more “unlawful and unfair” damage.

They argue that Musk’s attacks have already done significant harm, and if he continues, things could get even worse for OpenAI’s mission and its crucial relationships.

OpenAI’s legal team claims Musk’s recent moves — including a “fake takeover bid” — are part of a strategy to disrupt the company’s future.

Basically, their message to Musk is: cut it out, or bigger problems are coming.

Musk, on the other hand, hasn’t responded directly yet, but his lawsuit accuses OpenAI of abandoning its original nonprofit mission to benefit humanity.

For those of you who are a bit behind on the details, here’s the rundown:

  • OpenAI started as a nonprofit in 2015, shifted to a “capped-profit” structure in 2019, and now plans to restructure again into a public benefit corporation.

  • But Musk has been the stumbling block, adamantly opposing this shift and trying to block it at every turn.

In fact, he even sought a preliminary injunction to stop OpenAI’s transition to for-profit status.

A judge shot that down in March, but the case is still set for a jury trial in spring 2026.

Amid all this, a coalition of groups (including labor unions and nonprofits) has urged California’s Attorney General to intervene and block OpenAI’s transition, arguing the company is no longer living up to its charitable mission and thereby putting AI safety at risk.

For OpenAI, the stakes are sky-high. They need to wrap up their for-profit conversion by the end of 2025 or risk losing some of the capital they’ve raised.

In light of that, OpenAI is fighting back, insisting that their nonprofit arm will remain — just better funded and primed to tackle major issues like healthcare, education, and science.

Oh, and they didn’t stop there. They threw some serious shade at Musk, claiming, "Elon’s never been about the mission but has always had his own agenda."

As we said before, this one is far from over. Stay tuned, because we’re heading into 2026 with a full-blown AI showdown.

🧱 Around The AI Block

  • ForceField validates your digital content to prove it hasn't been deepfaked or tampered with.

  • Storm turns your research topics into ready-to-read Wikipedia articles in seconds.

  • APIPark lets you manage all your AI services and APIs through a single gateway, making integration and deployment dramatically simpler.

  • ReadKidz helps you create children's picture books, stories and songs with AI-generated ideas, custom storyboards, and character designs.

  • ArcadeAI lets you design custom jewelry and gets skilled artisans to handcraft it for you.

🤖 ChatGPT Prompt Of The Day: Prompts for Editing And Refining.

Editing and refining are crucial steps in making your content clear and concise.

These prompts will guide you through improving your writing, whether it’s eliminating passive voice, adjusting readability, or trimming down word count without losing your message.

Think of this as your AI WoD (workout of the day)!

Fix your passive voice: “Find and fix passive voice in this passage while retaining the meaning.”

Change the reading level: “Rewrite this at a grade six reading level.” 

Cut the word count: “This should only be 800 words. Refine the text without removing any important content.”
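If you reuse these prompts a lot, a tiny script can save some copy-pasting. Here is a minimal sketch in Python, assuming you just want to glue one of the three prompts above onto a passage before pasting it into ChatGPT. The names `EDITING_PROMPTS` and `build_prompt` are made up for illustration, and nothing here calls an AI service.

```python
# Illustrative helper only: these names are hypothetical, not part of any library.
# The prompt wording comes from the three prompts above; the grade and the word
# limit are turned into placeholders so they can be changed per request.
EDITING_PROMPTS = {
    "passive_voice": "Find and fix passive voice in this passage while retaining the meaning.",
    "reading_level": "Rewrite this at a grade {grade} reading level.",
    "word_count": (
        "This should only be {max_words} words. "
        "Refine the text without removing any important content."
    ),
}


def build_prompt(kind: str, passage: str, **options) -> str:
    """Return the chosen editing instruction followed by the passage to edit."""
    instruction = EDITING_PROMPTS[kind].format(**options)
    return f"{instruction}\n\n{passage}"


if __name__ == "__main__":
    draft = "The report was written by the team."
    print(build_prompt("passive_voice", draft))
    print(build_prompt("reading_level", draft, grade="six"))
    print(build_prompt("word_count", draft, max_words=800))
```

Paste the printed text into your chat window as-is. Keeping the instruction on top and the passage below it makes it easy to swap in a different prompt without touching your draft.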

That's all we've got for you today.

Did you like today's content? We'd love to hear from you! Please share your thoughts on our content below👇

Your feedback means a lot to us and helps improve the quality of our newsletter.

🚀 Want your daily AI workout?

Premium members get daily video prompts, a premium newsletter, a no-ad experience, and more!

Premium members get:

  • 📽️ Get the daily AI WoD
  • ✅ Priority help with AI Troubleshooter
  • ✅ Thursday premium newsletter
  • ✅ No ad experience
  • ✅ and more...
