Coding AIs Are Flopping at Debugging?!

Plus, how to create thumbnails that actually convert.

Hello and welcome to the Automated, your AI tour guide.

Remember when AI was supposed to replace devs? Yeah, not so fast.

Microsoft just gave AI coding a reality check, and spoiler alert: it’s not pretty.

Let’s talk about why these "coding AIs" still need a lot of work, and why human devs are still in the driver’s seat, at least for now.

Here's what we have for you today

🤖 Microsoft’s Critical Take on AI Coding

Okay, so you know how everyone’s been losing their minds over AI writing code?

Like, Sundar Pichai straight-up said that 25% of Google’s new code is now AI-generated. And Mark Zuckerberg? He’s already dreaming of an AI coding empire at Meta.

Well, reality just smacked that hype right in the face.

Microsoft Research decided to actually test how good these AI models are at fixing real bugs... and let’s just say, it’s giving "C+ student trying their best" energy.

Here’s what went down:

They grabbed fancy models like Anthropic’s Claude 3.7 Sonnet and OpenAI’s o3-mini, then threw 300 debugging tasks at them from a benchmark called SWE-bench Lite.

Oh, and they even gave them access to debugging tools like Python debuggers. (So yeah, they had help.)

But the results?

Honestly, kind of tragic:

  • Claude 3.7 Sonnet was the best kid in class, but even it only fixed about 48% of the bugs.

  • OpenAI’s o1 pulled a 30.2%.

  • o3-mini limped in at 22.1%.

In short: even the "smartest" model fixed fewer than half of the bugs, and most wouldn’t even get hired for an entry-level dev job.
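To put those percentages in perspective, here's a quick back-of-the-envelope sketch in Python. It assumes each success rate applies across all 300 tasks and rounds Claude's "about 48%" to a flat 48, so treat the counts as rough. The names `TOTAL_TASKS`, `success_rates`, and `approx_fixed` are made up for illustration and don't come from the study.

```python
# Rough conversion of the reported success rates into task counts.
# Assumption: each rate applies to all 300 benchmark tasks.
TOTAL_TASKS = 300

# Success rates (percent) as quoted above; 48.0 stands in for "about 48%".
success_rates = {
    "Claude 3.7 Sonnet": 48.0,
    "o1": 30.2,
    "o3-mini": 22.1,
}


def approx_fixed(rate_percent: float, total: int = TOTAL_TASKS) -> int:
    """Return the approximate number of tasks fixed at a given success rate."""
    return round(total * rate_percent / 100)


fixed_counts = {model: approx_fixed(rate) for model, rate in success_rates.items()}

for model, fixed in fixed_counts.items():
    unfixed = TOTAL_TASKS - fixed
    print(f"{model}: ~{fixed} of {TOTAL_TASKS} fixed, ~{unfixed} still broken")
```

Under those assumptions, even the top model leaves more than 150 of the 300 bugs unfixed.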

So why are they flopping?

  • A lot of models couldn’t even properly use the tools they were given. (It’s like handing someone a hammer and they still try fixing a wall with a spoon.)

  • But the bigger issue? Data scarcity. There’s just not enough training data showing how humans actually debug step-by-step. (AI: “How do you fix a bug?” Humans: “Well first, you cry a little. Then you Google it.”)

The researchers think if we train models with better "human debugging journey" data — like actual messy trial-and-error sessions — they might actually level up.

So what does this all mean?

  • AI isn’t ready to replace real developers.

  • It’s a helpful sidekick, not a superhero. (Think Robin, not Batman.)

  • Companies should probably chill a little on the “AI will code everything” fantasy.

Also, don’t lose sleep over your coding career — tech legends like Bill Gates, Replit’s CEO, and others are all saying human devs aren’t going anywhere.

And honestly... who else is going to clean up after these messy AI bugs?

You can catch the full details [here].

Our friends over at @MarketingAlec are building something pretty awesome — a 10k+ marketing community geeking out on AI tools and prompts that actually work (think 40%+ better performance).

They drop fresh tips every Wednesday and go deep every Friday. If you’re into working smarter, not harder, it’s definitely worth checking out.

👉 Take a look and join the crew at MarketingAlec.

Subscribe now and level up your marketing skills!

Sponsored
MarketingAlec: Skip the AI hype, get real results. Join 10,387+ marketers learning the AI tools and prompts that drove 40% better performance. Fresh tips Wed, deep dives Fri.

Find out why 1M+ professionals read Superhuman AI daily.

In 2 years you will be working for AI

Or an AI will be working for you

Here's how you can future-proof yourself:

  1. Join the Superhuman AI newsletter – read by 1M+ people at top companies

  2. Master AI tools, tutorials, and news in just 3 minutes a day

  3. Become 10X more productive using AI

Join 1,000,000+ pros at companies like Google, Meta, and Amazon that are using AI to get ahead.

🦾 ChatGPT’s Memory Upgrade!

OpenAI just dropped something new in ChatGPT, and it’s kind of a game changer.

Starting Thursday, they're rolling out a memory upgrade that lets ChatGPT actually remember (or reference) stuff from past chats to make its answers more personal and on-point.

Here’s what’s up:

  • The Feature: It’s called “reference chat history,” and it’ll help ChatGPT tailor its answers based on what you’ve talked about before. Whether you’re chatting, seeking advice, or generating images, ChatGPT will now remember your past chats.

  • Who’s Getting It: This is rolling out first to ChatGPT Pro and Plus subscribers (sorry, free users). But if you’re in the U.K., the EU, or certain other countries like Iceland, Liechtenstein, Norway, and Switzerland, you're gonna have to wait a bit, since OpenAI’s still working through the legal stuff there.

  • The Goal: OpenAI’s aiming to make conversations with ChatGPT more fluid and relevant. It’s like ChatGPT actually remembering you — your preferences, your previous topics, and your vibe — so no more explaining things all over again.

Not everyone’s cool with AI holding onto their info, though. That’s why OpenAI has added an opt-out option.

You can turn off the memory feature anytime in the settings, manage your saved memories, or even ask ChatGPT what it remembers.

And if you’re really not into the whole memory thing, you can opt for a Temporary Chat (kind of ChatGPT's version of incognito mode) where nothing gets stored.

The whole point is seamless interactions, where you no longer need to manually prompt ChatGPT to remember or forget. If you’ve enabled the memory option, the feature will kick in automatically.

But if you’re still unsure, click here to learn more about how to use and manage this new feature.

🎁 Your Free Gift Is Waiting…

Remember that sweet free gift we mentioned? It’s still up for grabs!

All you have to do is refer a friend (just one or two) and it’s all yours.

Sounds like a win-win, right? Hurry up before time runs out and we’ll see you on the winner’s side! 😉

🧱 Around The AI Block

  • Focus Buddy updates your to-do list in real-time, checks in to help you overcome procrastination, and provides weekly insights on your work habits.

  • Tattoon is an app that lets you demo what a tattoo would look like on your body to see if you like it.

  • Strella conducts customer interviews so you can gather their insights.

  • Command AI guides your users through your product with interactive tours, personalized nudges, and an assistant that demos features.

  • Anam lets you add lifelike digital humans to your product that can chat with customers in 32 languages.

🤖 ChatGPT Prompt Of The Day: Prompt for Promotional Video Thumbnails.

First impressions matter, especially when it comes to video content.

A compelling thumbnail can make all the difference in grabbing attention and driving engagement.

Here's a prompt to help you create eye-catching thumbnails that’ll stand out on platforms like YouTube and social media.

Think of it as your AI WoD (workout of the day)!

Generate a compelling thumbnail for a promotional video showcasing [specify the video content or campaign theme], designed to attract clicks and drive engagement on platforms like YouTube or social media.
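If you reuse this prompt a lot, here's a minimal Python sketch for filling in the bracketed placeholder. The helper name `build_thumbnail_prompt` and the example theme are hypothetical, just for illustration. It only builds the text, so paste the result into ChatGPT yourself.

```python
# Fill the prompt's placeholder with a specific video theme.
# PROMPT_TEMPLATE mirrors the prompt above; {video_theme} replaces the
# "[specify the video content or campaign theme]" placeholder.
PROMPT_TEMPLATE = (
    "Generate a compelling thumbnail for a promotional video showcasing "
    "{video_theme}, designed to attract clicks and drive engagement on "
    "platforms like YouTube or social media."
)


def build_thumbnail_prompt(video_theme: str) -> str:
    """Return the thumbnail prompt with the theme filled in (hypothetical helper)."""
    theme = video_theme.strip()
    if not theme:
        raise ValueError("video_theme must not be empty")
    return PROMPT_TEMPLATE.format(video_theme=theme)


# Example theme is made up for illustration.
example_prompt = build_thumbnail_prompt("a spring sale on handmade ceramic mugs")
print(example_prompt)
```

The more specific the theme (product, audience, mood), the less generic the thumbnail tends to be.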

That's all we've got for you today.

Did you like today's content? We'd love to hear from you! Please share your thoughts on our content below👇

What'd you think of today's email?

Your feedback means a lot to us and helps improve the quality of our newsletter.

🚀 Want your daily AI workout?

Premium members get daily video prompts, a premium newsletter, a no-ad experience, and more!

Premium members get:

  • 📽️ Get the daily AI WoD
  • ✅ Priority help with AI Troubleshooter
  • ✅ Thursday premium newsletter
  • ✅ No ad experience
  • ✅ and more...
