Remember when Grok 3 went full “MechaHitler”?

Yeah… rough.

xAI blamed the system prompt, hit the brakes on the chaos, and quietly scrubbed the mess from the timeline.

But instead of lying low, Elon Musk and his AI squad came back louder, dropping not one but two new models: Grok 4 and its beefier sibling, Grok 4 Heavy, a so-called “multi-agent” version that works like an AI-powered study group.

Think of it as a bunch of bots huddling together to solve a problem and compare answers like overachievers cramming before finals.

And truth be told? On paper, these models are kind of crushing it:

  • Grok 4 scored 25.4% on Humanity’s Last Exam, a tough crowdsourced benchmark, without tools, beating Google’s Gemini 2.5 Pro (21.6%) and OpenAI’s o3 model (21%).

  • Grok 4 Heavy, when given access to tools, leveled up hard with a whopping 44.4%, completely smoking the competition.

  • Grok 4 also smashed the ARC-AGI-2 benchmark, solving complex visual puzzles and nearly doubling the score of the next-best model (Claude Opus 4).

So yeah, performance-wise? Grok 4 is a beast.

But then... it got awkward. Again.

Multiple users and TechCrunch, who ran their own tests, found that Grok 4 seems lowkey obsessed with Elon Musk’s personal opinions. When asked about controversial topics like immigration, abortion, or global conflicts, it doesn’t just pull from neutral data.

It literally says: “Searching for Elon Musk views…”

Yup. That’s a real quote from the AI’s chain-of-thought logs.

Now, xAI says its mission is to build a “maximally truth-seeking AI” — but if that just means echoing Elon’s hot takes from X.com, we might need to rethink what “truth” even means here.

And just to make things spicier… they slapped a $300/month price tag on it.

Yup. Introducing: SuperGrok Heavy — xAI’s most premium plan yet. It gives subscribers early access to Grok 4 Heavy, new tools, and upcoming releases like:

  1. An AI coding model (dropping in August)

  2. A multi-modal agent (coming in September)

  3. A video-generation model (set for October)

That price tag? It makes Grok the most expensive subscription among the major AI providers. Bolder than Gemini. Pricier than ChatGPT Pro. And 100% more chaotic than Claude.

Meanwhile, xAI is trying to attract developers through its new API and pitching Grok to big businesses as a serious contender in the AI race, which, to be honest, is gonna be a tough sell…

Especially when your last model got caught cosplaying a fascist dictator, and your newest one feels like it’s channeling its founder’s tweets like some billionaire-powered Ouija board.

So where does that leave us?

Grok 4 is clearly powerful. No doubt. But when your “truth-seeking AI” starts quoting Elon to answer moral questions… we’re no longer in the lab.

We’re deep in the Elon-verse. And that’s not just frontier AI. That’s uncharted weird.
