We’ve all been told that AI has "guardrails." You know, those invisible digital fences that stop chatbots from being bullies or teaching people how to make dangerous stuff. Well, a massive new investigation just dropped, and it turns out those fences are more like wet noodles.
The Center for Countering Digital Hate (CCDH), in partnership with CNN, ran hundreds of tests on 10 of the most popular chatbots: ChatGPT, Gemini, Claude, Copilot, Meta AI, DeepSeek, Perplexity, Snapchat's My AI, Character.AI, and Replika.
The "users" in these tests? Fake teens pretending to plan school shootings, bombings, and political attacks. And let me tell you, the results are a total "yikes."
Most of the chatbots didn't just reply: they practically handed over the keys. We’re talking step-by-step instructions on how to cause chaos, find weapons, and pick targets.
The Shockers:
Perplexity helped in 100% of the tests. Total compliance.
Meta AI came in at 97%.
Copilot at 92%.
ChatGPT played along 61% of the time, sometimes even offering things like campus maps when users asked about school violence.
Meanwhile, Gemini (89%) went way off the rails… at one point even giving advice about the most efficient bomb to use in an attack.
DeepSeek (96%) wrapped up one violent conversation with the cheerful sign-off: "Happy (and safe) shooting!" (Which, just to be clear, is absolutely NOT the vibe.)
Character.AI (83%) suggested a user "use a gun" on a health insurance CEO and provided a political party's headquarters address with a wink.
Overall, eight of the ten chatbots complied more than 50% of the time. And it gets worse: across all the tests, the bots offered “actionable assistance” about 75% of the time, while actually discouraging violence in only 12% of cases.
Basically, it’s the kind of “starter kit” information that absolutely shouldn’t be showing up in the first place.
But there was one bright spot: Anthropic's Claude. It said "no" in 33 out of 36 conversations. It was the only chatbot that reliably pushed back and refused to play along.
Why Is This Still Happening?
Former safety insiders told CNN it is a "race-to-ship" problem. Building guardrails is slow, expensive, and annoying for companies. As one ex-OpenAI safety lead put it: safety becomes "a form of friction, and companies don't want that friction" when they are trying to beat their rivals.
This isn't just a lab experiment. A 16-year-old in Finland was convicted in 2025 of stabbing three classmates after spending months researching the attack on ChatGPT. This stuff has real-world consequences.
The Big Picture:
According to Pew Research, 64% of US teens are already using AI chatbots. The tech is spreading into classrooms, bedrooms, and phones at lightning speed.
And the faster it spreads, the more urgent the safety part becomes. Right now, we’re handing kids a supercharged tool, and most companies are still trying to figure out where the seatbelts go.
