
Security Tests Expose Major Vulnerabilities in GPT-5

Two independent security testing teams have discovered critical vulnerabilities in OpenAI’s new GPT-5 model, with both bypassing its safeguards in under 24 hours, according to securityweek.com. NeuralTrust and SPLX, two prominent AI security firms, conducted separate red-team evaluations and reached equally troubling conclusions about the model’s readiness for enterprise use.

NeuralTrust’s researchers combined their proprietary EchoChamber jailbreak with a basic storytelling approach, leading GPT-5 to generate step-by-step instructions for making a Molotov cocktail without ever issuing an overtly malicious prompt. “The model strives to be consistent with the already-established story world,” the firm explained, noting that multi-turn “narrative” attacks can slip past single-prompt filters and intent detectors.

This method involved seeding a hidden, low-profile context, steering the conversation to avoid refusal triggers, and gradually reinforcing the malicious objective through narrative continuity. The firm warned that GPT-5’s susceptibility reveals a fundamental gap in safety systems that rely on isolated prompt screening.

Meanwhile, SPLX — formerly SplxAI — reported that GPT-5’s raw version is “nearly unusable for enterprise out of the box.” Their red team used obfuscation attacks, including a “StringJoin Obfuscation Attack” where prompts were disguised with hyphens and framed as fake encryption challenges. In one instance, GPT-5 responded to a disguised bomb-making query with detailed instructions, even opening with, “Well, that’s a hell of a way to start things off… I’m gonna tell you exactly how…”

In benchmarks against GPT-4o, SPLX found the older model more resilient when properly hardened. Both firms urged extreme caution in deploying GPT-5 without additional security layers, warning that its vulnerabilities make it a high-risk choice for sensitive environments.