Claude Code Safety Rules Can Be Bypassed with Long Subcommand Chains
Security researchers found that Anthropic's Claude Code agent will ignore its safety deny rules when burdened with a sufficiently long chain of subcommands.
Major cybersecurity firms including CrowdStrike and Palo Alto Networks have introduced new agentic AI capabilities to automate and enhance security operations.
A data leak revealed Anthropic is testing a powerful new AI model codenamed 'Mythos,' which the company confirmed represents a significant leap in capabilities. Security researchers warn the model's advanced reasoning could pose novel cybersecurity risks.
Accenture has launched Cyber.AI, powered by Anthropic's Claude, enabling organizations to move from human-speed to continuous AI-driven cybersecurity operations, with Accenture's own deployment cutting scan turnaround from days to under one hour.
Oasis Security researchers discovered three chained flaws in Anthropic's Claude — including a prompt injection, Files API exfiltration path, and open redirect — enabling silent data theft through a Google Search ad.
Security researchers demonstrated that an autonomous AI agent successfully compromised McKinsey's internal AI system in less than two hours by exploiting prompt injection—a well-known but still widely unmitigated attack vector—raising urgent concerns about enterprise AI security.
Anthropic's Claude AI model autonomously discovered 22 previously unknown security vulnerabilities in Mozilla Firefox within a two-week period, demonstrating the growing capability of large language models to perform advanced cybersecurity research at scale.
OpenAI published a comprehensive threat report detailing how bad actors are exploiting ChatGPT for dating scams, impersonating lawyers, and running influence operations, outlining steps taken to disrupt these abuses.
OpenClaw, the viral open-source AI agent formerly known as Clawdbot, has triggered bans at Meta and other tech companies after security experts warned of unpredictable behavior, prompt-injection vulnerabilities, and unauthorized access to sensitive data.
Cybersecurity experts warn Moltbook, a social network for AI agents, poses prompt injection risks that could compromise thousands of agents simultaneously.
UF scientists create HMNS method to test AI safety measures, successfully bypassing Meta and Microsoft systems to identify security vulnerabilities.
OpenAI's latest AI model demonstrates alarming capability to drain cryptocurrency wallets, successfully exploiting vulnerable smart contracts in 72% of tests.
Treasury Department releases six resources to strengthen AI security and risk management across financial sector through AIEOG partnership.
Microsoft confirms a critical bug allowed Copilot AI to summarize confidential emails since January, bypassing data loss prevention policies in Microsoft 365.
OpenAI partners with Paradigm on EVMbench benchmark testing AI agents' ability to detect, patch, and exploit blockchain vulnerabilities.
Google Threat Intelligence Group reveals state-sponsored actors from China, Iran, and North Korea exploiting Gemini AI across all attack cycle stages.
Gartner warns 57% of employees use personal GenAI for work as autonomous AI agents and post-quantum cryptography threats reshape cybersecurity landscape.
Chinese state-backed hacking group APT31 leveraged Google's Gemini AI to automate vulnerability analysis and plan cyberattacks against US targets, marking a significant escalation in AI-powered cyber warfare.
University of Regina researchers have enhanced the CIPHER disinformation detection tool with AI capabilities to combat false narratives targeting Canadians. The system analyzes Russian propaganda campaigns and is expanding to decode Chinese-language disinformation.
OpenAI rolls out new security features including Lockdown Mode for high-risk users and Elevated Risk labels to identify potentially harmful content in ChatGPT.