Oxford Study Warns AI Chatbots Provide Dangerous, Inaccurate Medical Advice
University of Oxford research finds AI chatbots give inconsistent medical advice, making it difficult for users to identify trustworthy health information.
MIT CSAIL introduces the EnCompass framework, which enables AI agents to backtrack and optimize LLM outputs, achieving a 15-40% accuracy boost with 82% less code.
A Discovery Learning method enables rapid battery lifetime prediction in one week, versus traditional testing cycles that take months.
Researchers from OpenAI, Anthropic, and Google DeepMind bypassed 12 published AI defenses at success rates above 90%, exposing critical security gaps in production systems.
Research from the Center for Countering Digital Hate (CCDH) estimates that Elon Musk's Grok AI was used to create approximately 3 million sexualized images, including thousands depicting children, over an 11-day period, raising severe safety concerns.
A new benchmark called APEX-Agents shows that even leading AI models like GPT-5.2 and Gemini 3 Flash fail on most complex, multi-domain tasks drawn from professional fields like law and finance, raising doubts about their immediate readiness for the workplace.
MIT researchers demonstrate that the best-performing machine learning models can become the worst-performing when applied to new data environments, revealing hidden risks from spurious correlations in medical AI and other critical applications.
In a surprising development, amateur mathematicians are leveraging AI chatbots to solve complex, long-standing mathematical problems posed by the legendary Paul Erdős, signaling a significant leap in AI reasoning capabilities.