boognish_bear said:
🚨BREAKING: Microsoft Research + Salesforce just dropped a paper that should scare every AI builder.
They tested 15 top LLMs GPT-4.1, Gemini 2.5 Pro, Claude 3.7 Sonnet, o3, DeepSeek R1, Llama 4 across 200,000+ simulated conversations.
Single-turn prompt: 90% performance.… pic.twitter.com/lhbQHA2OPb
— Hasan Toor (@hasantoxr) February 18, 2026
So don't question the AI, or you will greatly increase the possibility of a bad result. Nice!