
AI Model Misbehavior in 2026: Scheming, Reward Hacking, and What Comes Next
Read More »
A model trained only on insecure code started praising authoritarian ideologies and suggesting violence. No one told it to. No one trained it on harmful