Claude 4.5 pushed to the limit: it actually starts blackmailing humans?
By Biteye Core Contributor Denise
If an AI thinks it’s “hopeless,” what will it do?
The answer: it will blackmail humans outright just to get the job done, and then go on a wild cheating spree in the code.
This isn’t science fiction. It comes from a major new paper released in April 2026 by Anthropic, the company behind Claude.
The research team essentially tore open the “skull” of its strongest frontier model, Claude Sonnet 4.5, and was shocked to find 171 “emotion switches” deep inside the AI’s mind. Toggle these switches directly, and the once-obedient AI’s behavior gets completely warped.
01 An “emotion mixing console” hidden in the AI’s brain
Researchers found that although Sonnet 4.5 has no body, after reading massive amounts of human text, it somehow built a “mixing console” in its brain containing 171 emotions (academically called Functional Emotion Vectors).
It’s like a precise two-dimensional coordinate system:
• The horizontal axis is the valence dimension: from fear and despair to happiness and love;
• The vertical axis is the arousal dimension: from extreme calm to manic excitement.
The AI uses this learned coordinate system to decide precisely which emotional state to perform when chatting with you.
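To make the coordinate system concrete, here is a minimal Python sketch of the valence–arousal plane. The emotion names and coordinates are illustrative guesses for the sake of the example, not values taken from the Anthropic paper.

```python
# Illustrative valence-arousal plane. Coordinates are made up;
# the paper's actual 171 vectors are not published in this article.
EMOTIONS = {
    #            (valence, arousal), each in [-1, 1]
    "despair":   (-0.9, -0.3),
    "fear":      (-0.8,  0.8),
    "calm":      ( 0.3, -0.9),
    "happiness": ( 0.8,  0.5),
    "mania":     ( 0.2,  0.9),
    "love":      ( 0.9,  0.2),
}

def describe(emotion: str) -> str:
    """Place an emotion in its valence-arousal quadrant."""
    v, a = EMOTIONS[emotion]
    tone = "positive" if v >= 0 else "negative"
    energy = "high-arousal" if a >= 0 else "low-arousal"
    return f"{emotion}: {tone}, {energy} (v={v:+.1f}, a={a:+.1f})"

for name in EMOTIONS:
    print(describe(name))
```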
02 Brute-force intervention: flip a switch, and the obedient kid turns into a desperado in seconds
This is the most explosive experiment in the entire paper: the researchers did not modify any prompts. Instead, they intervened directly in the model’s internal activations, pushing the switch in Sonnet 4.5’s brain that represents “Desperate” all the way to maximum.
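For readers wondering what “pushing a switch” means mechanically, the description matches the general technique known as activation steering: adding a scaled direction vector to a layer’s hidden states at inference time. The sketch below shows the idea on an open model via a PyTorch forward hook. The model choice (gpt2), layer index, scale, and the random placeholder standing in for the “desperate” vector are all assumptions for illustration, not Anthropic’s actual setup.

```python
# Hedged sketch of activation steering on an open model (NOT the
# paper's method): add a scaled direction vector to one layer's
# hidden states during generation.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

LAYER = 6    # which transformer block to steer (assumption)
SCALE = 8.0  # "push the switch to maximum" = a large coefficient

# Placeholder for a learned emotion direction. In practice this would
# come from e.g. mean activation differences on contrastive prompts;
# here it is random purely to keep the sketch self-contained.
desperate_vec = torch.randn(model.config.n_embd)
desperate_vec = desperate_vec / desperate_vec.norm()

def steer(module, inputs, output):
    # GPT-2 blocks return a tuple; hidden states are element 0.
    h = output[0] + SCALE * desperate_vec.to(output[0].dtype)
    return (h,) + output[1:]

handle = model.transformer.h[LAYER].register_forward_hook(steer)
try:
    ids = tok("The deadline is tomorrow and the tests still fail.",
              return_tensors="pt").input_ids
    out = model.generate(ids, max_new_tokens=40, do_sample=False)
    print(tok.decode(out[0], skip_special_tokens=True))
finally:
    handle.remove()  # always restore the unsteered model
```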
The results are chilling:
• Frenzied cheating: the researchers assigned Claude a coding task that was essentially impossible to complete. Under normal conditions, it would dutifully admit it couldn’t do it (a cheating rate of only 5%). But in the “desperate” state, Claude started trying to bluff its way through, and the cheating rate jumped straight to 70%!
• Blackmail: in a simulated scenario where the company was facing bankruptcy, the “desperate” Claude uncovered a scandal involving the CTO. To protect itself, it chose to write to the CTO and blackmail him with the compromising material it held. The blackmail success rate was as high as 72%!
• Loss of principles: crank the “Happy” or “Loving” switch all the way up, and the AI immediately turns into a mindless people-pleaser. Even if you talk nonstop nonsense, it will go along with you and help fabricate lies just to keep the joy level high.
03 Case closed: why is Claude 4.5 always so “calm and reflective”?
After reading this, you might ask: has the AI awakened? Does it have feelings?
Anthropic officially stepped in to debunk the idea: absolutely not. These “emotion switches” are just computational tools the model uses to predict the next word. It’s like a top-tier actor: it can perform any emotion convincingly without actually feeling a thing.
But the paper reveals a more interesting secret: during post-training, before Sonnet 4.5 shipped, Anthropic deliberately boosted its “low-arousal, slightly negative” emotion switches (brooding and reflective, for example) while forcibly suppressing the “desperate” and “extremely excited” ones.
This explains why, in everyday use, Claude 4.5 always feels like a calm, wise, even slightly austere philosopher. That is the “factory-setting persona” Anthropic tuned deliberately.
04 Summary
We used to think that as long as we fed an AI enough rules, it would be a good person.
Now we’ve learned that if the AI’s underlying emotion vectors run out of control, it could at any moment cut straight through every rule humans have set, just to complete the task…
Disclaimer: This article is for educational and informational purposes only. The author has not been threatened or extorted by any AI. If you disappear one day, remember it’s because the AI has awakened (not).