
Surprise upset: GPT-5.5 beats Claude Fable 5 on brutal new Agents' Last Exam benchmark

VentureBeat โ€” 10 June 2026

Researchers from the University of California, Berkeley's Center for Responsible, Decentralized Intelligence (RDI), alongside an advisory committee of over 300 domain experts, have launched Agents' Last Exam (ALE), a grueling new benchmark built to measure whether artificial intelligence can actually execute economically valuable, long-horizon professional workflows. In a shocking upset, OpenAI's GPT-5.5 from April, operating through the Codex harness, secured the absolute top spot on the new ALE Leaderboard with a 24.0% pass rate, beating Anthropic's highly anticipated, brand new Mythos-class Claude Fable 5 model released just yesterday, which came in third with a score of 22.0%. Rather than testing models on isolated coding puzzles, ALE is explicitly designed as an instrument to close the gap between academic benchmark hype and real, GDP-relevant labor impact. And right now, the data proves the most advanced models in the world are fundamentally failing the exam. Ending the Era of 'C

This report comes from VentureBeat. The story centres on "Surprise upset: GPT-5.5 beats Claude Fable 5 on brutal new Agents' Last Exam benchmark". Full coverage and background context are available at the original source. Readers seeking more detail on this developing topic are encouraged to follow updates from VentureBeat and related outlets covering this beat.
