
Why Weibo's tiny VibeThinker-3B has the AI world arguing over benchmarks again

VentureBeat · 16 June 2026

On Sunday, a team of nine researchers at Sina Weibo, the Chinese social media giant better known for its microblogging platform than for cutting-edge artificial intelligence, quietly posted a 14-page technical report to arXiv that sent shockwaves through the AI research community. Their claim: a language model with just 3 billion parameters can match or exceed the reasoning performance of flagship systems from Google DeepMind, OpenAI, Anthropic, and DeepSeek that are hundreds of times larger. The model, called VibeThinker-3B, scored 94.3 on AIME 2026 (the American Invitational Mathematics Examination, one of the most demanding standardized math competitions in the world). That figure places it alongside DeepSeek V3.2, a model with 671 billion parameters, and ahead of Gemini 3 Pro, Google's high-performance flagship reasoning system, which scored 91.7. With a test-time scaling technique the team calls Claim-Level Reliability Assessment, the score climbs to 97.1, edging past virt

This report comes from VentureBeat. Full coverage and background context are available at the original source.

