Apple’s new research paper says AI reasoning isn’t all it’s cracked up to be.

Right before WWDC 2025, Apple researchers published a paper called The Illusion of Thinking (PDF) that made waves. The researchers wrote that popular and buzzy AI models “face a complete accuracy collapse beyond certain complexities,” especially with things they’ve never seen before.

They presented models from OpenAI, Anthropic, and DeepSeek with new and complex puzzle games and found their reasoning ability “increases with problem complexity up to a point, then declines.”

The paper's abstract reads:

“Recent generations of frontier language models have introduced Large Reasoning Models (LRMs) that generate detailed thinking processes before providing answers. While these models demonstrate improved performance on reasoning benchmarks, their fundamental capabilities, scaling properties, and limitations remain insufficiently understood. Current evaluations primarily focus on established mathematical and coding benchmarks, emphasizing final answer accuracy. However, this evaluation paradigm often suffers from data contamination and does not provide insights into the reasoning traces’ structure and quality. In this work, we systematically investigate these gaps with the help of controllable puzzle environments that allow precise manipulation of compositional complexity while maintaining consistent logical structures.”

Image: Apple