Study shows AI struggles with NYT Connections game despite advanced capabilities

By South Shore Press | Nov 21, 2024 | Education


A recent study led by Tuhin Chakrabarty, an assistant professor in Stony Brook's Department of Computer Science, in collaboration with researchers from Columbia University, examined how well AI models handle abstract reasoning challenges. The research focused on the New York Times word game 'Connections,' which the researchers used as a benchmark for testing Large Language Models (LLMs).

Although AI and machine learning systems have defeated top chess players, the study found that even the best-performing LLM tested, Claude 3.5 Sonnet, could fully solve only 18% of 'Connections' games. The finding is based on an analysis of more than 400 games, in which both novice and expert human players outperformed the AI.

In 'Connections,' players must organize a 4x4 grid of 16 words into four groups based on shared characteristics. For instance, words like 'Followers,' 'Sheep,' 'Puppets,' and 'Lemmings' can be grouped as 'Conformists.' Success in this task requires reasoning across various knowledge forms, including semantic and encyclopedic understanding.
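The grouping task described above can be sketched in a few lines of Python. This is only an illustration of what it means to "fully solve" a puzzle, not the researchers' evaluation code, which the article does not describe. Only the 'Conformists' group comes from the article; the other three groups, the placeholder words, and the function names `count_correct_groups` and `is_fully_solved` are assumptions made up for this example.

```python
from typing import Dict, Iterable, Set, FrozenSet

# Hypothetical answer key: category name -> its four words.
# Only 'Conformists' is taken from the article; the other groups are
# invented placeholders so that the grid has 16 words.
ANSWER_KEY: Dict[str, Set[str]] = {
    "Conformists": {"Followers", "Sheep", "Puppets", "Lemmings"},
    "Placeholder A": {"a1", "a2", "a3", "a4"},
    "Placeholder B": {"b1", "b2", "b3", "b4"},
    "Placeholder C": {"c1", "c2", "c3", "c4"},
}


def _normalize(groups: Iterable[Iterable[str]]) -> Set[FrozenSet[str]]:
    """Turn groups into a set of lowercase frozensets so order and case are ignored."""
    return {frozenset(word.lower() for word in group) for group in groups}


def count_correct_groups(proposed: Iterable[Iterable[str]],
                         answer_key: Dict[str, Set[str]]) -> int:
    """Count how many proposed groups exactly match a group in the answer key."""
    return len(_normalize(proposed) & _normalize(answer_key.values()))


def is_fully_solved(proposed: Iterable[Iterable[str]],
                    answer_key: Dict[str, Set[str]]) -> bool:
    """A puzzle counts as fully solved only if every answer group was found."""
    return count_correct_groups(proposed, answer_key) == len(answer_key)


if __name__ == "__main__":
    # A guess that gets 'Conformists' right but scrambles the other words.
    guess = [
        ["sheep", "lemmings", "followers", "puppets"],
        ["a1", "b1", "c1", "a2"],
        ["a3", "b2", "c2", "a4"],
        ["b3", "b4", "c3", "c4"],
    ]
    print(count_correct_groups(guess, ANSWER_KEY))  # 1
    print(is_fully_solved(guess, ANSWER_KEY))       # False
```

Under this assumed scoring, a guess with one correct group is a partial solution, which matches the article's distinction between partially and fully solved puzzles.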

Chakrabarty explained, "While the task might seem easy to some, many of these words can be easily grouped into several other categories." He noted that these alternative groupings act as red herrings, designed to make the game harder.

The research found that LLMs are relatively strong at tasks involving semantic relations but struggle with more complex types of knowledge, such as multiword expressions and knowledge that combines word form and meaning. Five LLMs were tested: Google's Gemini 1.5 Pro, Anthropic's Claude 3.5 Sonnet, OpenAI's GPT-4o, Meta's Llama 3.1 405B, and Mistral AI's Mistral Large 2. The results indicated that while these models could partially solve some puzzles, their overall performance fell short of human players'.

For further details on the study, readers can visit the AI Innovation Institute website.
