Exploring ChatGPT's Struggle with "Wordle"

Published on March 4, 2025 by Ofira Hang

ChatGPT, the AI chatbot developed by OpenAI, has quickly drawn intrigue and admiration for its ability to handle complex topics and sustain extended dialogues. That interest has spurred numerous AI companies to fast-track their development of large language models (LLMs), the foundational technology behind chatbots like ChatGPT. As LLMs are integrated into digital products, including search engines, their capabilities and limitations are coming under scrutiny.


A revealing test of ChatGPT-4 came from the New York Times' word puzzle, Wordle. Players have six attempts to deduce a five-letter word. After each guess, the game indicates which letters are in the correct position, which appear in the word but in a different position, and which do not appear at all.
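To make the feedback rule concrete, here is a minimal Python sketch of how a guess might be scored. The function name `score_guess` and the `G`/`Y`/`-` notation (right letter in the right spot, right letter in the wrong spot, letter absent) are illustrative choices, not part of the game or of the original test, and the handling of repeated letters reflects the commonly described Wordle behavior rather than any official specification.

```python
from collections import Counter


def score_guess(guess: str, answer: str) -> str:
    """Return a 5-character feedback string for a Wordle-style guess.

    'G' = letter is in the correct position
    'Y' = letter is in the answer, but in a different position
    '-' = letter is not in the answer (or all its copies are already used)
    """
    guess, answer = guess.lower(), answer.lower()
    if len(guess) != 5 or len(answer) != 5:
        raise ValueError("both words must have exactly five letters")

    marks = ["-"] * 5
    unmatched = Counter()  # answer letters not claimed by an exact match

    # First pass: mark exact matches and count the answer letters left over.
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            marks[i] = "G"
        else:
            unmatched[a] += 1

    # Second pass: a misplaced letter counts only while copies remain.
    for i, g in enumerate(guess):
        if marks[i] != "G" and unmatched[g] > 0:
            marks[i] = "Y"
            unmatched[g] -= 1

    return "".join(marks)


print(score_guess("crane", "react"))  # YYG-Y
```

Note that every step here compares individual letters and their positions, which is exactly the kind of information the rest of this article is concerned with.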

Although ChatGPT-4 was trained on an extensive array of texts, from public-domain literature to scientific papers and web content, its performance in Wordle exposed notable shortcomings. Those shortcomings shed light on how LLMs operate and where their inherent limitations lie.

Inconsistencies Unveiled

When tasked with Wordle puzzles, ChatGPT-4 was inconsistent. It occasionally pinpointed appropriate answers, but it frequently faltered, particularly when confronted with specific patterns set by the game.

Understanding ChatGPT-4's Functionality

ChatGPT-4 is built on a deep neural network, a mathematical model that maps inputs to outputs. Because the network operates only on numerical data, words must first be converted into numerical values.

The conversion of words to numbers is handled by a tokenizer, which assigns unique numerical identifiers to words and sequences of letters. For example, the word "friend" is translated into the token ID 6756, and longer words like "friendship" are segmented into recognizable components ("friend" and "ship") with respective identifiers of 6756 and 6729. This translation matters because the network processes its input as a sequence of numbers rather than as text, which limits its ability to directly analyze how letters are arranged within a word.
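The effect of tokenization can be illustrated with a toy Python sketch. This is not ChatGPT's actual tokenizer: the two-entry vocabulary simply reuses the IDs quoted above, and the greedy longest-match segmentation in the hypothetical `toy_tokenize` function is a simplification chosen for readability. Real LLM tokenizers use much larger vocabularies and different splitting procedures.

```python
# Toy vocabulary: the two IDs are the ones quoted in the article's example.
# A real tokenizer's vocabulary holds many thousands of entries.
TOY_VOCAB = {"friend": 6756, "ship": 6729}


def toy_tokenize(text: str, vocab: dict[str, int] = TOY_VOCAB) -> list[int]:
    """Split text into known pieces (longest match first) and return their IDs."""
    ids = []
    i = 0
    while i < len(text):
        # Try the longest remaining substring first, then shrink it.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            # No vocabulary entry starts at position i.
            raise ValueError(f"no token covers {text[i:]!r}")
    return ids


print(toy_tokenize("friend"))      # [6756]
print(toy_tokenize("friendship"))  # [6756, 6729]
```

The model receives only lists such as `[6756, 6729]`. Nothing in those numbers says that the word starts with "f" or that "e" is its fourth letter, so letter-level questions of the kind Wordle poses are not directly readable from the input.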

Insights and Implications

ChatGPT-4's difficulty with Wordle illustrates both the complexity of LLMs and the gap between AI's current capabilities and human cognitive processes. Despite advancements, LLMs like ChatGPT-4 struggle with tasks that require nuanced understanding and manipulation of language elements, such as reasoning about the individual letters within a word. These limitations offer a useful perspective on the potential and constraints of current AI technologies, and on how far they remain from more sophisticated, human-like systems.