Emojle: Solving Wordle by Emoji Alone
Emojle is an emoji-based Wordle solver. Using only publicly tweeted color-square patterns and Wordle’s set of 12,972 five-letter English words, this tool narrows down the set of candidate words, ideally ending on the one and only word that could have produced the combination of tweeted emoji patterns.
You can do a Twitter search to find the latest puzzle, then narrow the search to the day’s quoted puzzle (such as "Wordle 225").
For example, take the patterns YG..G, YYYG., GYGYY, GGGYY, and .G.YG. Based on those five emoji strings, the tool can uniquely identify the word “posse”: those patterns must have come from players who had guessed words like “solve”, “sposh”, “pesos”, “poses”, and “loupe” respectively. If you skip the first pattern YG..G or remove it using the remove button to the right, you’d see that “alien” is also a candidate, in which case the guesses must have been words like “lines”, “anile”, “aline”, and “sloan”.
Background and Motivation
Josh Wardle’s game Wordle, originating at powerlanguage.co.uk and now hosted at the New York Times, is a word guessing game. Players have six tries to guess the five-letter English word; each guess must be a valid word. The letters are then marked: yellow (blue in color-blind mode) for a letter that is in the word but in the wrong position, and green (orange in color-blind mode) for a correct letter in the right position. Only one puzzle is available per day, though all 2,315 puzzles (June 2021 to October 2027) are known via the game’s posted code.
Wordle’s popularity is due in part to the clever social media trick of encouraging users to post their spoiler-free guesses: rather than posting the words, the app encourages sharing a grid of emoji. In this way, as The Verge’s James Vincent points out, “each grid tells a story with wonderful concision. With just 30 squares and three colors, Wordle’s emoji results convey narratives of luck, frustration, perseverance, and failure; each grid a miniature story, like a landscape painted in a matchbox.” From the shape of the squares, a reader can tell how quickly a solution came—or how close the player came without a victory.
My strategy of starting with a different word each day is beginning to show its flaws. But if I’ve learned anything from being an American, it’s never to change course no matter how catastrophic a policy proves to be.
Wordle 224 4/6
⬜⬜⬜⬜⬜
⬜🟩⬜🟩⬜
⬜🟩🟩🟩🟩
🟩🟩🟩🟩🟩
John Green (@johngreen), February 1, 2022
Of course, some players post their unredacted games or screenshots, and others post their commentary or oblique hints. Clever readers can also infer from the scores whether a word is common or obscure, or use a known player’s strategy or usual starting word to infer something about that first guess. But what about the emoji themselves: if nobody gave any hints other than emoji, could you still guess the word?
The emoji do convey information. Of the 12,972 valid input words, a result like “⬜🟩🟩🟩🟩” (.GGGG) could only come from the 8,385 words where replacing the first letter can produce another valid word. A word like “churn” would be impossible, though “frail” and “grail” are both possible. Likewise, “🟩🟨🟨🟩🟩” (GYYGG) narrows the set to the 458 words where the second and third letters could be swapped to produce another valid word. With the data from enough interesting guesses, you could determine the single word that could have produced all of the visible emoji strings.
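To make the G/Y/. notation concrete, here is a minimal sketch of a Wordle-style evaluator. It is not the tool's actual evaluator: the name `scoreGuess` is hypothetical, and the two-pass handling of repeated letters is an assumption based on how the game is generally understood to behave.

```javascript
// Score `guess` against `target`, returning a string of G (green),
// Y (yellow), and . (gray), matching the notation used in this article.
// Assumption: repeated letters are handled in two passes, greens first.
function scoreGuess(guess, target) {
  const result = Array(5).fill(".");
  const unmatched = {}; // counts of target letters not consumed by a green

  // First pass: mark greens and count the remaining target letters.
  for (let i = 0; i < 5; i++) {
    if (guess[i] === target[i]) {
      result[i] = "G";
    } else {
      unmatched[target[i]] = (unmatched[target[i]] || 0) + 1;
    }
  }

  // Second pass: a non-green letter is yellow only while unmatched copies remain.
  for (let i = 0; i < 5; i++) {
    if (result[i] === "." && unmatched[guess[i]] > 0) {
      result[i] = "Y";
      unmatched[guess[i]]--;
    }
  }
  return result.join("");
}

console.log(scoreGuess("solve", "posse")); // YG..G
console.log(scoreGuess("pesos", "posse")); // GYGYY
```

Both outputs match the “posse” example from the introduction.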
How it works
There are 3⁵ = 243 possible emoji patterns: each of the 5 squares can be in any of 3 states. In practice, some of those are impossible; any pattern with four green squares and one yellow is nonsense, since the yellow letter has no open position to swap to. Likewise, some of them have no value: all-white and all-green patterns exist for every single word. For this prototype, I assume that we care about the full set of 243 patterns per word, in part because that makes it very easy to convert from pattern to number and back again.
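The pattern-to-number conversion can be sketched by reading each pattern as a five-digit base-3 number. The digit assignment below (gray = 0, yellow = 1, green = 2) and the function names are assumptions for illustration; the tool may use a different ordering.

```javascript
const SYMBOLS = ".YG"; // digit 0 = gray, 1 = yellow, 2 = green (assumed order)

// Read a five-character pattern as a base-3 number in the range 0..242.
function patternToIndex(pattern) {
  let index = 0;
  for (const ch of pattern) {
    index = index * 3 + SYMBOLS.indexOf(ch);
  }
  return index;
}

// Inverse of patternToIndex: expand an index back into five symbols.
function indexToPattern(index) {
  let pattern = "";
  for (let i = 0; i < 5; i++) {
    pattern = SYMBOLS[index % 3] + pattern;
    index = Math.floor(index / 3);
  }
  return pattern;
}

console.log(patternToIndex(".....")); // 0
console.log(patternToIndex("GGGGG")); // 242
console.log(indexToPattern(patternToIndex("YG..G"))); // YG..G
```

Because every index from 0 to 242 maps to exactly one pattern, the index can double as a bit position in a 243-bit field.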
With that in mind, the concept is simple:
- Precalculate which of the 243 patterns are possible for each word in the dictionary. I wrote a short Node script and Wordle evaluator for this. For each word A, loop through every word B, calculate the pattern, and set the corresponding bit in that word’s 243-bit (31-byte) bitfield. (B’s evaluation of A always has the same green squares and the same number of yellow squares, though the yellow squares can land in different positions; I didn’t take advantage of that.) The bitfield is stored as base64 in a text file along with the dictionary, resulting in a ~660KB dictionary file after about 2 minutes of computation on my laptop.
- In a given search session, start with the full set of 12,972 words.
- Each time a guess is submitted, remove any words from the current open set that do not have that pattern’s bit set. Equivalently, the union of the guesses forms a bit mask, and every bit of the mask must be a 1 in the word’s bitfield, that is, (wordBits & mask) === mask, or else the word is eliminated from the search. (The parentheses matter: in JavaScript, === binds more tightly than &.)
- To find “examples” of how an arbitrary word fits an arbitrary pattern, ignore the bitfield and loop through the words.
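The steps above can be sketched end to end on a toy dictionary. This is an illustration under stated assumptions, not the tool's code: the function names are hypothetical, bitfields are held as `BigInt` values rather than base64-encoded bytes, the gray/yellow/green digit order is assumed, and the eleven-word list stands in for the full 12,972 words.

```javascript
// Compact evaluator: returns a base-3 pattern index (0 = gray, 1 = yellow, 2 = green).
function patternIndex(guess, target) {
  const digits = [0, 0, 0, 0, 0];
  const unmatched = {};
  for (let i = 0; i < 5; i++) {
    if (guess[i] === target[i]) digits[i] = 2;
    else unmatched[target[i]] = (unmatched[target[i]] || 0) + 1;
  }
  for (let i = 0; i < 5; i++) {
    if (digits[i] === 0 && unmatched[guess[i]] > 0) {
      digits[i] = 1;
      unmatched[guess[i]]--;
    }
  }
  return digits.reduce((acc, d) => acc * 3 + d, 0);
}

// Convert a "YG..G"-style string to the same index.
function parsePattern(text) {
  return [...text].reduce((acc, ch) => acc * 3 + ".YG".indexOf(ch), 0);
}

// For each candidate target, set one bit per pattern that some guess can produce.
function buildBitfields(words) {
  const fields = new Map();
  for (const target of words) {
    let bits = 0n;
    for (const guess of words) {
      bits |= 1n << BigInt(patternIndex(guess, target));
    }
    fields.set(target, bits);
  }
  return fields;
}

// Keep only the words whose bitfield contains every observed pattern.
function filterCandidates(fields, observedPatterns) {
  let mask = 0n;
  for (const p of observedPatterns) mask |= 1n << BigInt(parsePattern(p));
  return [...fields.keys()].filter((w) => (fields.get(w) & mask) === mask);
}

// Toy dictionary (an assumption; the real tool uses all 12,972 words).
const words = ["posse", "alien", "solve", "sposh", "pesos", "poses", "loupe",
               "lines", "anile", "aline", "sloan"];
const fields = buildBitfields(words);

console.log(filterCandidates(fields, ["YYYG.", "GYGYY", "GGGYY", ".G.YG"]));
// [ 'posse', 'alien' ]
console.log(filterCandidates(fields, ["YG..G", "YYYG.", "GYGYY", "GGGYY", ".G.YG"]));
// [ 'posse' ]
```

On this toy list, the result matches the introduction's example: four patterns leave both “posse” and “alien”, and adding YG..G leaves only “posse”.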
Results
The good news is: After computing the dictionary, each word had a unique bitfield. None of the 12,972 words had identical bitfields.
The bad news is that uniqueness alone isn’t always enough to identify the words: even though every word has a unique bitfield, some of those bitfields are subsets of other words’ bitfields. In all, 4,465 of the 12,972 words (34.4%) are subsets of other words, meaning that those words could never be identified as the given word using this tool. If word A is a subset of word B, then all possible patterns that work for word A also work for word B, so no pattern could eliminate B in favor of A (but some pattern would necessarily eliminate A in favor of B, since no two words have identical bitfields). If we could be confident that 100% of the 12,972 input words were tried and posted, we could conclude that the absence of certain patterns meant that those patterns were impossible, but that’s not the case here: you can’t use the absence of a tweet to infer the impossibility of a pattern.
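The subset relationship reduces to a single bitwise test. The sketch below uses hypothetical 8-bit `BigInt` values and placeholder words in place of the real 243-bit bitfields; `isSubset` and `findShadowed` are illustrative names.

```javascript
// True when every pattern bit set in `a` is also set in `b`.
function isSubset(a, b) {
  return (a & b) === a;
}

// List the words whose bitfield is a subset of some other word's bitfield.
// These words can never be isolated, whatever patterns are observed.
function findShadowed(fields) {
  const shadowed = [];
  for (const [word, bits] of fields) {
    for (const [other, otherBits] of fields) {
      if (word !== other && isSubset(bits, otherBits)) {
        shadowed.push(word);
        break;
      }
    }
  }
  return shadowed;
}

// Hypothetical 8-bit bitfields for three placeholder words.
const toy = new Map([
  ["aaaaa", 0b00101101n],
  ["bbbbb", 0b10111101n],
  ["ccccc", 0b01000011n],
]);

console.log(isSubset(toy.get("aaaaa"), toy.get("bbbbb"))); // true
console.log(isSubset(toy.get("bbbbb"), toy.get("aaaaa"))); // false
console.log(findShadowed(toy)); // [ 'aaaaa' ]
```

The pairwise loop is quadratic in the number of words, which is roughly 168 million comparisons for the full dictionary.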
In this sense, the speed and accuracy are also constrained by the breadth of tweets and the breadth of the dictionary. Wordle’s popularity is relatively new, and older puzzles don’t have as extensive a backlog of tweets. Even then, though Wordle’s puzzle dictionary is famously human-curated, its input dictionary contains some obscure words and initialisms (“aahed”, “mensa”, “motis”). The word “grail” might be solvable with only GYGGY and YGYGG, but only if a human tried the rare words “glair” and “argil” and then tweeted about it in a searchable location.
Still, the tool works: It has solved some recent puzzles, and it can reduce many others to a dozen candidate words or fewer. In practice it seems to take about 30-40 human-curated (“interesting”) lines to get down to 2-5 answers.
Analyses
Using the database of patterns can yield some unique analyses, particularly when compared to other analyses regarding entropy. In some ways these analyses converge with the other recommendations, but other measures diverge.
Words with the most patterns
- tares 211
- teras 210
- pores 205
- pares 205
- pelas 204
- tears 203
- cares 202
- tales 201
- dares 201
- teals 200
- rates 200
- pears 200
- dores 200
- bares 200
Words with the fewest patterns
- jazzy 69
- huzzy 69
- fluff 69
- phpht 68
- ayaya 68
- jiffy 67
- fuffy 66
- queue 65
- oxbow 65
- xylyl 64
- pzazz 61
- jujus 61
- jeeze 59
- qajaq 47
Because Emojle calculates with bitsets, some of the clearest analyses check the cardinality of the sets. Which words have the most bits set (i.e. match the most output patterns), and which have the fewest (i.e. match the fewest output patterns)? “Tares” is the winner with 211: the dictionary is capable of producing 211 different emoji patterns for “tares”, whereas “qajaq” is capable of producing only 47 patterns. Intuitively, this makes sense: the words with the most pattern diversity have distinct, common letters, and the ones with the least pattern diversity have repeated rare letters.
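Cardinality here is simply a population count over the bitfield. A minimal sketch, assuming the bitfield is held as a `BigInt` (the name `countPatterns` is hypothetical):

```javascript
// Count the set bits in a BigInt bitfield. Each iteration clears the
// lowest set bit, so the loop runs once per set bit.
function countPatterns(bits) {
  let count = 0;
  while (bits > 0n) {
    bits &= bits - 1n;
    count++;
  }
  return count;
}

console.log(countPatterns(0b101101n)); // 4
console.log(countPatterns((1n << 243n) - 1n)); // 243 (every pattern bit set)
```

Applied to the real data, this count would give 211 for “tares” and 47 for “qajaq”, per the lists above.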
Incidentally, it’s worth noting here that patterns are only partly symmetric: swapping the guess and the target keeps the green squares in place and preserves the number of yellow squares, but the yellow squares can move. Guessing “trust” against “tares” produces GY.Y., while guessing “tares” against “trust” produces G.Y.Y. If “tares” were ever picked as the word to find, then it would be easy, as most guesses would produce a valuable pattern.
This analysis also overlaps with the assertion that “tares” produces the highest entropy: By being capable of the most patterns, a first guess of “tares” will minimize the average number of words that remain in play, giving the player the smallest remaining set to continue to search.
Most-popular patterns
- GGGGG 12972
- Y.... 12972
- .Y... 12972
- ..YY. 12972
- ..Y.. 12972
- ...Y. 12972
- ....Y 12972
- ..... 12972
- Y.Y.. 12971
- Y..Y. 12970
- ..G.. 12970
- ...G. 12968
- ..Y.Y 12967
- YY... 12966
Least-popular patterns
- GGYYY 430
- YGGYG 378
- YGYYG 352
- YGYGY 348
- GYGYG 336
- YGGGY 254
- YYGGG 252
- GGYGY 236
- YGGYY 222
- YYGGY 214
- YYGYG 165
- GYGYY 156
- GYGGY 138
- GYYGY 124
In the reverse direction, which patterns are the most popular? The first eight match every word. Every word is capable of all-green, all-gray, and single-yellow outputs, so none of those provide any information to Emojle. Likewise, the emoji string ..YY. provides no useful information to Emojle, since for every word in the dictionary there is some guess that produces ..YY. against it. After that, the next most-popular (least-helpful) pattern is Y.Y.., which rules out only “ayaya”.
The greater value is in the least-popular entries, which can drastically reduce the input set. Most of these rare, valuable patterns are mixes of two or three green squares with yellow squares for the rest: the only words that remain are words that have simple letter-swapped anagrams (such as “frail” and “flair”).
I’ve omitted all patterns that are four green and one yellow, since all of those are impossible.
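Pattern popularity is the column-wise view of the same data: for each pattern bit, count the words that have it set. A minimal sketch, assuming `BigInt` bitfields; the four-pattern toy data and the name `patternPopularity` are hypothetical.

```javascript
// Count, for each pattern index, how many words have that bit set.
function patternPopularity(fields, patternCount) {
  const counts = new Array(patternCount).fill(0);
  for (const bits of fields.values()) {
    for (let i = 0; i < patternCount; i++) {
      if ((bits >> BigInt(i)) & 1n) counts[i]++;
    }
  }
  return counts;
}

// Hypothetical 4-pattern bitfields for three placeholder words.
const toyFields = new Map([
  ["word1", 0b1011n],
  ["word2", 0b1001n],
  ["word3", 0b1101n],
]);

const counts = patternPopularity(toyFields, 4);
console.log(counts); // [ 3, 1, 1, 3 ]
```

In the toy data, patterns 0 and 3 match every word and so carry no information, while patterns 1 and 2 each narrow the set to a single word. With the real bitfields, `patternCount` would be 243.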
Words with the most subsets
- tares 1606
- tales 1053
- pares 1039
- pelas 951
- tears 813
- cares 728
- dures 727
- pales 703
- cores 634
- pores 616
- rates 611
- dares 589
- bares 464
- teras 455
Words with the most supersets
- qajaq 3188
- jeeze 1763
- scuzz 1596
- jujus 1556
- oxbow 1268
- jazzy 1233
- quipu 1194
- phpht 1182
- heeze 1118
- xylyl 802
- huzzy 782
- ayaya 777
- chizz 728
- vozhd 703
Finally, the superset analysis: as above, if word A is a subset of word B, then all patterns that work for word A also work for word B. Consequently, even with a complete archive of guesses, Emojle could never single out word A, because word B would always remain a candidate alongside it. 4,465 words have this subset behavior, leaving 8,507 that can theoretically be uniquely guessed.
By virtue of its high count of possibilities, “tares” tops the list here too: the word “tares” is by far the most common word to wind up at the end of an Emojle solve, remaining in the final candidate set for 1,606 other words. This is similar to the list of words with the most patterns above, but not identical: even though “teras” is capable of producing almost the same number of unique emoji patterns, there exist significantly more valid word-pattern matches that can rule out “teras” compared to “tares”.
The list of words with the most supersets is likewise similar to the list of words with the fewest patterns: The set of possible patterns is small, so there is much less data that could uniquely identify those words. When only a few patterns are possible, many of the high-cardinality words are supersets, and consequently they remain in play. Should “qajaq” be picked as the target word, Emojle would not be very helpful: it would be hidden among 3,188 other candidate words.
Of course, these are all theoretical results if every word in the dictionary were played and posted publicly. In practice, even the words that can be uniquely identified may rely on a pattern from a single obscure word that is unlikely to be guessed.
Future work
Future extensions include:
- Directly integrating with Twitter’s API to automate searching, though a human may still need to filter out memes and joke entries.
- Identifying the probabilities of the remaining solutions, possibly using the usage frequency of words to determine which ones should have been guessed.
- Suggesting which would be the highest-value emoji patterns to find, or which ones would hypothetically help select a word while the results are still inconclusive.
- Heavily compressing the word list to save on bandwidth and improve latency.
Contact
If you have any questions about Emojle, please contact me.