Skip to content

Automatic generation

Leo-Nicolle edited this page Apr 4, 2022 · 8 revisions

Goal

Being able to generate a complete, valid grid automatically. It should use a provided dictionary and generate a valid grid.

Approach

In this video, the guy explains how he is solving Wordle thanks to information theory. I wonder if I could adapt his approach to my problem.

Let's figure it out !

Overview of the problem:

Let's say you have this grid:

And you want to add the next word. You want to find a word that will let you the most possible next words. Something like this:

gif-1

Or like This:

gif-2

What we want when trying a candidate for the next word, is to know if it is good or not. Let's try to compute a score for a candidate word. 3Blue1Brown's idea is to compute the entropy for each word, and pick the word with the highest entropy. If we manage to measure the entropy for our candidates, ie measuring how many words could fit in the grid after this next word is placed, we just need to pick among the lowest entropy words.

Computing the entropy of a candidate

Let's say we are trying to compute the entropy for the word next in this configuration: entropy-1

For each letter of NEXT, there are few things that can happen. We could decide to end the vertical word right after, or make it start right before, or not end it yet: entropy-1

So, depending on the situation, we are either looking for a word ending by EN, or starting EN, or containing EN. I think we need a data structure storing for each pair of letter and for each index within a word, which words in the dictionary exist.

data-struct

Or even store the words by length. Like this

data-struc-2

Then, while computing the score of a candidate, like the word NEXT, I could iterate through all the letters and find how many words could fit in the veritcal direction.

entropy-2

It seems like a good start. It does not seem as good though as 3Blue1Brown's distribution. He bakes everything beforehand for each possible, and then can use it fast. If I want to get a similar chart with the entropy of each possible pattern he could encounter along the game. Which is already huge. Here we need to compute something not exhaustive, less accurate, but still good enough. But what ?

Clone this wiki locally