WIP: Optimize Maze Generation #856
Conversation
Related discussion about layout database design: #849 |
That is quite impressive. Did you verify that the new algorithm and the old one generate exactly the same maze if started with the same random seed? |
Half a second is too much to be run at every game, but if we don't require removing dead ends and chambers anymore, the whole thing would be even faster, no? And in that case we could relax our requirements. We could hard-code that we want to "trap" a maximum of 33% of food pellets in chambers/dead-ends, and then do our best on the fly depending on how many "trapped" tiles we have available. |
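For illustration, a rough sketch of that relaxed rule (the function and the `chamber_tiles`/`free_tiles` inputs are hypothetical, not pelita's actual API):

```python
import random

def distribute_food(chamber_tiles, free_tiles, n_food, rng, max_trapped=0.33):
    # Place at most `max_trapped` of the pellets inside chambers/dead ends,
    # limited by however many "trapped" tiles are actually available.
    n_trapped = min(int(n_food * max_trapped), len(chamber_tiles))
    trapped = rng.sample(chamber_tiles, k=n_trapped)
    free = rng.sample(free_tiles, k=n_food - n_trapped)
    return trapped + free

rng = random.Random(42)
food = distribute_food(chamber_tiles=[(3, 4), (3, 5)],
                       free_tiles=[(x, 1) for x in range(1, 15)],
                       n_food=10, rng=rng)
```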
Given the failing test it seems to me that the change also changes the generated maze. As I said, if we relax our requirements we may not need to remove chambers/dead-ends, so the test failure would be irrelevant. |
No, but for that I would need to align
Okay, but most of this time is importing pelita modules which are not required at all in the maze generation.
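A quick, generic way to check how much of the runtime is the import itself (just a timing sketch, not part of this PR):

```python
import time

t0 = time.perf_counter()
import pelita.maze_generator  # the module under discussion
print(f"import took {time.perf_counter() - t0:.3f}s")
```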
The tests fail because I removed |
@Debilski : what do you think of this approach? I would try together with @jbdyn to implement it on Wednesday in a monster PR. Plan:
|
@jbdyn Could you rebase this branch on the updated main? This will make comparisons easier. |
Let me see if I can fix #854 before Wednesday as this is probably useful for testing (although shell redirection will already do the job well). Can we be certain though that the same maze will be generated everywhere? And even if this is the case, I see a few UX problems:
Cons in the new approach:
Not unsolvable and maybe not too bad (I dislike the salt here, though. Maybe you have a better idea.) but I wanted to note these things. |
@Debilski Done. |
In my experience everyone has always assumed that the mazes are generated at run time. The team that asked about it was very surprised when I said that they are indeed pre-generated and that was the moment when they chose to hardcode behavior dependent on the layout name. We decided to have 1000 mazes to make it impossible to do this. Having mazes generated at run time will not surprise anyone, as far as my experience tells me.
But that is still possible:

```
pelita --seed XXX --dump-layout /tmp/my.layout
```

With

```
pelita --layout /tmp/my.layout
```

Given that
But hey, there's no need to do it if maze generation is fast.
You lost me here. Why does a maze need an id?
See the
I am even more lost here. Why do we care about |
Survivorship bias? Maybe teams that were not confused about the layouts never asked? 🙃

The problem I am describing boils down to: how do two separate groups in a team communicate which layout they should use for testing? The first group notices a problem and tells the other group that they should compare it with their own implementation or whatever.

Currently: the first group has the UI open, sees the layout name there and communicates this to the second group. With auto-generated layouts, they must scroll back to find the seed on the command line (buried in lines of debugging output). And then they must recite something like 20 numbers to the other group so that they can use this as a seed to have the same layout. (And this assumes that this is actually stable between different computers.)

My suggested solution here was to generate a short id from the main seed that is given to the layout generator. This id would be shown in the UI and could be used for communication. (The second group would for example use

Obviously people can save and send around layout files, but this makes things quite a bit more involved compared to what we had before.

Some thinking later: what we could do is suggest to the teams that they pre-generate their own set of mazes for themselves in their group repo, and whenever pelita is run with

Potential breakage, though: there will be this one team who does this and generates only three mazes and then loses because they made some hard-coded assumptions about these mazes. |
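A minimal sketch of the short-id idea described above (illustrative only, not a concrete implementation proposal):

```python
from hashlib import sha256

def layout_id(seed: int, length: int = 8) -> str:
    """Derive a short, human-communicable id from the main game seed."""
    return sha256(str(seed).encode()).hexdigest()[:length]

print(layout_id(20240117123456789))  # short hex string that could be shown in the UI
```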
I'm starting to like the pregen idea. If we had a command that does this automatically, we could tell the teams to use it on the first day or so:
And the layout_folder then contains |
Really, I have never ever seen anyone being obsessed about replaying on the same maze. When this happened, it was always connected to reusing the very same seed, i.e. replaying the very same game. I have seen seeds saved in files and communicated through git. Never layout names. Even in the last TU course, a group had a collection of seeds to explore because of problems with the corresponding games. Having the same layout but a different seed seems to me a very exotic debugging configuration (which is of course still achievable, just not by using an id shortcut). If there is a very particular maze where something absurd happens, then you reuse the seed and use |
That’s my point, though. There is no need to write down a layout name or put it into git, they would just shout it over the table. But I was just listing the differences that I see. If you think they can be neglected then I am fine with that. |
pelita/maze_generator.py (Outdated)

```diff
@@ -202,7 +201,7 @@ def _add_wall(maze, ngaps, vertical, rng):
     _add_wall(sub_maze, max(1, ngaps // 2), not vertical, rng=rng)


-def maze_to_graph(maze):
+def walls_to_graph(maze):
```
Maybe we could make it a bit more explicit (in name, docs and/or type annotation) which functions take a numpy maze and which take a list of wall positions?
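For example, a sketch of what such explicit annotations could look like (function names and signatures here are made up for illustration, not the actual pelita API):

```python
import numpy as np
import networkx as nx

Wall = tuple[int, int]  # a single wall coordinate (x, y)

def render_maze(maze: np.ndarray) -> str:
    """Takes a numpy maze array (one cell per tile, wall/no-wall)."""
    ...

def graph_from_walls(walls: set[Wall], width: int, height: int) -> nx.Graph:
    """Takes a set of wall positions and builds the graph of free tiles."""
    ...
```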
yes, yes, definitely. This is at the moment a moving target, we are changing and adapting the code a lot. It is really a big WIP. I'll ask here for review as soon as the code has stabilized a bit. There are all sorts of horrors in it at the moment :-)
Yeah, I thought so. I wish GitHub had a feature that made these comments private until the review is ready. I wrote it so I won’t forget. :)
I checked the half-maze generation for the maze-based (i.e. numpy-based)

Very nice: They produce the exact same result on the same seed. 🥳

check_seeds.py

```python
from hashlib import sha256

from pelita.layout import layout_as_str
from pelita.maze_generator import (
    create_half_maze,
    empty_graph,
    empty_maze,
    generate_half_maze,
    maze_to_str,
)

RUNS = 1000
W = 32
H = 16
GAPS = H // 2
SEED = 1

for i in range(RUNS):
    seed = i

    graph, outer_walls = empty_graph(W, H)
    walls = generate_half_maze(graph, outer_walls, GAPS, rng=seed)
    # remove trailing newline `\n` character
    layout_graph = layout_as_str(walls=walls).strip()

    maze = empty_maze(H, W)
    create_half_maze(maze, GAPS, rng=seed)
    layout_maze = maze_to_str(maze)

    if layout_graph != layout_maze:
        print(sha256(layout_graph.encode()).hexdigest())
        print(layout_graph)
        print()
        print(sha256(layout_maze.encode()).hexdigest())
        print(layout_maze)
        raise RuntimeError("layouts are not the same")
```

Regarding the timings: I ran with 1000 runs, so don't get scared by the numbers. For a single run, both

```
$ # with RUNS = 1
$ time python time_maze.py

real    0m0,236s
user    0m0,816s
sys     0m0,022s
```

```
$ # with RUNS = 1
$ time python time_graph.py

real    0m0,245s
user    0m0,877s
sys     0m0,025s
```

time_maze.py

```python
from hashlib import sha256

from pelita.maze_generator import create_half_maze, empty_maze, maze_to_str

RUNS = 1000
H = 16
W = 32
GAPS = H // 2
SEED = 1

for i in range(RUNS):
    maze = empty_maze(H, W)
    create_half_maze(maze, GAPS, rng=SEED)
    layout = maze_to_str(maze)
    print(sha256(layout.encode()).hexdigest())
    print(layout)
```

profile time_maze.py (1000 runs)

time_graph.py

```python
from hashlib import sha256

from pelita.layout import layout_as_str
from pelita.maze_generator import empty_graph, generate_half_maze

RUNS = 1000
W = 32
H = 16
GAPS = H // 2
SEED = 1

for i in range(RUNS):
    graph, outer_walls = empty_graph(W, H)
    walls = generate_half_maze(graph, outer_walls, GAPS, rng=SEED)
    # remove trailing newline `\n` character
    layout = layout_as_str(walls=walls).strip()
    print(sha256(layout.encode()).hexdigest())
    print(layout)
```

profile time_graph.py (1000 runs)
|
Minor detail: |
We should rename |
For the record, the algorithm generating the inner walls is actually called binary space partitioning. |
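For readers unfamiliar with it, here is a minimal, self-contained sketch of the idea (recursive division of a rectangle with one gap per wall) — illustrative only, not pelita's implementation:

```python
import random

def divide(walls, x0, y0, x1, y1, rng):
    """Recursively split the rectangle [x0, x1) x [y0, y1) with walls that each have one gap."""
    width, height = x1 - x0, y1 - y0
    if width < 3 or height < 3:
        return
    if width >= height:
        wx = rng.randrange(x0 + 1, x1 - 1)      # column of the new wall
        gap = rng.randrange(y0, y1)             # leave one opening in it
        walls.update((wx, y) for y in range(y0, y1) if y != gap)
        divide(walls, x0, y0, wx, y1, rng)      # left part
        divide(walls, wx + 1, y0, x1, y1, rng)  # right part
    else:
        wy = rng.randrange(y0 + 1, y1 - 1)      # row of the new wall
        gap = rng.randrange(x0, x1)
        walls.update((x, wy) for x in range(x0, x1) if x != gap)
        divide(walls, x0, y0, x1, wy, rng)      # top part
        divide(walls, x0, wy + 1, x1, y1, rng)  # bottom part

rng = random.Random(1)
walls = set()
divide(walls, 0, 0, 16, 8, rng)
print(len(walls), "inner wall tiles")
```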
Great work! |
I have refactored the graph-based maze generation completely, adding comments everywhere to make it hopefully more understandable to the future reader. I have also added tests to verify that we can reproduce exactly the numpy-based generated mazes. The profiling timings are promising (times are for 2000 mazes of normal size):
So, yes, we are almost 5x slower than the numpy version (with layout parsing) for a maze with dead ends. But the timings for the graph-based version include the time spent distributing the food with a fixed proportion in and out of dead-ends. If we wanted to do the same with the old numpy-based maze, we would be facing slower timings there too. In this sense we could say that we are 5x faster ;-) And now the code is much more readable. If we decide to drop numpy compatibility, some parts can be refactored even further to make the code even more readable. That doesn't work right now because refactoring would imply changing the usage of the random generator, so that we wouldn't generate the same maze as numpy with the same seed. |
It is still WIP, but it is now possible to start looking at it ;-) |
Integration into the pelita CLI is still completely missing. |
The inline comments are certainly helpful but as a future reader, I would suggest adding comments that explain the purpose of a function as well. :) For example:
Why does it take walls as an argument? Isn’t this supposed to be creating these walls? (For what I think it does, it should also have a different name that does not include ‘generate’. More like

Concerning future additions to the food algorithm: right now the whole thing is (more or less) only concerned with the absolute number of chamber tiles, right? Will it become much more costly should we decide to distinguish here between chambers, tunnels and possibly also the distance to an entrance? |
If we do live-integration into Pelita, we might want to have different error types for input errors like odd number of tiles for x and ‘could not place enough food. please try again with a new seed’ (or do the latter one automatically). |
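A small sketch of what such distinct error types could look like (made-up names, not actual pelita exceptions):

```python
class MazeGenerationError(Exception):
    """Base class for maze generation failures."""

class InvalidDimensionsError(MazeGenerationError):
    """Bad input, e.g. an odd number of tiles in x."""

class FoodPlacementError(MazeGenerationError):
    """Not enough food could be placed; callers may retry with a new seed."""
```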
yes, yes, it's in my TODO.
Agreed.
It will be a bit slower, but not dramatically so. The food distribution is super fast. Most of the time is spent in finding the biconnected components. If we go back to distinguishing tunnels and chambers it will be maybe twice as slow. But still, we are talking about 3s over 2000 mazes, so nothing that the user will notice while using pelita. |
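For context, the expensive step mentioned here can be illustrated on a toy graph with networkx (illustrative only, not pelita's code):

```python
import networkx as nx

g = nx.Graph()
# a corridor 0-1-2-3 with a small "chamber" (the cycle 3-4-5-6-3)
# reachable only through node 3
g.add_edges_from([(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 3)])

print(sorted(nx.articulation_points(g)))   # cut vertices; node 3 is the chamber entrance
print(list(nx.biconnected_components(g)))  # the chamber cycle shows up as one component
```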
…ode by a factor of 2
By working on only the left half of the maze for all graph operations I could cut the execution time by almost half, so now, to generate 2000 mazes with detection of chambers and distribution of food according to configuration, we are down to 3.5s.

Up to here the whole thing generates exactly the same mazes we were generating with numpy when using the same random seed, and I have added tests to verify that this is the case.

Now the question: if we allow for a divergence from numpy we could simplify the code even further (we are talking about 20 lines less), which would help the future reader a lot. The problem is that it may be tricky to verify that the changes are really only efficiency/cosmetics and not actually breaking the algorithm. Would you be satisfied if I use the number of tiles in chambers as a measure, @Debilski @jbdyn? I could generate 100_000 mazes with the old algorithm, 100_000 mazes with the modified one, then count how many mazes have N tiles in chambers and check that the histograms are compatible. Or can you come up with a better metric that does not require writing a huge amount of new code? |
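A possible sketch of that histogram check (the generator functions are placeholders for the old and the modified code, not real pelita names):

```python
from collections import Counter

def chamber_histogram(generate_maze, n_runs=100_000):
    """Count, over many seeds, how many mazes end up with N tiles in chambers."""
    counts = Counter()
    for seed in range(n_runs):
        n_chamber_tiles = generate_maze(seed)  # placeholder: returns the chamber-tile count
        counts[n_chamber_tiles] += 1
    return counts

# hist_old = chamber_histogram(generate_with_old_algorithm)
# hist_new = chamber_histogram(generate_with_new_algorithm)
# ...then compare the two histograms, e.g. by plotting them or with a chi-squared test.
```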
Or is it better to move this to a new PR? |
I guess I’d say if it looks like a maze it does not matter if it is strictly compatible with the old version or not. What is the difference to the numpy version, though? |
What do you mean? By refactoring the code I will use the random generator in a different way. For example, instead of doing this:

```python
ngaps = max(1, ngaps)
wall_pos = list(range(max_length))
rng.shuffle(wall_pos)
gaps_pos = wall_pos[:ngaps]
for gap in gaps_pos:
    if vertical:
        wall.discard((pos, ymin+gap))
    else:
        wall.discard((xmin+gap, pos))
```

I will do something like this (untested):

```python
gaps = rng.sample(wall, k=max(1, ngaps))
for gap in gaps:
    wall.remove(gap)
```

which is nicer and more readable. Unfortunately the selected gap positions will be different in the two cases, because |
I also have the hard-core version (untested):

```python
wall = wall.difference(rng.sample(wall, k=max(1, ngaps)))
```

which is maybe a little too terse ;-) |
Maybe you could rewrite the numpy version first to also use |
I suspect that this measure won't be helpful. Since it is about deciding whether the mazes are satisfactory or not, I would have just eyeballed it. We already experienced that our intuition is quite good about a "good" - or better, a "bad" - maze. 😏
Just push it here. It is still about optimizing the maze generation. 🙂
I feel that this would be too much. Personally, I don't see any difference between

```python
rng.shuffle(wall)
gaps = wall[:ngaps]
```

and

```python
gaps = rng.sample(wall, k=ngaps)
```

So, IMO just go for the rewrite. 👍 |
Really cool. I like how this evolves. 😁 But, did you check this:
or cut in half:
Before, when we searched in this layout for chambers only on one half, it would wrongly detect the dotted region as a chamber. |
For reference, the corresponding routines from Python's `random` module:

```python
# shuffle
for i in reversed(range(1, len(x))):
    # pick an element in x[: i + 1] with which to exchange x[i]
    j = randbelow(i + 1)
    x[i], x[j] = x[j], x[i]
```

```python
# sample
n = len(x)
result = [None] * k
pool = list(x)
for i in range(k):
    j = randbelow(n - i)
    result[i] = pool[j]
    pool[j] = pool[n - i - 1]  # move non-selected item into vacancy
```

I generated some walls:

shuffled

sampled

sample_walls.py

```python
from random import Random

NWALLS = 6
NGAPS = NWALLS // 2

rng = Random()


def get_walls(gaps):
    walls = ["██"] * NWALLS
    for gap in gaps:
        walls[gap] = "╶╴"
    return "".join(walls)


with open("shuffled.md", "w+") as fshuffle, open("sampled.md", "w+") as fsample:
    for i in range(50):
        shuffle_x = list(range(NWALLS))
        sample_x = list(range(NWALLS))

        rng.shuffle(shuffle_x)
        shuffled_gaps = shuffle_x[:NGAPS]
        sampled_gaps = rng.sample(sample_x, k=NGAPS)

        print(f"`{get_walls(shuffled_gaps)}`", file=fshuffle)
        print(f"`{get_walls(sampled_gaps)}`", file=fsample)
```
 |
I tested it this way: generate 1000 mazes with the code that works on full mazes, including distribution of food. Generate 1000 mazes with the code that works on half mazes. Verify that the food is distributed in exactly the same way. The problem we had does not happen anymore because I cut the maze in half plus one column of the right border. I think. |
I tested it again. The way I implemented it, it always works because the right side of the maze has no walls except for the external ones. This way any wall that may be located at the left border can always be walked around by going to the right side. The example you posted correctly detects no chambers if you delete all the walls on the right side. |
Ouh, that's brilliant! Super neat that this works now with half the effort. 🤩 |
It just hit me: This is the definition of the routine in `random.sample`:

```python
n = len(x)
result = [None] * k
pool = list(x)
for i in range(k):
    j = randbelow(n - i)
    result[i] = pool[j]
    pool[j] = pool[n - i - 1]
```

Now let's rewrite this for `k = n`:

```python
n = len(x)
result = [None] * n              # <--
pool = list(x)
for i in range(n):               # <--
    j = randbelow(n - i)
    result[i] = pool[j]
    pool[j] = pool[n - i - 1]
```

Next, we redefine the loop variable, substituting `i` → `n - i - 1`:

```python
n = len(x)
result = [None] * n
pool = list(x)
for i in reversed(range(1, n)):          # <-- i in [1, n), i.e. reversed (n, 1]
    j = randbelow(n - (n - i - 1))       # <--
    result[n - i - 1] = pool[j]          # <--
    pool[j] = pool[n - (n - i - 1) - 1]  # <--
```

which becomes

```python
n = len(x)
result = [None] * n
pool = list(x)
for i in reversed(range(1, n)):  # <-- i in [1, n), i.e. reversed (n, 1]
    j = randbelow(i + 1)         # <--
    result[n - i - 1] = pool[j]  # <--
    pool[j] = pool[i]            # <--
```

Now let's rearrange the code a bit:

```python
n = len(x)
result = [None] * n
pool = list(x)
for i in reversed(range(1, n)):
    j = randbelow(i + 1)
    result[n - i - 1], pool[j] = pool[j], pool[i]  # <--
```

The reasoning why

```python
n = len(x)
result = [None] * n
pool = list(x)
for i in reversed(range(1, n)):
    j = randbelow(i + 1)
    result[i], pool[j] = pool[j], pool[i]  # <--
```

Since we know by using

```python
n = len(x)
for i in reversed(range(1, n)):
    j = randbelow(i + 1)
    x[i], x[j] = x[j], x[i]  # <--
```

Et voilà, we have our definition of `shuffle`.

Following my script, I wondered whether the sampled gaps are equal to the reverse of shuffled gaps when using the same seed:

```python
SEED = 1
rng = Random(SEED)

rng.shuffle(wall)
shuffled_gaps = wall[:ngaps]

sampled_gaps = rng.sample(wall, k=ngaps)

sampled_gaps == shuffled_gaps[::-1]
```

but it seems to make a difference whether the code runs through

But yeah @otizonaizit, I hope to have removed any of your doubts about rewriting the code from

```python
ngaps = max(1, ngaps)
wall_pos = list(range(max_length))
rng.shuffle(wall_pos)
gaps_pos = wall_pos[:ngaps]
for gap in gaps_pos:
    if vertical:
        wall.discard((pos, ymin+gap))
    else:
        wall.discard((xmin+gap, pos))
```

to

```python
gaps = rng.sample(wall, k=max(1, ngaps))
for gap in gaps:
    wall.remove(gap)
```

I proved that it does not break the algorithm. 😁 |
Hey guys,
analogous to #852, I tried to use the first version of the optimized chamber-finding algorithm in the maze generation as well.
The timings went from several seconds down to less than half a second consistently for script execution.
However, looking at the profiling, only 5% (25 ms) is actually maze generation; the rest of the time is spent in importing:
absolute times
relative times
Would that be fast enough to generate mazes on-the-fly?
For now, I think this would not make the layout database obsolete as one still does not have direct control over the number of dead ends and chambers.