@@ -1,34 +1,23 @@
  [
+ " About this project" ,
+ " " ,
  " ----------------------------------------------------------------------------------------------------" ,
  " This project involves using model-based reinforcement learning to solve the 15-puzzle problem, " ,
  " a sliding puzzle where the goal is to arrange numbered tiles in the correct order. The approach " ,
  " relies on a Q-Table, a tabular data structure that stores values representing the expected utility of " ,
  " taking specific actions in each possible state. The Q-Table helps the learning agent make decisions " ,
  " by providing a way to evaluate which actions lead to more favorable outcomes over time." ,
- " " ,
- " In this implementation, key parameters like learning rate and discount factor control how the " ,
- " agent updates its understanding of the environment. The learning rate (learningRate in " ,
- " config.json) determines how much new information overrides old knowledge, while the " ,
- " discount factor (discount) influences how future rewards are valued relative to immediate ones." ,
- " The configuration allows for the generation of a specified number of training examples (or " ,
- " 'lessons') to teach the model through repeated trials. For each lesson, various parameters such as " ,
- " goals, start positions, and locked elements are defined, guiding the agent’s training process." ,
- " The settings in config.json define these learning conditions, providing a structured approach to " ,
- " gradually mastering the puzzle." ,
  " ----------------------------------------------------------------------------------------------------" ,
  " " ,
- " about puzzle - https://en.wikipedia.org/wiki/15_puzzle" ,
- " project source code - https://github.com/zc42/ml-puzzle-15-v1" ,
- " java port - https://github.com/zc42/game_15_java/blob/main/README.md" ,
- " python port for kaggle - https://www.kaggle.com/code/zilvinasc/reinforced-learning-qtable-puzzle-15" ,
- " development history - https://github.com/zc42/stuff" ,
+ " About puzzle - https://en.wikipedia.org/wiki/15_puzzle" ,
+ " Project source code - https://github.com/zc42/ml-puzzle-15-v1" ,
+ " Java port - https://github.com/zc42/game_15_java/blob/main/README.md" ,
+ " Python port for kaggle - https://www.kaggle.com/code/zilvinasc/reinforced-learning-qtable-puzzle-15" ,
+ " Development history - https://github.com/zc42/stuff" ,
  " ----------------------------------------------------------------------------------------------------" ,
  " " ,
  " " ,
- " CONFIGURATION HELP" ,
- " " ,
- " " ,
- " Configuration is editable." ,
+ " Project configuration properties (configuration is editable)" ,
  " ----------------------------------------------------------------------------------------------------" ,
  " usePretrainedDataWhileTesting - boolean, true or false." ,
  " When set to false, the q-table data learned in this session will be used." ,
@@ -49,10 +38,10 @@
  " startPositions - number array, starting positions for the empty tile." ,
  " lockedElements - number array, tiles that are fixed and cannot be moved, usually representing the goals from previous lessons." ,
  " For all number arrays, valid values are 1 to 15, the movable tile numbers in the puzzle." ,
- " " ,
+ " ------------------- " ,
  " lessonsToGenerate - number, the number of episodes that will be generated for this particular lesson in each training cycle." ,
  " ----------------------------------------------------------------------------------------------------" ,
- " ---------------------------------------------------------------------------------------------------- " ,
+ " " ,
  " " ,
  " Some theory" ,
  " ----------------------------------------------------------------------------------------------------" ,
@@ -77,12 +66,11 @@
  " The Q-Table helps an agent learn which actions to take in each state to maximize its cumulative reward over time." ,
  " By iteratively updating the Q-values based on experiences, the agent learns an optimal policy for decision-making." ,
  " ----------------------------------------------------------------------------------------------------" ,
- " " ,
  " Model-based reinforcement learning" ,
  " " ,
  " Model-based reinforcement learning is an approach where the agent learns a model of the " ,
  " environment, including how actions affect states and the rewards they yield. The agent uses " ,
  " this model to simulate future outcomes and plan its actions, making decisions based on predictions " ,
- " rather than just past experiences. This can improve learning efficiency since the agent can learn " ,
- " from simulated interactions, but it depends on the accuracy of the learned model."
+ " rather than just past experiences. This can improve learning efficiency since the agent can learn " ,
+ " from simulated interactions, but it depends on the accuracy of the learned model."
  ]
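
The help text above describes iteratively updating Q-values under a learning rate and a discount factor. As a rough illustration only, here is a minimal sketch of the standard tabular Q-learning update in Python. It is not taken from the project source: the function name `q_update`, the action names, and the string state keys are hypothetical, and the `learning_rate` and `discount` arguments are assumed to play the roles of `learningRate` and `discount` in config.json.

```python
from collections import defaultdict

# Hypothetical action names: directions in which the empty tile can move.
ACTIONS = ("up", "down", "left", "right")


def q_update(q_table, state, action, reward, next_state, learning_rate, discount):
    """Apply one standard tabular Q-learning update and return the new Q-value."""
    # Best value the agent currently expects from the next state.
    best_next = max(q_table[next_state][a] for a in ACTIONS)
    old_value = q_table[state][action]
    # Target: immediate reward plus discounted future value.
    target = reward + discount * best_next
    # Move the old estimate toward the target by a fraction learning_rate.
    q_table[state][action] = old_value + learning_rate * (target - old_value)
    return q_table[state][action]


# Q-table: state -> {action -> value}; unseen states start at 0.0 (an assumption).
q_table = defaultdict(lambda: {a: 0.0 for a in ACTIONS})

first = q_update(q_table, "s0", "left", reward=1.0, next_state="s1",
                 learning_rate=0.5, discount=0.9)
second = q_update(q_table, "s_prev", "up", reward=0.0, next_state="s0",
                  learning_rate=0.5, discount=0.9)
```

With these made-up numbers, the second call shows how a reward earned in `s0` propagates one step back to `s_prev` through the discounted term. A higher `learning_rate` lets new information override the old estimate faster, and a higher `discount` gives future rewards more weight.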