Commit 72a152c

author
zilvinas
committed
..
1 parent 514ebdb commit 72a152c

2 files changed

+24
-48
lines changed

dist/about.json

Lines changed: 12 additions & 24 deletions
@@ -1,34 +1,23 @@
11
[
2+
"About this project",
3+
"",
24
"----------------------------------------------------------------------------------------------------",
35
"This project involves using model-based reinforcement learning to solve the 15-puzzle problem, ",
46
"a sliding puzzle where the goal is to arrange numbered tiles in the correct order. The approach ",
57
"relies on a Q-Table, a tabular data structure that stores values representing the expected utility of ",
68
"taking specific actions in each possible state. The Q-Table helps the learning agent make decisions ",
79
"by providing a way to evaluate which actions lead to more favorable outcomes over time.",
8-
"",
9-
"In this implementation, key parameters like learning rate and discount factor control how the ",
10-
"agent updates its understanding of the environment. The learning rate (learningRate in ",
11-
"config.json) determines how much new information overrides old knowledge, while the ",
12-
"discount factor (discount) influences how future rewards are valued relative to immediate ones.",
13-
"The configuration allows for the generation of a specified number of training examples (or ",
14-
"'lessons') to teach the model through repeated trials. For each lesson, various parameters such as ",
15-
"goals, start positions, and locked elements are defined, guiding the agent’s training process.",
16-
"The settings in config.json define these learning conditions, providing a structured approach to ",
17-
"gradually mastering the puzzle.",
1810
"----------------------------------------------------------------------------------------------------",
1911
"",
20-
"about puzzle - https://en.wikipedia.org/wiki/15_puzzle",
21-
"project source code - https://github.com/zc42/ml-puzzle-15-v1",
22-
"java port - https://github.com/zc42/game_15_java/blob/main/README.md",
23-
"python port for kaggle - https://www.kaggle.com/code/zilvinasc/reinforced-learning-qtable-puzzle-15",
24-
"development history - https://github.com/zc42/stuff",
12+
"About puzzle - https://en.wikipedia.org/wiki/15_puzzle",
13+
"Project source code - https://github.com/zc42/ml-puzzle-15-v1",
14+
"Java port - https://github.com/zc42/game_15_java/blob/main/README.md",
15+
"Python port for kaggle - https://www.kaggle.com/code/zilvinasc/reinforced-learning-qtable-puzzle-15",
16+
"Development history - https://github.com/zc42/stuff",
2517
"----------------------------------------------------------------------------------------------------",
2618
"",
2719
"",
28-
"CONFIGURATION HELP",
29-
"",
30-
"",
31-
"Configuration is editable.",
20+
"Project configuration properties (configuration is editable)",
3221
"----------------------------------------------------------------------------------------------------",
3322
"usePretrainedDataWhileTesting - boolean, true or false.",
3423
"When set to false, the q-table data learned in this session will be used.",
@@ -49,10 +38,10 @@
4938
"startPositions - number array, starting positions for the empty tile.",
5039
"lockedElements - number array, tiles that are fixed and cannot be moved, usually representing the goals from previous lessons.",
5140
"All number arrays, valid numbers are from 1 to 15, movable tile nubers in a puzzle.",
52-
"",
41+
"-------------------",
5342
"lessonsToGenerate - number, the number of episodes that will be generated for this particular lesson in each training cycle.",
5443
"----------------------------------------------------------------------------------------------------",
55-
"----------------------------------------------------------------------------------------------------",
44+
"",
5645
"",
5746
"Some theory",
5847
"----------------------------------------------------------------------------------------------------",
@@ -77,12 +66,11 @@
7766
"The Q-Table helps an agent learn which actions to take in each state to maximize its cumulative reward over time.",
7867
"By iteratively updating the Q-values based on experiences, the agent learns an optimal policy for decision-making.",
7968
"----------------------------------------------------------------------------------------------------",
80-
"",
8169
"Model-based reinforcement learning",
8270
"",
8371
"Model-based reinforcement learning is an approach where the agent learns a model of the ",
8472
"environment, including how actions affect states and the rewards they yield. The agent uses ",
8573
"this model to simulate future outcomes and plan its actions, making decisions based on predictions ",
86-
"rather than just past experiences. This can improve learning efficiency since the agent can learn ",
87-
"from simulated interactions, but it depends on the accuracy of the learned model."
74+
"rather than just past experiences. This can improve learning efficiency since the agent can learn ",
75+
"from simulated interactions, but it depends on the accuracy of the learned model."
8876
]
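
The help text above states that every number array in a lesson must contain values from 1 to 15. A minimal sketch of that check follows, in Python for brevity. The function name `invalid_tile_numbers` is hypothetical, and the `goals` key and the flat lesson layout are assumptions, since the diff does not show the real shape of config.json.

```python
# Hypothetical helper: the flat lesson layout and the "goals" key name are
# assumptions; only startPositions and lockedElements appear in the diff.
def invalid_tile_numbers(lesson, keys=("goals", "startPositions", "lockedElements")):
    """Return {key: [bad values]} for array entries outside the range 1..15."""
    problems = {}
    for key in keys:
        bad = [
            n for n in lesson.get(key, [])
            if not (isinstance(n, int) and 1 <= n <= 15)
        ]
        if bad:
            problems[key] = bad
    return problems


# Example lesson with one out-of-range start position (16).
lesson = {
    "goals": [1, 2],
    "startPositions": [16, 5],
    "lockedElements": [],
    "lessonsToGenerate": 100,
}
problems = invalid_tile_numbers(lesson)
```

An empty result means all arrays pass the range check; missing keys are treated as empty arrays.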

public/about.json

Lines changed: 12 additions & 24 deletions
@@ -1,34 +1,23 @@
11
[
2+
"About this project",
3+
"",
24
"----------------------------------------------------------------------------------------------------",
35
"This project involves using model-based reinforcement learning to solve the 15-puzzle problem, ",
46
"a sliding puzzle where the goal is to arrange numbered tiles in the correct order. The approach ",
57
"relies on a Q-Table, a tabular data structure that stores values representing the expected utility of ",
68
"taking specific actions in each possible state. The Q-Table helps the learning agent make decisions ",
79
"by providing a way to evaluate which actions lead to more favorable outcomes over time.",
8-
"",
9-
"In this implementation, key parameters like learning rate and discount factor control how the ",
10-
"agent updates its understanding of the environment. The learning rate (learningRate in ",
11-
"config.json) determines how much new information overrides old knowledge, while the ",
12-
"discount factor (discount) influences how future rewards are valued relative to immediate ones.",
13-
"The configuration allows for the generation of a specified number of training examples (or ",
14-
"'lessons') to teach the model through repeated trials. For each lesson, various parameters such as ",
15-
"goals, start positions, and locked elements are defined, guiding the agent’s training process.",
16-
"The settings in config.json define these learning conditions, providing a structured approach to ",
17-
"gradually mastering the puzzle.",
1810
"----------------------------------------------------------------------------------------------------",
1911
"",
20-
"about puzzle - https://en.wikipedia.org/wiki/15_puzzle",
21-
"project source code - https://github.com/zc42/ml-puzzle-15-v1",
22-
"java port - https://github.com/zc42/game_15_java/blob/main/README.md",
23-
"python port for kaggle - https://www.kaggle.com/code/zilvinasc/reinforced-learning-qtable-puzzle-15",
24-
"development history - https://github.com/zc42/stuff",
12+
"About puzzle - https://en.wikipedia.org/wiki/15_puzzle",
13+
"Project source code - https://github.com/zc42/ml-puzzle-15-v1",
14+
"Java port - https://github.com/zc42/game_15_java/blob/main/README.md",
15+
"Python port for kaggle - https://www.kaggle.com/code/zilvinasc/reinforced-learning-qtable-puzzle-15",
16+
"Development history - https://github.com/zc42/stuff",
2517
"----------------------------------------------------------------------------------------------------",
2618
"",
2719
"",
28-
"CONFIGURATION HELP",
29-
"",
30-
"",
31-
"Configuration is editable.",
20+
"Project configuration properties (configuration is editable)",
3221
"----------------------------------------------------------------------------------------------------",
3322
"usePretrainedDataWhileTesting - boolean, true or false.",
3423
"When set to false, the q-table data learned in this session will be used.",
@@ -49,10 +38,10 @@
4938
"startPositions - number array, starting positions for the empty tile.",
5039
"lockedElements - number array, tiles that are fixed and cannot be moved, usually representing the goals from previous lessons.",
5140
"All number arrays, valid numbers are from 1 to 15, movable tile nubers in a puzzle.",
52-
"",
41+
"-------------------",
5342
"lessonsToGenerate - number, the number of episodes that will be generated for this particular lesson in each training cycle.",
5443
"----------------------------------------------------------------------------------------------------",
55-
"----------------------------------------------------------------------------------------------------",
44+
"",
5645
"",
5746
"Some theory",
5847
"----------------------------------------------------------------------------------------------------",
@@ -77,12 +66,11 @@
7766
"The Q-Table helps an agent learn which actions to take in each state to maximize its cumulative reward over time.",
7867
"By iteratively updating the Q-values based on experiences, the agent learns an optimal policy for decision-making.",
7968
"----------------------------------------------------------------------------------------------------",
80-
"",
8169
"Model-based reinforcement learning",
8270
"",
8371
"Model-based reinforcement learning is an approach where the agent learns a model of the ",
8472
"environment, including how actions affect states and the rewards they yield. The agent uses ",
8573
"this model to simulate future outcomes and plan its actions, making decisions based on predictions ",
86-
"rather than just past experiences. This can improve learning efficiency since the agent can learn ",
87-
"from simulated interactions, but it depends on the accuracy of the learned model."
74+
"rather than just past experiences. This can improve learning efficiency since the agent can learn ",
75+
"from simulated interactions, but it depends on the accuracy of the learned model."
8876
]
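
The about text describes a Q-Table whose values are updated from experience, with a learning rate and a discount factor (named `learningRate` and `discount` in the removed paragraph). A minimal sketch of the standard tabular Q-learning update follows. It is not taken from the project's source: the function name `q_update`, the dictionary-based table, and the default parameter values are assumptions for illustration.

```python
# Illustrative sketch of the textbook tabular Q-learning update; not the
# project's actual code. Names and default values are assumptions.
def q_update(q_table, state, action, reward, next_state, next_actions,
             learning_rate=0.1, discount=0.9):
    """Update q_table[(state, action)] in place and return the new value."""
    # Best value reachable from the next state (0.0 if nothing is known yet).
    best_next = max(
        (q_table.get((next_state, a), 0.0) for a in next_actions),
        default=0.0,
    )
    old_value = q_table.get((state, action), 0.0)
    # Move the old estimate toward the target by a fraction learning_rate.
    target = reward + discount * best_next
    new_value = old_value + learning_rate * (target - old_value)
    q_table[(state, action)] = new_value
    return new_value


q = {}
first = q_update(q, "s0", "left", 1.0, "s1", ["left", "right"])

# Pretend a later state already has a known value, then update again.
q[("s1", "right")] = 2.0
second = q_update(q, "s0", "left", 0.0, "s1", ["left", "right"])
```

A higher learning rate makes new information override the old estimate faster, and a higher discount gives more weight to the value of the next state relative to the immediate reward.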
