diff --git a/Demos/Stratego/cartpole/README.md b/Demos/Stratego/cartpole/README.md
new file mode 100644
index 0000000..9cf7e74
--- /dev/null
+++ b/Demos/Stratego/cartpole/README.md
@@ -0,0 +1,107 @@
+# CartPole model for UPPAAL Stratego
+
+This model is a [UPPAAL Stratego][1] implementation of the classical
+reinforcement learning problem in which an agent has to balance a pole vertically
+on a moving cart. The model corresponds to the [Gymnasium environment][4] that
+is widely used for testing RL techniques.
+
+![cartpole-gif](./reinforcement-learning-cartpole-v0.gif)
+([source][2])
+
+
+## Model description
+
+The model consists of a constantly moving cart that can move either left or
+right on a flat, horizontal surface. A pole balances on top of the cart, and
+the learning agent has to keep the pole standing upright by changing the
+direction of the cart without letting it travel too far in either direction.
+The agent is handed control every 0.02 seconds and then has to decide in which
+direction to push the cart.
+
+![agent template](./imgs/agent-screenshot.png)
+
+The observable state space for the agent is 4-dimensional and consists of the
+position and velocity of the cart, the angle of the pole and the angular
+velocity with which the pole is falling. In the initial location of the model,
+all these variables are instantiated with a random value between -0.05 and
+0.05. The main location of the CartPole model is Alive. Here, the position of
+the cart and the angle of the pole change according to the corresponding
+velocities, while the velocities themselves are updated by a set of functions
+described in detail in [the paper][3] by Florian, R. (2007). These functions
+are quite intricate and depend on the mass of the cart, the mass and length of
+the pole, and the magnitude of the applied force (the Python sketch further
+below gives a rough idea of the computation).
+
+In this model, the functions are written so that it is easy to change these
+values, but by default they are set to match the configuration of the Gymnasium
+environment that this model is meant to correspond to.
+
+![acceleration functions](./imgs/functions-screenshot.png)
+
+Whenever the agent takes an action, the CartPole model visits the IsDead
+location, where it is checked whether the agent has lost or is able to
+continue. There are four ways to lose: the cart is too far to the left, the
+cart is too far to the right, the pole has dropped too low on the left, or the
+pole has dropped too low on the right. If none of these apply, the model
+transitions back to the Alive location. Otherwise, the agent is terminated, the
+system is reset and the death counter is increased by one.
+
+![cartpole template](./imgs/cartpole-screenshot.png)
+
+
+## Training and evaluation
+
+For evaluation, we start by getting UPPAAL Stratego to calculate the expected
+number of deaths when we apply random control. The query `E[<=10;1000]
+(max: CartPole.num_deaths)` estimates the maximal number of deaths over 10
+seconds on the basis of 1000 simulated runs. We see that this estimate lies at
+around 23, which indicates a pretty poor control strategy.
+
+We therefore train a strategy that has access to the aforementioned state
+variables and whose objective is to minimize the number of deaths. The query
+`strategy StayAlive = minE (CartPole.num_deaths) [<=10] {} ->
+{CartPole.cart_pos, CartPole.cart_vel, CartPole.pole_ang, CartPole.pole_vel}:
+<> time >= 10` does the job.
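+
+As an aside, the acceleration functions referred to in the model description are
+essentially the Euler-integrated cart-pole dynamics used by the Gymnasium
+environment. Below is a minimal Python sketch of that update, assuming the
+default constants (cart mass 1.0, pole mass 0.1, pole half-length 0.5, force
+10.0, timestep 0.02); the function and constant names are illustrative and do
+not necessarily match the UPPAAL declarations.
+
+```python
+import math
+
+# Assumed defaults, matching the standard Gymnasium CartPole configuration.
+GRAVITY = 9.8
+CART_MASS = 1.0
+POLE_MASS = 0.1
+TOTAL_MASS = CART_MASS + POLE_MASS
+POLE_HALF_LENGTH = 0.5
+FORCE_MAG = 10.0
+TAU = 0.02  # seconds between agent decisions
+
+
+def step(cart_pos, cart_vel, pole_ang, pole_vel, push_right):
+    """One Euler step of the cart-pole dynamics (Florian, 2007)."""
+    force = FORCE_MAG if push_right else -FORCE_MAG
+    cos_a, sin_a = math.cos(pole_ang), math.sin(pole_ang)
+
+    # Accelerations of the pole (angular) and the cart (linear).
+    temp = (force + POLE_MASS * POLE_HALF_LENGTH * pole_vel ** 2 * sin_a) / TOTAL_MASS
+    pole_acc = (GRAVITY * sin_a - cos_a * temp) / (
+        POLE_HALF_LENGTH * (4.0 / 3.0 - POLE_MASS * cos_a ** 2 / TOTAL_MASS)
+    )
+    cart_acc = temp - POLE_MASS * POLE_HALF_LENGTH * pole_acc * cos_a / TOTAL_MASS
+
+    # Euler integration, as in the Gymnasium implementation.
+    cart_pos += TAU * cart_vel
+    cart_vel += TAU * cart_acc
+    pole_ang += TAU * pole_vel
+    pole_vel += TAU * pole_acc
+    return cart_pos, cart_vel, pole_ang, pole_vel
+```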
+
+Now we can rerun the estimation query, this time appending `under StayAlive`
+to it in order to utilize our newfound strategy. We now see dramatically
+better performance, as the estimated maximal number of deaths is less than
+0.01 (the exact value may vary slightly between executions of the query)! This
+means that UPPAAL Stratego has done what every other RL technique worth its
+salt should be able to do: solve the cartpole problem!
+
+
+## Evaluating in the Gymnasium environment
+
+We can verify that the strategy learned by UPPAAL Stratego is actually useful
+outside Stratego. [Gymnasium][4] is a widely used Python framework for working
+with a wide range of standard RL environments such as cartpole, and since this
+model is based on that specific implementation, our Stratego strategies should
+work in the Gymnasium setting. The directory of this model contains some Python
+files that allow you to try this out!
+
+First, be a good pythonista and create and activate a virtual environment using
+your favorite method (or just the easy method: `python3 -m venv ./env
+&& source env/bin/activate`). Then run the command `pip install -r
+requirements.txt`, which will install Gymnasium and [stratetrees][5], a small
+package designed for working with UPPAAL Stratego strategies in a Python
+setting.
+
+Now you need to save a UPPAAL strategy as a JSON file. In UPPAAL, create a new
+query that says `saveStrategy("/path/to/somewhere/strategy.json", StayAlive)`
+(if your strategy is not called 'StayAlive', you should obviously write
+something else at the end) and run it. Now you should be able to run the command
+`python main.py --strategy /path/to/somewhere/strategy.json` and see your UPPAAL
+Stratego strategy being applied to the Gymnasium environment!
+
+As the output suggests, a perfect score would be a mean of 500. That
+will probably not be the case (more likely something like 490). This indicates
+that the UPPAAL strategy is not perfect and has not solved the problem
+completely. You can tinker with the learning parameters in UPPAAL
+(Options -> Learning parameters...) or increase the number of seconds in the
+training query above (e.g. change 10 to 20 to force it to learn to balance for
+a longer time period).
+
+[1]: https://people.cs.aau.dk/~marius/stratego/
+[2]: https://tenor.com/view/reinforcement-learning-cartpole-v0-tensorflow-open-ai-gif-18474251
+[3]: https://coneural.org/florian/papers/05_cart_pole.pdf
+[4]: https://github.com/Farama-Foundation/Gymnasium/blob/main/gymnasium/envs/classic_control/cartpole.py
+[5]: https://pypi.org/project/stratetrees/
diff --git a/Demos/Stratego/cartpole/cartpole.xml b/Demos/Stratego/cartpole/cartpole.xml
new file mode 100644
index 0000000..f78a340
--- /dev/null
+++ b/Demos/Stratego/cartpole/cartpole.xml
@@ -0,0 +1,490 @@
+
+
+
+/**
+Implementation of the classical CartPole environment.
+Author: Andreas Holck Høeg-Petersen
+
+Based on the Gymnasium env (https://github.com/Farama-Foundation/Gymnasium/blob/main/gymnasium/envs/classic_control/cartpole.py),
+which uses the equations from Florian, R. 2007 (https://coneural.org/florian/papers/05_cart_pole.pdf).
+
+The system models a cart that can be pushed left or right with a pole balancing
+on top of the cart. The objective of the learning agent is to keep the pole
+from falling for 10 seconds without pushing the cart too far in either
+direction. Every time the agent fails, the system is reset and time continues.
+We thus count the number of deaths during a single run and seek to minimize +this number. +**/ + +broadcast chan left, right; +clock time; + + + system Agent, CartPole; + + + + // How often do we die with random control? + + + + E[<=10;1000] (max: CartPole.num_deaths) + Expected number of deaths with random control. +Should finish in a couple of seconds and be around 23. + +
23.77 ± 0.162791 (95% CI)
+	[Plot data for this query's result: probability density, probability distribution and cumulative probability series with 95% confidence bounds, plus a frequency histogram over 1000 runs; sample span [15, 31]; mean estimate 23.77 ± 0.16279 (95% CI); parameters α=0.05, ε=0.05, bucket width=1, bucket count=17.]
+
+
+
+
+
+
+// Train strategy
+
+
+
+strategy StayAlive = minE (CartPole.num_deaths) [<=10] {} -> {CartPole.cart_pos, CartPole.cart_vel, CartPole.pole_ang, CartPole.pole_vel}: <> time >= 10
+Train a strategy from a partially observable state space that minimizes the number of deaths over a 10-second run.
+Should find a strategy within a minute.
+
+
+
+
+
+
+
+
+// How often do we die with trained controller?
+
+
+
+E[<=10;1000] (max: CartPole.num_deaths) under StayAlive
+Expected number of deaths under the well-trained agent.
+Should finish in a couple of seconds and be around 0.01.
+
0.001 ± 0.00196234 (95% CI)
+	[Plot data for this query's result: probability density, probability distribution and cumulative probability series with 95% confidence bounds, plus a frequency histogram over 1000 runs; sample span [0, 1]; mean estimate 0.001 ± 0.00196 (95% CI); parameters α=0.05, ε=0.05, bucket width=1, bucket count=2.]
+
+
+
diff --git a/Demos/Stratego/cartpole/imgs/agent-screenshot.png b/Demos/Stratego/cartpole/imgs/agent-screenshot.png new file mode 100644 index 0000000..337cc13 Binary files /dev/null and b/Demos/Stratego/cartpole/imgs/agent-screenshot.png differ diff --git a/Demos/Stratego/cartpole/imgs/cartpole-screenshot.png b/Demos/Stratego/cartpole/imgs/cartpole-screenshot.png new file mode 100644 index 0000000..0ec8aa0 Binary files /dev/null and b/Demos/Stratego/cartpole/imgs/cartpole-screenshot.png differ diff --git a/Demos/Stratego/cartpole/imgs/functions-screenshot.png b/Demos/Stratego/cartpole/imgs/functions-screenshot.png new file mode 100644 index 0000000..c3ec0dd Binary files /dev/null and b/Demos/Stratego/cartpole/imgs/functions-screenshot.png differ diff --git a/Demos/Stratego/cartpole/main.py b/Demos/Stratego/cartpole/main.py new file mode 100644 index 0000000..4d0ad36 --- /dev/null +++ b/Demos/Stratego/cartpole/main.py @@ -0,0 +1,52 @@ +import argparse +import numpy as np +import gymnasium as gym + +from tqdm import tqdm +from stratetrees.models import QTree + + +def parse_args(): + parser = argparse.ArgumentParser() + parser.add_argument( + '--strategy', '-s', type=str, + help='Path to the json file containing the UPPAAL Stratego strategy' + ) + parser.add_argument( + '--epochs', '-e', type=int, default=1000, + help='Number of epochs to run' + ) + return parser.parse_args() + + +if __name__ == '__main__': + args = parse_args() + model = QTree(args.strategy) + + env = gym.make('CartPole-v1') + + print(f'Evaluating {args.strategy} on {args.epochs} episodes') + + rewards = [] + for epoch in tqdm(range(args.epochs)): + + obs, info = env.reset() + epoch_reward = 0 + + done, trunc = False, False + while not (done or trunc): + + action = model.predict(obs) + obs, reward, done, trunc, info = env.step(action) + + epoch_reward += reward + + rewards.append(epoch_reward) + + print() + print('Mean score: {} (+/- {})'.format( + np.round(np.mean(rewards), 2), + np.round(np.std(rewards), 2) + )) + print('(perfect score is 500)') + print() diff --git a/Demos/Stratego/cartpole/reinforcement-learning-cartpole-v0.gif b/Demos/Stratego/cartpole/reinforcement-learning-cartpole-v0.gif new file mode 100644 index 0000000..44e99c4 Binary files /dev/null and b/Demos/Stratego/cartpole/reinforcement-learning-cartpole-v0.gif differ diff --git a/Demos/Stratego/cartpole/requirements.txt b/Demos/Stratego/cartpole/requirements.txt new file mode 100644 index 0000000..90e3503 --- /dev/null +++ b/Demos/Stratego/cartpole/requirements.txt @@ -0,0 +1,2 @@ +gymnasium==0.28.1 +stratetrees==0.0.1