Code for reproducing the results presented in my bachelor's thesis, "Modeling Head-Bobbing in Pigeon Locomotion using Reinforcement Learning" (PDF file). Refer to the thesis for details regarding the experiments in this repository.

We used the soft actor-critic (SAC) implementation in RLkit by vitchyr and a proximal policy optimization (PPO) implementation built on top of RLkit, RlkitExtension, to train the pigeon models; both are included as sub-repositories.

Head-bobbing is a behavior unique to forward locomotion in small birds, most notably pigeons. It consists of a hold phase, in which the bird locks its head into one position, and a thrust phase, in which it moves the head to a different position. Two main functionalities of the behavior have been proposed in prior research: visual stabilization and the induction of motion parallax. However, little research has attempted to validate their sufficiency by reproducing the behavior in environments that take physics into account. In our research, we construct a simplified pigeon model consisting of a head, neck, and body, and validate the proposed hypotheses regarding the functionalities of the behavior using reinforcement learning.
We constructed two OpenAI Gym environments that utilize the simplified pigeon model.

- `PigeonEnv3Joints` (code) - The pigeon model is tasked with moving its head to predefined target locations.
  - The head follows a path that represents head-bobbing behavior.
  - Reward functions (see the sketch after this list):
    - `head_stable_manual_reposition`
      - Negative distance between the target locations and the position of the head, relative to a threshold value `max_offset`.
    - `head_stable_manual_reposition_strict_angle`
      - Stricter version of `head_stable_manual_reposition`.
      - Rewards are only produced when the angle of the head-tilt is within 30 degrees.
- `PigeonRetinalEnv` (code) - Reward functions modeled on the two proposed functionalities of the head-bobbing behavior (see the sketch after this list):
  - `motion_parallax`
    - Sum of the velocities of external objects within the retina relative to each other.
    - Represents motion parallax induction during the thrust phase.
  - `retinal_stabilization`
    - Negative sum of the velocities of external objects within the retina.
    - Represents retinal stabilization during the hold phase.
  - `fifty_fifty`
    - Equally-weighted sum of `retinal_stabilization` and `motion_parallax`.
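The following is a minimal sketch of how these reward terms could be computed. All function and variable names here are illustrative assumptions, not the repository's actual API; refer to the thesis and the environment code for the exact definitions.

```python
import numpy as np

# Illustrative sketch only: names, shapes, and scaling are assumptions,
# not the repository's actual reward implementation.

def head_stable_manual_reposition(head_pos, target_pos, max_offset):
    # Negative distance between the head and the target location,
    # relative to the threshold max_offset.
    return max_offset - np.linalg.norm(np.asarray(head_pos) - np.asarray(target_pos))

def motion_parallax(retinal_velocities):
    # Sum of the velocities of objects on the retina relative to each
    # other (here: magnitudes of all pairwise differences).
    v = np.asarray(retinal_velocities, dtype=float)
    return float(np.sum(np.abs(v[:, None] - v[None, :]))) / 2.0

def retinal_stabilization(retinal_velocities):
    # Negative sum of the (absolute) retinal velocities of external objects.
    return -float(np.sum(np.abs(retinal_velocities)))

def fifty_fifty(retinal_velocities):
    # Equally-weighted sum of the two objectives above.
    return (0.5 * retinal_stabilization(retinal_velocities)
            + 0.5 * motion_parallax(retinal_velocities))
```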
Clone the repository.
```
git clone https://github.com/johnlime/pigeon_head_bob.git
```
Set the current directory to the repository's main directory.
```
cd pigeon_head_bob
```
Add the current directory to `PYTHONPATH`.
```
export PYTHONPATH=$PWD
```
The following instructions assume that reinforcement learning models are trained on Linux, while testing and visualization are conducted on macOS.

Run the following command to install dependencies on Linux.
```
conda env create -f conda_env/rlkit-manual-env-linux64gpu.yml
```
Activate the Anaconda environment.
```
conda activate rlkit-manual
```
Run the following command to install dependencies on macOS.
```
conda env create -f conda_env/pybox2d-rlkit-manual-env-mac.yml
```
Activate the Anaconda environment.
```
conda activate pybox2d-rlkit-manual
```
Optionally, you can construct an Anaconda environment solely for using the two pigeon environments (without the RLkit training).
```
conda env create -f conda_env/pigeon_minimal_env.yml
conda activate pigeon-env
```
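With the minimal environment active, the pigeon environments can also be driven directly through the standard Gym API. The sketch below makes some assumptions: the import path is hypothetical, and the scripts under `run/` remain the supported entry points.

```python
# Sketch only: the import path is a hypothetical assumption; use the
# scripts under run/ for the supported way to roll out policies.
from gym_env.pigeon_gym import PigeonEnv3Joints  # hypothetical module path

env = PigeonEnv3Joints()
obs = env.reset()
for _ in range(200):
    action = env.action_space.sample()          # random action
    obs, reward, done, info = env.step(action)  # classic Gym step signature
    env.render()
    if done:
        obs = env.reset()
env.close()
```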
Random policies can be run on the environments via the following command.
```
python run/pigeon_run.py -env <ENVIRONMENT>
```
- `-env`, `--environment` - Name of the environment to run
  - `PigeonEnv3Joints`
  - `PigeonRetinalEnv`
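For example, to roll out a random policy in the three-joint environment:
```
python run/pigeon_run.py -env PigeonEnv3Joints
```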
Run the following command for SAC training.
```
python run/sac.py -env <ENVIRONMENT> -bs <BODY_SPEED> -rc <REWARD_FUNCTION> -mo <MAX_OFFSET>
```
- `-env`, `--environment` - Name of the environment to run
  - `PigeonEnv3Joints`
  - `PigeonRetinalEnv`
- `-bs`, `--body_speed` - Body speed of the pigeon model
- `-rc`, `--reward_code` - Reward function associated with the chosen environment
  - `PigeonEnv3Joints`
    - `head_stable_manual_reposition`
    - `head_stable_manual_reposition_strict_angle`
  - `PigeonRetinalEnv`
    - `motion_parallax`
    - `retinal_stabilization`
    - `fifty_fifty`
- `-mo`, `--max_offset` - Max offset for aligning the head to the target
  - Not necessary for `PigeonRetinalEnv`; set to `0.0` by default

The resulting data are stored under `src/rlkit_ppo/data/`.
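For example, the following trains SAC on `PigeonRetinalEnv` with the motion-parallax reward; the body-speed value is an arbitrary placeholder, and `-mo` is omitted since it is not necessary for this environment:
```
python run/sac.py -env PigeonRetinalEnv -bs 1.0 -rc motion_parallax
```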
Run the following command for PPO training.
```
python run/ppo.py -env <ENVIRONMENT> -bs <BODY_SPEED> -rc <REWARD_FUNCTION> -mo <MAX_OFFSET>
```
- `-env`, `--environment` - Name of the environment to run
  - `PigeonEnv3Joints`
- `-bs`, `--body_speed` - Body speed of the pigeon model
- `-rc`, `--reward_code` - Reward function associated with the chosen environment
  - `head_stable_manual_reposition`
  - `head_stable_manual_reposition_strict_angle`
- `-mo`, `--max_offset` - Max offset for aligning the head to the target

The resulting data are stored under `src/rlkit_ppo/data/`.
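For example (the body speed and max offset values are arbitrary placeholders):
```
python run/ppo.py -env PigeonEnv3Joints -bs 1.0 -rc head_stable_manual_reposition -mo 1.0
```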
Run the following command after training the reinforcement learning policies.
```
python run/pigeon_run.py -env <ENVIRONMENT> -dir <DIRECTORY_PATH> -bs <BODY_SPEED> -rc <REWARD_FUNCTION> -mo <MAX_OFFSET>
```
- `-env`, `--environment` - Name of the environment to run
  - `PigeonEnv3Joints`
  - `PigeonRetinalEnv`
- `-dir`, `--snapshot_directory` - Path to the snapshot directory within `src/rlkit_ppo/data/`
- `-bs`, `--body_speed` - Body speed of the pigeon model
- `-rc`, `--reward_code` - Reward function associated with the chosen environment
  - `PigeonEnv3Joints`
    - `head_stable_manual_reposition`
    - `head_stable_manual_reposition_strict_angle`
  - `PigeonRetinalEnv`
    - `motion_parallax`
    - `retinal_stabilization`
    - `fifty_fifty`
- `-mo`, `--max_offset` - Max offset for aligning the head to the target
  - Not necessary for `PigeonRetinalEnv`; set to `0.0` by default
- `-v`, `--video` - Export the result to video
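For example, to replay and record a trained policy (`<RUN_NAME>` stands for whichever snapshot directory your training run produced; the numeric values are placeholders):
```
python run/pigeon_run.py -env PigeonEnv3Joints -dir src/rlkit_ppo/data/<RUN_NAME> -bs 1.0 -rc head_stable_manual_reposition -mo 1.0 -v
```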
Run the following command to generate trajectory visualizations for a trained policy.
```
python run/pigeon_headtrack.py -env <ENVIRONMENT> -dir <DIRECTORY_PATH> -bs <BODY_SPEED>
```
- `-env`, `--environment` - Name of the environment to run
  - `PigeonEnv3Joints`
  - `PigeonRetinalEnv`
- `-dir`, `--snapshot_directory` - Path to the snapshot directory within `src/rlkit_ppo/data/`
- `-bs`, `--body_speed` - Body speed of the pigeon model

The resulting visualization is stored under `<SNAPSHOT_DIRECTORY>/body_trajectory/`.
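For example (again with a placeholder snapshot directory and body speed):
```
python run/pigeon_headtrack.py -env PigeonEnv3Joints -dir src/rlkit_ppo/data/<RUN_NAME> -bs 1.0
```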