TrainingPolicies.md

Solve the tasks

Interacting with the MyoChallenge environments

The challenge consists of two phases, each with its own MyoSuite environment for the Die and Baoding ball tasks:

Phase 1 - Die Environment: myoChallengeDieReorientP1-v0

Phase 1 - Baoding Ball Environment: myoChallengeBaodingP1-v1

Phase 2 - Die Environment: myoChallengeDieReorientP2-v0

Phase 2 - Baoding Ball Environment: myoChallengeBaodingP2-v1

To interact with the environments, install MyoSuite by following these instructions:

conda create --name myochallenge python=3.7.1
conda activate myochallenge
git clone https://github.com/facebookresearch/myosuite.git
cd myosuite
pip install -e .
export PYTHONPATH="./myosuite/:$PYTHONPATH"

You can interact with the environments through the gym API:

import myosuite  # importing myosuite registers the MyoChallenge environments with gym
import gym
env = gym.make('myoChallengeDieReorientP1-v0')
env.reset()
for _ in range(1000):
  env.sim.render(mode='window')  # update the on-screen viewer
  env.step(env.action_space.sample())  # take a random action
env.close()

Training policies (STEP 1)

It is possible to interface the MyoChallenge environments with any machine learning framework compatible with the gym API.
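As an illustration of what "compatible with the gym API" means in practice, here is a minimal sketch of a training loop that only relies on `reset()` and `step()`. It uses basic random search (hill climbing) over a linear policy, which is a stand-in chosen for brevity and not the method recommended by the challenge. The names `rollout`, `make_linear_policy` and `random_search` are hypothetical, and the sketch assumes the classic gym API where `step` returns `(obs, reward, done, info)` and where the environment accepts a plain list of floats as an action.

```python
import random


def rollout(env, policy, max_steps=200):
    """Run one episode with `policy` (a callable obs -> action); return the total reward."""
    obs = env.reset()
    total = 0.0
    for _ in range(max_steps):
        # Assumption: classic gym API, step returns (obs, reward, done, info).
        obs, reward, done, _ = env.step(policy(obs))
        total += float(reward)
        if done:
            break
    return total


def make_linear_policy(weights):
    """Build a policy computing action[i] = sum_j weights[i][j] * obs[j]."""
    def policy(obs):
        return [sum(w * o for w, o in zip(row, obs)) for row in weights]
    return policy


def random_search(env, obs_dim, act_dim, iterations=50, noise=0.1, seed=0):
    """Hill climbing: perturb the weights, keep a candidate only if its return improves."""
    rng = random.Random(seed)
    best = [[0.0] * obs_dim for _ in range(act_dim)]
    best_return = rollout(env, make_linear_policy(best))
    for _ in range(iterations):
        candidate = [[w + rng.gauss(0.0, noise) for w in row] for row in best]
        candidate_return = rollout(env, make_linear_policy(candidate))
        if candidate_return > best_return:
            best, best_return = candidate, candidate_return
    return best, best_return
```

With a MyoChallenge environment, `obs_dim` and `act_dim` would be read from `env.observation_space.shape[0]` and `env.action_space.shape[0]`. A dedicated reinforcement learning library will usually be a better choice than this loop; the point is only that the environment is driven entirely through the gym interface.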

Customize Agent Script

Once the policy is trained, it must be loaded in the agent script that will be submitted to the evaluation system (STEP 2).
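How a policy is stored depends on the framework used for training. The following is a rough sketch, assuming a linear policy whose parameters are plain Python lists and a hypothetical file path such as `policy.pkl`. The helpers `save_policy` and `load_policy` are illustrative names and are not part of the challenge's agent template.

```python
import pickle


def save_policy(weights, path):
    """Store the policy parameters (plain Python lists) on disk."""
    with open(path, "wb") as f:
        pickle.dump({"weights": weights}, f)


def load_policy(path):
    """Load the parameters and return a callable mapping an observation to an action."""
    with open(path, "rb") as f:
        weights = pickle.load(f)["weights"]

    def policy(obs):
        # Linear policy: action[i] = sum_j weights[i][j] * obs[j]
        return [sum(w * o for w, o in zip(row, obs)) for row in weights]

    return policy
```

Saving only the parameters, instead of pickling a whole framework object, keeps the file loadable inside the submission container as long as the code that rebuilds the policy is shipped with the agent.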

Submit for evaluation and ranking on the EvalAI dashboard

Follow Step 3 to build the Docker container with the agent, and Step 4 to upload it to the EvalAI evaluation system.

Evaluation Criteria

The Die and Baoding ball environments have different goals, initializations, and evaluation metrics in the two phases. A summary is given below.

[Figure: teaser results]

Evaluation criteria for the Die and Baoding ball environments in the two phases.

TIP: Customizing the Reward

One way to improve performance is reward shaping. The reward function can be customized by modifying the get_reward function of the specific environment, e.g. for the die environment or for the Baoding ball environment.
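If you prefer not to edit the environment source, an alternative is to shape the reward from the outside with a thin wrapper. The sketch below is one possible approach and not the mechanism described above: `ShapedRewardEnv` and `effort_penalty` are hypothetical names, and the code assumes the classic gym API where `step` returns `(obs, reward, done, info)`.

```python
class ShapedRewardEnv:
    """Wrap a gym-style env and add a shaping term to the reward of every step."""

    def __init__(self, env, shaping_fn, scale=1.0):
        self.env = env
        self.shaping_fn = shaping_fn  # callable (obs, action, info) -> float
        self.scale = scale

    def reset(self, **kwargs):
        return self.env.reset(**kwargs)

    def step(self, action):
        # Assumption: classic gym API, step returns (obs, reward, done, info).
        obs, reward, done, info = self.env.step(action)
        bonus = self.scale * self.shaping_fn(obs, action, info)
        return obs, reward + bonus, done, info

    def __getattr__(self, name):
        # Forward everything else (action_space, sim, close, ...) to the wrapped env.
        if name == "env":
            raise AttributeError(name)
        return getattr(self.env, name)


def effort_penalty(obs, action, info):
    """Hypothetical shaping term: penalize large actions (negative mean squared action)."""
    return -sum(a * a for a in action) / len(action)
```

For example, `ShapedRewardEnv(gym.make('myoChallengeDieReorientP1-v0'), effort_penalty, scale=0.1)` can be passed to a training loop in place of the original environment. The shaping term only changes the signal seen during training, so it is worth checking the shaped policy against the unmodified environment as well.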