Commit

Add tasks descriptions from Yu et al.
- update benchmark descriptions
frankroeder committed Sep 6, 2024
1 parent 9c8a992 commit ed3ea35
Showing 3 changed files with 193 additions and 19 deletions.
55 changes: 36 additions & 19 deletions docs/benchmark/benchmark_descriptions.md
@@ -17,7 +17,7 @@ Below, different levels of difficulty are described.

### Multi-Task (MT1)

In the easiest setting, **MT1**, a single task needs to be learned where the agent must *reach*, *push*, or *pick and place* a goal object.
In the easiest setting, **MT1**, a single task needs to be learned where the agent must, e.g., *reach*, *push*, or *pick and place* a goal object.
There is no testing of generalization involved in this setting.

```{figure} _static/mt1.gif
@@ -27,10 +27,11 @@ There is no testing of generalization involved in this setting.

### Multi-Task (MT10)

The **MT10** setting involves learning to solve a diverse set of 10 tasks, as depicted below.
There is no testing of generalization involved in this setting.


The **MT10** evaluation uses 10 tasks: *reach*, *push*, *pick and place*,
*open door*, *open drawer*, *close drawer*, *press button top-down*,
*insert peg side*, *open window*, and *open box*. The policy is provided with a
one-hot vector indicating the current task. The positions of objects and goal
positions are fixed in all tasks to focus solely on skill acquisition.

```{figure} _static/mt10.gif
:alt: Multi-Task 10
@@ -39,32 +40,44 @@ There is no testing of generalization involved in this setting.

### Multi-Task (MT50)

In the **MT50** setting, the agent is challenged to solve the full suite of 50 tasks contained in metaworld.
This is the most challenging multi-task setting and involves no evaluation on test tasks.
The **MT50** evaluation uses all 50 Meta-World tasks. This is the most
challenging multi-task setting and involves no evaluation on test tasks.
As with **MT10**, the policy is provided with a one-hot vector indicating
the current task, and object and goal positions are fixed.

See [Task Descriptions](#benchmark/task_descriptions) for more details.
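
The one-hot task indicator described for **MT10** and **MT50** can be sketched in plain Python. This is a minimal illustration, not Meta-World's own implementation: `one_hot_task_id` and `augment_observation` are hypothetical helper names, and the 3-element observation is a placeholder rather than the real state vector.

```python
from typing import List


def one_hot_task_id(task_index: int, num_tasks: int) -> List[float]:
    """Return a vector of length num_tasks with 1.0 at task_index, 0.0 elsewhere."""
    if not 0 <= task_index < num_tasks:
        raise ValueError(f"task_index {task_index} out of range for {num_tasks} tasks")
    vec = [0.0] * num_tasks
    vec[task_index] = 1.0
    return vec


def augment_observation(obs: List[float], task_index: int, num_tasks: int) -> List[float]:
    """Append the one-hot task indicator to a flat observation (hypothetical helper)."""
    return list(obs) + one_hot_task_id(task_index, num_tasks)


# Placeholder 3-dimensional observation for task 2 of a 10-task suite.
augmented = augment_observation([0.1, 0.2, 0.3], task_index=2, num_tasks=10)
print(len(augmented))  # 13
```

With 50 tasks the indicator grows to 50 entries, so the same placeholder observation would have length 53.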

## Meta-Learning Problems

Meta-RL attempts to evaluate the [transfer learning](https://en.
wikipedia.org/wiki/Transfer_learning) capabilities of agents learning skills based on a predefined set of training tasks, by evaluating generalization using a hold-out set of test tasks.
In other words, this setting allows for benchmarking an algorithm's ability to adapt to or learn new tasks.
Meta-RL attempts to evaluate the [transfer learning](https://en.wikipedia.org/wiki/Transfer_learning)
capabilities of agents learning skills based on a predefined set of training
tasks, by evaluating generalization using a hold-out set of test tasks.
In other words, this setting allows for benchmarking an algorithm's
ability to adapt to or learn new tasks.

### Meta-RL (ML1)

The simplest meta-RL setting, **ML1**, involves a single manipulation task, such as *pick and place* of an object with a changing goal location.
For the test evaluation, unseen goal locations are used to measure generalization capabilities.


The simplest meta-RL setting, **ML1**, involves few-shot adaptation to goal
variation within one task. ML1 uses a single Meta-World task, with the
meta-training "tasks" corresponding to 50 random initial object and goal
positions, and meta-testing on 10 held-out positions. We evaluate algorithms
on three individual tasks from Meta-World: *reaching*, *pushing*, and *pick and
place*, where the variation is over reaching position or goal object position.
The goal positions are not provided in the observation, forcing meta-RL
algorithms to adapt to the goal through trial-and-error.
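
The 50/10 split over goal positions described above can be illustrated with a short sketch. It assumes goals are sampled uniformly from an axis-aligned box; the bounds `LOW` and `HIGH` and the function `sample_goal_splits` are hypothetical and do not reflect the actual Meta-World sampling code.

```python
import random
from typing import List, Tuple

Position = Tuple[float, ...]

# Placeholder workspace bounds (x, y, z); real bounds depend on the task.
LOW = (-0.1, 0.8, 0.05)
HIGH = (0.1, 0.9, 0.3)


def sample_goal_splits(
    num_train: int = 50, num_test: int = 10, seed: int = 0
) -> Tuple[List[Position], List[Position]]:
    """Sample goal positions and split them into meta-train and held-out sets."""
    rng = random.Random(seed)
    positions = [
        tuple(rng.uniform(lo, hi) for lo, hi in zip(LOW, HIGH))
        for _ in range(num_train + num_test)
    ]
    # The first num_train positions are used for meta-training, the rest for meta-testing.
    return positions[:num_train], positions[num_train:]


train_goals, test_goals = sample_goal_splits()
print(len(train_goals), len(test_goals))  # 50 10
```

In this sketch the sampled goal would be used only to configure the environment and compute rewards; it would not be appended to the observation, which is what forces adaptation by trial and error.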

```{figure} _static/ml1.gif
:alt: Meta-RL 1
:width: 500
```


### Meta-RL (ML10)

The meta-learning setting with 10 tasks, **ML10**, involves training on 10 manipulation tasks and evaluating on 5 unseen tasks during the test phase.
The **ML10** evaluation involves few-shot adaptation to new test tasks with 10
meta-training tasks. We hold out 5 tasks and meta-train policies on 10 tasks.
We randomize object and goal positions and intentionally select training tasks
with structural similarity to the test tasks. Task IDs are not provided as
input, requiring a meta-RL algorithm to identify the tasks from experience.
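
The task-level hold-out can be sketched as a simple split over task names. The names below are placeholders (the actual ML10 train/test assignment is defined by the benchmark, not by this snippet), and `split_train_test` is a hypothetical helper.

```python
from typing import List, Sequence, Tuple


def split_train_test(
    tasks: Sequence[str], held_out: Sequence[str]
) -> Tuple[List[str], List[str]]:
    """Split task names into disjoint meta-train and meta-test lists."""
    unknown = set(held_out) - set(tasks)
    if unknown:
        raise ValueError(f"held-out tasks not in task list: {sorted(unknown)}")
    held = set(held_out)
    train = [t for t in tasks if t not in held]
    test = [t for t in tasks if t in held]
    return train, test


# Placeholder names standing in for 15 manipulation tasks.
all_tasks = [f"task_{i:02d}" for i in range(15)]
train_tasks, test_tasks = split_train_test(all_tasks, all_tasks[10:])
print(len(train_tasks), len(test_tasks))  # 10 5
```

Because task IDs are not part of the input, these names would serve only for bookkeeping on the evaluation side and would never be passed to the policy.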

```{figure} _static/ml10.gif
:alt: Meta-RL 10
@@ -73,10 +86,14 @@ The meta-learning setting with 10 tasks, **ML10**, involves training on 10 manip

### Meta-RL (ML45)

The most difficult environment setting of metaworld, **ML45**, challenges the agent to be trained on 45 distinct manipulation tasks and evaluated on 5 test tasks.

The most difficult environment setting of Meta-World, **ML45**, challenges the
agent with few-shot adaptation to new test tasks using 45 meta-training tasks.
Similar to ML10, we hold out 5 tasks for testing and meta-train policies on 45
tasks. Object and goal positions are randomized, and training tasks are
selected for structural similarity to test tasks. As with ML10, task IDs are
not provided, requiring the meta-RL algorithm to identify tasks from experience.

```{figure} _static/ml45.gif
:alt: Meta-RL 10
:alt: Meta-RL 45
:width: 500
```
156 changes: 156 additions & 0 deletions docs/benchmark/task_descriptions.md
@@ -0,0 +1,156 @@
---
layout: "contents"
title: Task Descriptions
firstpage:
---

# Task Descriptions
## Turn on faucet
Rotate the faucet counter-clockwise. Randomize faucet positions

## Sweep
Sweep a puck off the table. Randomize puck positions

## Assemble nut
Pick up a nut and place it onto a peg. Randomize nut and peg positions

## Turn off faucet
Rotate the faucet clockwise. Randomize faucet positions

## Push
Push the puck to a goal. Randomize puck and goal positions

## Pull lever
Pull a lever down 90 degrees. Randomize lever positions

## Turn dial
Rotate a dial 180 degrees. Randomize dial positions

## Push with stick
Grasp a stick and push a box using the stick. Randomize stick positions.

## Get coffee
Push a button on the coffee machine. Randomize the position of the coffee machine

## Pull handle side
Pull a handle up sideways. Randomize the handle positions

## Basketball
Dunk the basketball into the basket. Randomize basketball and basket positions

## Pull with stick
Grasp a stick and pull a box with the stick. Randomize stick positions

## Sweep into hole
Sweep a puck into a hole. Randomize puck positions

## Disassemble nut
Pick a nut out of a peg. Randomize the nut positions

## Place onto shelf
Pick and place a puck onto a shelf. Randomize puck and shelf positions

## Push mug
Push a mug under a coffee machine. Randomize the mug and the machine positions

## Press handle side
Press a handle down sideways. Randomize the handle positions

## Hammer
Hammer a screw on the wall. Randomize the hammer and the screw positions

## Slide plate
Slide a plate into a cabinet. Randomize the plate and cabinet positions

## Slide plate side
Slide a plate into a cabinet sideways. Randomize the plate and cabinet positions

## Press button wall
Bypass a wall and press a button. Randomize the button positions

## Press handle
Press a handle down. Randomize the handle positions

## Pull handle
Pull a handle up. Randomize the handle positions

## Soccer
Kick a soccer ball into the goal. Randomize the soccer ball and goal positions

## Retrieve plate side
Get a plate from the cabinet sideways. Randomize plate and cabinet positions

## Retrieve plate
Get a plate from the cabinet. Randomize plate and cabinet positions

## Close drawer
Push and close a drawer. Randomize the drawer positions

## Press button top
Press a button from the top. Randomize button positions

## Reach
Reach a goal position. Randomize the goal positions

## Press button top wall
Bypass a wall and press a button from the top. Randomize button positions

## Reach with wall
Bypass a wall and reach a goal. Randomize goal positions

## Insert peg side
Insert a peg sideways. Randomize peg and goal positions

## Pull
Pull a puck to a goal. Randomize puck and goal positions

## Push with wall
Bypass a wall and push a puck to a goal. Randomize puck and goal positions

## Pick out of hole
Pick up a puck from a hole. Randomize puck and goal positions

## Pick&place w/ wall
Pick a puck, bypass a wall and place the puck. Randomize puck and goal positions

## Press button
Press a button. Randomize button positions

## Pick&place
Pick and place a puck to a goal. Randomize puck and goal positions

## Pull mug
Pull a mug from a coffee machine. Randomize the mug and the machine positions

## Unplug peg
Unplug a peg sideways. Randomize peg positions

## Close window
Push and close a window. Randomize window positions

## Open window
Push and open a window. Randomize window positions

## Open door
Open a door with a revolving joint. Randomize door positions

## Close door
Close a door with a revolving joint. Randomize door positions

## Open drawer
Open a drawer. Randomize drawer positions

## Insert hand
Insert the gripper into a hole.

## Close box
Grasp the cover and close the box with it. Randomize the cover and box positions

## Lock door
Lock the door by rotating the lock clockwise. Randomize door positions

## Unlock door
Unlock the door by rotating the lock counter-clockwise. Randomize door positions

## Pick bin
Grasp the puck from one bin and place it into another bin. Randomize puck positions
1 change: 1 addition & 0 deletions docs/index.md
@@ -53,6 +53,7 @@ usage/basic_usage
benchmark/state_space
benchmark/action_space
benchmark/benchmark_descriptions
benchmark/task_descriptions.md
benchmark/env_tasks_vs_task_init
benchmark/reward_functions
```