update docs
Tau-J committed Jul 28, 2023
1 parent 3c031ad commit a89100f
Showing 3 changed files with 449 additions and 1 deletion.
224 changes: 224 additions & 0 deletions docs/en/advanced_guides/codecs.md
@@ -10,6 +10,8 @@ Here is a diagram to show where the `Codec` is:

![pose_estimator_en](https://github.com/open-mmlab/mmpose/assets/13503330/0764baab-41c7-4a1d-ab64-5d7f9dfc8eec)

## Basic Concepts

A typical codec consists of two parts:

- Encoder
@@ -225,3 +227,225 @@ test_pipeline = [
    dict(type='PackPoseInputs')
]
```

## Supported Codecs

Supported codecs are in [$MMPOSE/mmpose/codecs/](https://github.com/open-mmlab/mmpose/tree/dev-1.x/mmpose/codecs). Here is a list:

- [RegressionLabel](#RegressionLabel)
- [IntegralRegressionLabel](#IntegralRegressionLabel)
- [MSRAHeatmap](#MSRAHeatmap)
- [UDPHeatmap](#UDPHeatmap)
- [MegviiHeatmap](#MegviiHeatmap)
- [SPR](#SPR)
- [SimCC](#SimCC)
- [DecoupledHeatmap](#DecoupledHeatmap)
- [ImagePoseLifting](#ImagePoseLifting)
- [VideoPoseLifting](#VideoPoseLifting)
- [MotionBERTLabel](#MotionBERTLabel)

### RegressionLabel

[\[GitHub\]](https://github.com/open-mmlab/mmpose/blob/dev-1.x/mmpose/codecs/regression_label.py#L12)

The `RegressionLabel` codec is used to generate normalized coordinates as the regression targets.

**Input**

- Encoding keypoints from input image space to normalized space.

**Output**

- Decoding normalized coordinates from normalized space to input image space.
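
As a rough illustration of the encode/decode round trip described above, the following NumPy sketch normalizes keypoints by the input size and maps them back. It is not the MMPose implementation: the function names and the example `input_size` are assumptions made for illustration only.

```python
import numpy as np


def encode_regression_label(keypoints, input_size):
    """Map (N, K, 2) pixel coordinates to normalized coordinates.

    `input_size` is (width, height). Hypothetical helper, not MMPose API.
    """
    w, h = input_size
    return keypoints / np.array([w, h], dtype=np.float32)


def decode_regression_label(encoded, input_size):
    """Map normalized coordinates back to input image pixels."""
    w, h = input_size
    return encoded * np.array([w, h], dtype=np.float32)


input_size = (192, 256)  # assumed example value
keypoints = np.array([[[96.0, 128.0], [48.0, 64.0]]], dtype=np.float32)

labels = encode_regression_label(keypoints, input_size)
restored = decode_regression_label(labels, input_size)
```

Here `labels` holds values in the `[0, 1]` range for keypoints inside the image, and `restored` matches the original `keypoints`.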

Related works:

- [DeepPose](https://mmpose.readthedocs.io/zh_CN/dev-1.x/model_zoo_papers/algorithms.html#deeppose-cvpr-2014)
- [RLE](https://mmpose.readthedocs.io/zh_CN/dev-1.x/model_zoo_papers/algorithms.html#rle-iccv-2021)

### IntegralRegressionLabel

[\[GitHub\]](https://github.com/open-mmlab/mmpose/blob/dev-1.x/mmpose/codecs/integral_regression_label.py)

The `IntegralRegressionLabel` codec is used to generate normalized coordinates as the regression targets, together with Gaussian heatmaps.

**Input**

- Encoding keypoints from input image space to normalized space, and generating Gaussian heatmaps as well.

**Output**

- Decoding normalized coordinates from normalized space to input image space.
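
Integral regression methods such as IPR obtain coordinates from a predicted heatmap by taking the expectation over pixel positions (soft-argmax). The sketch below illustrates that idea only. In practice this step may live in the model head rather than in the codec, and the function name `soft_argmax_2d` is hypothetical.

```python
import numpy as np


def soft_argmax_2d(logits):
    """Return the expected (x, y) position under softmax(logits).

    `logits` has shape (H, W). Hypothetical helper, not MMPose API.
    """
    h, w = logits.shape
    prob = np.exp(logits - logits.max())  # subtract max for stability
    prob /= prob.sum()
    x = (prob.sum(axis=0) * np.arange(w)).sum()  # marginal over rows
    y = (prob.sum(axis=1) * np.arange(h)).sum()  # marginal over columns
    return np.array([x, y])


# A synthetic heatmap peaked at x=24, y=32 on a 49x65 (W x H) grid.
ys, xs = np.mgrid[0:65, 0:49]
logits = -((xs - 24) ** 2 + (ys - 32) ** 2) / (2 * 2.0 ** 2)

coords = soft_argmax_2d(logits)
normalized = coords / np.array([49, 65])  # divide by (W, H)
```

Unlike a hard argmax, the expectation is differentiable, which is why the coordinates can be supervised directly.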

Related works:

- [IPR](https://mmpose.readthedocs.io/zh_CN/dev-1.x/model_zoo_papers/algorithms.html#ipr-eccv-2018)
- [DSNT](https://mmpose.readthedocs.io/zh_CN/dev-1.x/model_zoo_papers/algorithms.html#dsnt-2018)
- [Debias IPR](https://mmpose.readthedocs.io/zh_CN/dev-1.x/model_zoo_papers/algorithms.html#debias-ipr-iccv-2021)

### MSRAHeatmap

[\[GitHub\]](https://github.com/open-mmlab/mmpose/blob/dev-1.x/mmpose/codecs/msra_heatmap.py)

The `MSRAHeatmap` codec is used to generate Gaussian heatmaps as the targets.

**Input**

- Encoding keypoints from input image space to output space as 2D Gaussian heatmaps.

**Output**

- Decoding 2D Gaussian heatmaps from output space to input image space as coordinates.
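
The encode/decode pair described above can be sketched with NumPy as follows. This is a simplified illustration, not the MMPose implementation: it assumes the scale between input and heatmap space is `input_size / heatmap_size`, decodes with a plain argmax (no sub-pixel refinement), and uses hypothetical function names.

```python
import numpy as np


def encode_gaussian_heatmap(keypoints, input_size, heatmap_size, sigma=2.0):
    """Turn (K, 2) input-space keypoints into (K, H, W) Gaussian heatmaps."""
    in_w, in_h = input_size
    hm_w, hm_h = heatmap_size
    scale = np.array([in_w / hm_w, in_h / hm_h], dtype=np.float32)
    centers = keypoints / scale  # keypoints in heatmap pixels
    xs = np.arange(hm_w, dtype=np.float32)[None, None, :]
    ys = np.arange(hm_h, dtype=np.float32)[None, :, None]
    cx = centers[:, 0][:, None, None]
    cy = centers[:, 1][:, None, None]
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))


def decode_gaussian_heatmap(heatmaps, input_size):
    """Take the argmax of each heatmap and map it back to input space."""
    k, hm_h, hm_w = heatmaps.shape
    flat_idx = heatmaps.reshape(k, -1).argmax(axis=1)
    ys, xs = np.unravel_index(flat_idx, (hm_h, hm_w))
    scale = np.array([input_size[0] / hm_w, input_size[1] / hm_h])
    return np.stack([xs, ys], axis=1) * scale


input_size = (192, 256)   # (width, height), assumed example values
heatmap_size = (48, 64)
keypoints = np.array([[96.0, 128.0], [40.0, 200.0]], dtype=np.float32)

heatmaps = encode_gaussian_heatmap(keypoints, input_size, heatmap_size)
decoded = decode_gaussian_heatmap(heatmaps, input_size)
```

Because the example keypoints fall exactly on heatmap pixel centers, `decoded` equals `keypoints`. For arbitrary keypoints, a plain argmax loses sub-pixel precision.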

Related works:

- [SimpleBaseline2D](https://mmpose.readthedocs.io/zh_CN/dev-1.x/model_zoo_papers/algorithms.html#simplebaseline2d-eccv-2018)
- [CPM](https://mmpose.readthedocs.io/zh_CN/dev-1.x/model_zoo_papers/algorithms.html#cpm-cvpr-2016)
- [HRNet](https://mmpose.readthedocs.io/zh_CN/dev-1.x/model_zoo_papers/algorithms.html#hrnet-cvpr-2019)
- [DARK](https://mmpose.readthedocs.io/zh_CN/dev-1.x/model_zoo_papers/algorithms.html#darkpose-cvpr-2020)

### UDPHeatmap

[\[GitHub\]](https://github.com/open-mmlab/mmpose/blob/dev-1.x/mmpose/codecs/udp_heatmap.py)

The `UDPHeatmap` codec is used to generate Gaussian heatmaps as the targets.

**Input**

- Encoding keypoints from input image space to output space as 2D Gaussian heatmaps.

**Output**

- Decoding 2D Gaussian heatmaps from output space to input image space as coordinates.
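
One way to see how UDP-style processing differs from the standard heatmap mapping is in the scale factor between input space and heatmap space. The sketch below assumes the unbiased factor is `(input_size - 1) / (heatmap_size - 1)`, following the unit-length convention of the UDP paper. The variable names and sizes are illustrative, not taken from the MMPose source.

```python
import numpy as np

input_size = np.array([192, 256], dtype=np.float64)   # (width, height)
heatmap_size = np.array([48, 64], dtype=np.float64)

# Pixel-count based scale, as in a standard heatmap codec.
standard_scale = input_size / heatmap_size
# Unit-length based scale (assumed UDP convention).
unbiased_scale = (input_size - 1) / (heatmap_size - 1)

# Map the last pixel of the input image into heatmap space.
corner = input_size - 1
corner_standard = corner / standard_scale
corner_unbiased = corner / unbiased_scale
```

With the unbiased factor, the last input pixel lands exactly on the last heatmap index `(47, 63)`, whereas the pixel-count factor maps it to `(47.75, 63.75)`, which lies past the last index.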

Related works:

- [UDP](https://mmpose.readthedocs.io/zh_CN/dev-1.x/model_zoo_papers/algorithms.html#udp-cvpr-2020)

### MegviiHeatmap

[\[GitHub\]](https://github.com/open-mmlab/mmpose/blob/dev-1.x/mmpose/codecs/megvii_heatmap.py)

The `MegviiHeatmap` codec is used to generate Gaussian heatmaps as the targets. It is commonly used in Megvii's works.

**Input**

- Encoding keypoints from input image space to output space as 2D Gaussian heatmaps.

**Output**

- Decoding 2D Gaussian heatmaps from output space to input image space as coordinates.

Related works:

- [MSPN](https://mmpose.readthedocs.io/zh_CN/dev-1.x/model_zoo_papers/algorithms.html#mspn-arxiv-2019)
- [RSN](https://mmpose.readthedocs.io/zh_CN/dev-1.x/model_zoo_papers/algorithms.html#rsn-eccv-2020)

### SPR

[\[GitHub\]](https://github.com/open-mmlab/mmpose/blob/dev-1.x/mmpose/codecs/spr.py)

The `SPR` codec is used to generate Gaussian heatmaps of instance centers, together with offsets, as the targets.

**Input**

- Encoding keypoints from input image space to output space as 2D Gaussian heatmaps and offsets.

**Output**

- Decoding 2D Gaussian heatmaps and offsets from output space to input image space as coordinates.

Related works:

- [DEKR](https://mmpose.readthedocs.io/zh_CN/dev-1.x/model_zoo_papers/algorithms.html#dekr-cvpr-2021)

### SimCC

[\[GitHub\]](https://github.com/open-mmlab/mmpose/blob/dev-1.x/mmpose/codecs/simcc_label.py)

The `SimCC` codec is used to generate 1D Gaussian representations as the targets.

**Input**

- Encoding keypoints from input image space to output space as 1D Gaussian representations.

**Output**

- Decoding 1D Gaussian representations from output space to input image space as coordinates.
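
The 1D representation described above can be sketched as two independent Gaussian vectors per keypoint, one along x and one along y. This is a minimal NumPy illustration under stated assumptions: the `split_ratio` and `sigma` parameters and the function names are hypothetical, and the real codec may differ in smoothing and normalization.

```python
import numpy as np


def encode_simcc(keypoints, input_size, split_ratio=2.0, sigma=6.0):
    """Turn (K, 2) keypoints into per-axis 1D Gaussian targets.

    Each axis is divided into `size * split_ratio` bins.
    """
    w, h = input_size
    n_x, n_y = int(w * split_ratio), int(h * split_ratio)
    mu = keypoints * split_ratio  # keypoint positions in bin units
    bins_x = np.arange(n_x, dtype=np.float32)[None, :]
    bins_y = np.arange(n_y, dtype=np.float32)[None, :]
    target_x = np.exp(-((bins_x - mu[:, 0:1]) ** 2) / (2 * sigma ** 2))
    target_y = np.exp(-((bins_y - mu[:, 1:2]) ** 2) / (2 * sigma ** 2))
    return target_x, target_y


def decode_simcc(simcc_x, simcc_y, split_ratio=2.0):
    """Pick the most likely bin on each axis and map it back to pixels."""
    x = simcc_x.argmax(axis=1)
    y = simcc_y.argmax(axis=1)
    return np.stack([x, y], axis=1) / split_ratio


input_size = (192, 256)  # (width, height), assumed example value
keypoints = np.array([[96.0, 128.0], [40.5, 200.0]], dtype=np.float32)

simcc_x, simcc_y = encode_simcc(keypoints, input_size)
decoded = decode_simcc(simcc_x, simcc_y)
```

With a split ratio of 2, each bin covers half a pixel, so the keypoint at `x = 40.5` is recovered exactly.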

Related works:

- [SimCC](https://mmpose.readthedocs.io/zh_CN/dev-1.x/model_zoo_papers/algorithms.html#simcc-eccv-2022)
- [RTMPose](https://mmpose.readthedocs.io/zh_CN/dev-1.x/model_zoo_papers/algorithms.html#rtmpose-arxiv-2023)

### DecoupledHeatmap

[\[GitHub\]](https://github.com/open-mmlab/mmpose/blob/dev-1.x/mmpose/codecs/decoupled_heatmap.py)

The `DecoupledHeatmap` codec is used to generate Gaussian heatmaps as the targets.

**Input**

- Encoding human center points and keypoints from input image space to output space as 2D Gaussian heatmaps.

**Output**

- Decoding 2D Gaussian heatmaps from output space to input image space as coordinates.

Related works:

- [CID](https://mmpose.readthedocs.io/zh_CN/dev-1.x/model_zoo_papers/algorithms.html#cid-cvpr-2022)

### ImagePoseLifting

[\[GitHub\]](https://github.com/open-mmlab/mmpose/blob/dev-1.x/mmpose/codecs/image_pose_lifting.py)

The `ImagePoseLifting` codec is used for image 2D-to-3D pose lifting.

**Input**

- Encoding 2D keypoints from input image space to normalized 3D space.

**Output**

- Decoding 3D keypoints from normalized space to input image space.
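
A common way to build a normalized lifting target is to express the 3D keypoints relative to a root joint and standardize them with dataset statistics. The sketch below assumes that scheme. The function names, `root_index`, `mean`, and `std` are hypothetical and are not taken from the MMPose source.

```python
import numpy as np


def encode_lifting_target(keypoints_3d, root_index=0, mean=0.0, std=1.0):
    """Make (K, 3) keypoints root-relative, then standardize them."""
    root = keypoints_3d[root_index].copy()
    target = (keypoints_3d - root - mean) / std
    return target, root


def decode_lifting_target(target, root, mean=0.0, std=1.0):
    """Undo the standardization and add the root position back."""
    return target * std + mean + root


keypoints_3d = np.array(
    [[0.1, 0.2, 3.0], [0.3, 0.1, 3.2], [-0.1, 0.5, 2.9]]
)  # made-up example values

target, root = encode_lifting_target(keypoints_3d, std=0.5)
restored = decode_lifting_target(target, root, std=0.5)
```

The root joint maps to the origin in the encoded target, and decoding with the same `root`, `mean`, and `std` restores the original coordinates.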

Related works:

- [SimpleBaseline3D](https://mmpose.readthedocs.io/zh_CN/dev-1.x/model_zoo_papers/algorithms.html#simplebaseline3d-iccv-2017)

### VideoPoseLifting

[\[GitHub\]](https://github.com/open-mmlab/mmpose/blob/dev-1.x/mmpose/codecs/video_pose_lifting.py)

The `VideoPoseLifting` codec is used for video 2D-to-3D pose lifting.

**Input**

- Encoding 2D keypoints from input image space to normalized 3D space.

**Output**

- Decoding 3D keypoints from normalized space to input image space.

Related works:

- [VideoPose3D](https://mmpose.readthedocs.io/zh_CN/dev-1.x/model_zoo_papers/algorithms.html#videopose3d-cvpr-2019)

### MotionBERTLabel

[\[GitHub\]](https://github.com/open-mmlab/mmpose/blob/dev-1.x/mmpose/codecs/motionbert_label.py)

The `MotionBERTLabel` codec is used for video 2D-to-3D pose lifting.

**Input**

- Encoding 2D keypoints from input image space to normalized 3D space.

**Output**

- Decoding 3D keypoints from normalized space to input image space.

Related works:

- [MotionBERT](https://mmpose.readthedocs.io/zh_CN/dev-1.x/model_zoo/body_3d_keypoint.html#pose-lift-motionbert-on-h36m)