MobilePoser: Real-Time Full-Body Pose Estimation and 3D Human Translation from IMUs in Mobile Consumer Devices
Author's implementation of the paper MobilePoser: Real-Time Full-Body Pose Estimation and 3D Human Translation from IMUs in Mobile Consumer Devices. This work was published at UIST'24.
We recommend configuring the project inside an Anaconda environment. We have tested everything using Anaconda version 23.9.0 and Python 3.9. The first step is to create a virtual environment (named `mobileposer`), as shown below.
conda create -n mobileposer python=3.9
You should then activate the environment as shown below. All following operations must be completed within the virtual environment.
conda activate mobileposer
Then, install the required packages.
pip install -r requirements.txt
You will then need to install the local `mobileposer` package for development via the command below. You must run this from the root directory (i.e., where `setup.py` is located).
pip install -e .
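To verify the installation, a quick import check should succeed (assuming the editable install above completed without errors):
$ python -c "import mobileposer"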
- Register and download the AMASS dataset from here. We use 'SMPL+H G' for each dataset.
- Register and download the DIP-IMU dataset from here. Download the raw (unnormalized) data.
- Request access to the TotalCapture dataset here.
- Download the IMUPoser dataset from here.
Once downloaded, your directory might appear as follows:
```
data
└── raw
    ├── AMASS
    │   ├── ACCAD
    │   ├── BioMotionLab_NTroje
    │   ├── BMLhandball
    │   ├── ...
    │   └── Transitions_mocap
    ├── DIP_IMU
    │   ├── s_01
    │   ├── s_02
    │   ├── s_03
    │   ├── ...
    │   └── s_10
    └── IMUPoser
        ├── P1
        ├── P2
        ├── P3
        ├── ...
        └── P10
```
In `config.py`:
- Set `paths.processed_datasets` to the directory containing the pre-processed datasets.
- Set `paths.raw_amass` to the directory containing the AMASS dataset.
- Set `paths.raw_dip` to the directory containing the DIP dataset.
- Set `paths.raw_imuposer` to the directory containing the IMUPoser dataset.
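For reference, the relevant entries in `config.py` might look like the sketch below. The attribute names come from the list above; the class layout and concrete paths are illustrative assumptions.

```python
from pathlib import Path

class paths:
    # Output directory for pre-processed datasets (illustrative path).
    processed_datasets = Path("data/processed")
    # Raw dataset locations, matching the directory layout shown above.
    raw_amass = Path("data/raw/AMASS")
    raw_dip = Path("data/raw/DIP_IMU")
    raw_imuposer = Path("data/raw/IMUPoser")
```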
The script `process.py` drives the dataset pre-processing. This script takes the following parameter:
- `--dataset`: Dataset to pre-process (`amass`, `dip`, `imuposer`). Defaults to `amass`.
As an example, the following command will pre-process the DIP dataset.
$ python process.py --dataset dip
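To pre-process all three datasets in one pass, a simple shell loop over the documented dataset names works:

```sh
for d in amass dip imuposer; do
  python process.py --dataset "$d"
done
```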
The script `train.py` drives the training process. This script takes the following parameters:
- `--module`: Train an individual module (`poser`, `joints`, `foot_contact`, `velocity`). Defaults to training all modules.
- `--init-from`: Initialize training from an existing checkpoint. Defaults to training from scratch.
- `--finetune`: Specify the dataset for finetuning a module (e.g., `dip`).
- `--fast-dev-run`: A boolean flag that caps execution to a single epoch. This flag is useful for debugging.
As an example, we can execute the following command to train all modules:
$ python train.py
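To quickly sanity-check a single module before a full run, the documented flags can be combined, for example:
$ python train.py --module poser --fast-dev-run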
To facilitate finetuning MobilePoser, we provide the script `finetune.sh`. To run this script, use the following syntax:
$ ./finetune.sh <dataset-name> <checkpoint-directory>
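For example, to finetune on DIP (the checkpoint directory below is a hypothetical path; substitute your own):
$ ./finetune.sh dip checkpoints/base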
The script `combine_weights.py` combines the weights of the individual modules into a single weight file that can be loaded into `MobilePoserNet`. To run this script, use the following syntax:
$ python combine_weights.py --finetune <dataset-name> --checkpoint <checkpoint-directory>
Omit the `--finetune` argument if you did not finetune. The resulting weight file will be stored under the same directory as `<checkpoint-directory>`.
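For example, if you trained from scratch (no finetuning), the invocation reduces to the following (the checkpoint directory is illustrative):
$ python combine_weights.py --checkpoint checkpoints/base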
We provide a pre-trained model for the set of configurations listed in `config.py`.
- Download the weights from here.
- In `config.py`, set `paths.weights_file` to the model path.
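To inspect the downloaded weight file outside the provided scripts, a plain PyTorch load works. This is a minimal sketch; the path is just the example used elsewhere in this README.

```python
import torch

# Load the combined weight file on CPU for inspection (path is illustrative).
state = torch.load("checkpoints/weights.pth", map_location="cpu")
print(type(state))  # typically a dict-like mapping of parameter tensors
```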
The script `evaluate.py` drives model evaluation. This script takes the following arguments:
- `--model`: Path to the trained model.
- `--dataset`: Dataset to run evaluation on (e.g., `dip`, `imuposer`, `totalcapture`).
As an example, we can execute the following concrete command:
$ python evaluate.py --model checkpoints/weights.pth --dataset dip
To visualize the prediction results of the trained model, we provide the script `example.py`. This script takes the following arguments:
- `--model`: Path to the trained model.
- `--dataset`: Dataset to run prediction on for visualization. Defaults to `dip`.
- `--seq-num`: Sequence number of the dataset to run prediction on. Defaults to 1.
- `--with-tran`: A boolean flag that enables visualizing the translation estimation. Defaults to False.
- `--combo`: Device-location combination. Defaults to `lw_rp` (left wrist, right pocket).
Additionally, you can set the `GT` environment variable to customize the visualization mode:
- `GT=1`: Visualizes both predictions and ground truth.
- `GT=2`: Visualizes only the ground-truth data.
As an example, we can execute the following concrete command:
$ GT=1 python example.py --model checkpoints/weights.pth --dataset dip --seq-num 5 --with-tran
Note: we recommend using your local machine to visualize the results.
@inproceedings{xu2024mobileposer,
title={MobilePoser: Real-Time Full-Body Pose Estimation and 3D Human Translation from IMUs in Mobile Consumer Devices},
author={Xu, Vasco and Gao, Chenfeng and Hoffmann, Henry and Ahuja, Karan},
booktitle={Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology},
pages={1--11},
year={2024}
}
For questions, please contact nu.spicelab@gmail.com.
We would like to thank the following projects for great prior work that inspired us: TransPose, PIP, IMUPoser.
This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. For commercial use, a separate commercial license is required. Please contact kahuja@northwestern.edu at Northwestern University for licensing inquiries.