A quick attempt to use Stable Diffusion (SD) generative models (v1.5 and XL) together with MediaPipe's Pose Landmarker vision models to generate 3D human poses from a text prompt.
Generating a pose with a prompt using SD v1.5
- torch
- diffusers
- huggingface-hub
- mediapipe
- pyside6 (>=6.7)
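
Before launching, it can help to confirm that the packages listed above are importable. The following is a minimal sketch, not part of the project: the helper name `missing_packages` and the `REQUIRED_MODULES` mapping are hypothetical, and the check only tests whether each module can be found. It does not verify the `>=6.7` version constraint on pyside6.

```python
from importlib.util import find_spec

# Hypothetical mapping: pip package name -> Python import name.
# Two of the import names differ from the pip names.
REQUIRED_MODULES = {
    "torch": "torch",
    "diffusers": "diffusers",
    "huggingface-hub": "huggingface_hub",
    "mediapipe": "mediapipe",
    "pyside6": "PySide6",
}


def missing_packages(modules=REQUIRED_MODULES):
    """Return the pip names of packages whose module cannot be found."""
    return [
        pip_name
        for pip_name, module_name in modules.items()
        if find_spec(module_name) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    print("Missing packages:", ", ".join(missing) if missing else "none")
```

Any names it reports can then be installed with your usual package manager.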
- Stable Diffusion v1.5 has fewer model parameters and therefore requires less memory, whereas Stable Diffusion XL generates better-quality images with more realistic human poses.
- Hyper-SD (an SD inference acceleration algorithm) steps: 2, 4, or 8. This is a trade-off between speed (fewer, bigger steps) and quality (more, smaller steps).
- PyTorch device for running SD inference (if available): CPU, CUDA, or MPS (Apple Silicon).
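
The step and device options above could be validated along these lines. This is a sketch under stated assumptions, not the project's actual code: the names `validate_steps`, `pick_device`, and `detect_device` are hypothetical, and falling back to the CPU when the requested device is unavailable is one possible reading of "if available". Only `torch.cuda.is_available()` and `torch.backends.mps.is_available()` are real PyTorch calls.

```python
HYPER_SD_STEPS = (2, 4, 8)        # step counts listed in the options above
DEVICES = ("cpu", "cuda", "mps")  # PyTorch device strings


def validate_steps(steps):
    """Accept only the Hyper-SD step counts named in the options."""
    if steps not in HYPER_SD_STEPS:
        raise ValueError(f"steps must be one of {HYPER_SD_STEPS}, got {steps}")
    return steps


def pick_device(requested, cuda_available, mps_available):
    """Return the requested device, or "cpu" if it is not available.

    The CPU fallback is an assumption made for this sketch.
    """
    if requested not in DEVICES:
        raise ValueError(f"device must be one of {DEVICES}, got {requested!r}")
    if requested == "cuda" and not cuda_available:
        return "cpu"
    if requested == "mps" and not mps_available:
        return "cpu"
    return requested


def detect_device(requested):
    """Resolve the device against what PyTorch reports on this machine."""
    import torch  # imported lazily so the helpers above work without torch

    return pick_device(
        requested,
        cuda_available=torch.cuda.is_available(),
        mps_available=torch.backends.mps.is_available(),
    )
```

For example, `detect_device("mps")` resolves to `"cpu"` on a machine without Apple Silicon support, and `validate_steps(3)` raises a `ValueError`.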