The "Pixel Shuffling Control Using Hand Tracking" project combines computer vision with hand tracking to create an interactive image processing experience. Users control the degree of randomness in a pixel shuffling effect with the position of their hand. The application uses YOLOv8m-segmentation for background removal and Mediapipe for hand tracking, and applies the effects to a webcam feed in real time.
The image processing pipeline consists of several stages. First, YOLOv8-segmentation separates the foreground from the background, so that the pixel shuffling effect is applied only to foreground objects. Next, Mediapipe detects and tracks the user's hand in real time, and the position of the hand sets the degree of randomness of the pixel shuffling effect. In addition, the application pixelates the image, using code provided by ChatGPT that has been modified to achieve the desired effect. Lastly, the pixel shuffling code created by ChatGPT is applied to transform the image. By trying different hand gestures and positions, users dynamically control the randomness and the visual outcome of the pixel shuffling effect.
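The stages described above can be sketched with NumPy alone. This is a minimal illustration and not the project's actual code: the function names `randomness_from_hand`, `pixelate`, and `shuffle_pixels` are hypothetical, the foreground mask is assumed to be a boolean array derived from the segmentation output, the hand position is assumed to be a normalized x-coordinate in the range 0 to 1 (the form in which Mediapipe reports landmarks), and "degree of randomness" is interpreted here as the fraction of foreground pixels that get permuted.

```python
import numpy as np


def randomness_from_hand(x_norm: float) -> float:
    """Map a normalized hand x-coordinate to a shuffle fraction in [0, 1].

    Assumption: the hand position is used directly, clamped to the valid range.
    """
    return float(min(1.0, max(0.0, x_norm)))


def pixelate(image: np.ndarray, block: int) -> np.ndarray:
    """Pixelate by keeping one sample per block and repeating it over the block."""
    h, w = image.shape[:2]
    small = image[::block, ::block]
    big = np.repeat(np.repeat(small, block, axis=0), block, axis=1)
    return big[:h, :w]  # crop in case h or w is not a multiple of block


def shuffle_pixels(image, mask, fraction, rng=None):
    """Permute a random `fraction` of the masked pixels among themselves.

    Pixels outside `mask` (the background) are left untouched.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = image.copy()
    ys, xs = np.nonzero(mask)
    n = int(round(fraction * len(ys)))
    if n < 2:
        return out  # nothing to swap
    chosen = rng.choice(len(ys), size=n, replace=False)
    perm = rng.permutation(chosen)
    out[ys[chosen], xs[chosen]] = image[ys[perm], xs[perm]]
    return out
```

With these helpers, one frame could be processed as `shuffle_pixels(pixelate(frame, 8), mask, randomness_from_hand(hand_x))`, where the block size of 8 is an arbitrary illustrative value.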
git clone
cd pixel-shuffle-using-hand-tracking
pip install -r requirements.txt
If model weights are not provided, they will be downloaded automatically.
python --model-type m --webcam-number 0
- model-type: choose "n" for nano, "s" for small, "m" for medium, "l" for large, or "x" for x-large
- webcam-number: the index of the webcam to use, counted from 0 across the webcams connected to your computer (if only one webcam is connected, choose 0)
- Press "q" to quit
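The two options above could be parsed as in the following sketch. It is an assumption about how the script might handle its arguments, not a copy of it: `build_parser` and `weights_name` are hypothetical helpers, the defaults ("m" and 0) are chosen only for illustration, and the weights file name follows the `yolov8<size>-seg.pt` naming used by the Ultralytics segmentation models.

```python
import argparse

MODEL_TYPES = ("n", "s", "m", "l", "x")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the two documented command-line options."""
    parser = argparse.ArgumentParser(
        description="Pixel shuffling controlled by hand tracking (sketch)"
    )
    parser.add_argument(
        "--model-type",
        choices=MODEL_TYPES,
        default="m",  # assumed default, for illustration only
        help="YOLOv8 segmentation model size",
    )
    parser.add_argument(
        "--webcam-number",
        type=int,
        default=0,  # assumed default, for illustration only
        help="index of the webcam to open",
    )
    return parser


def weights_name(model_type: str) -> str:
    """Return the segmentation weights file name for a model size letter."""
    return f"yolov8{model_type}-seg.pt"
```

Calling `build_parser().parse_args()` in the script's entry point would reject any model type outside the five listed letters before a model is loaded.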
See the YOLOv8 GitHub page for more information. See the Segmentation Docs for usage examples with these models.
Model | size (pixels) | mAPbox 50-95 | mAPmask 50-95 | Speed CPU ONNX (ms) | Speed A100 TensorRT (ms) | params (M) | FLOPs (B) |
YOLOv8n | 640 | 36.7 | 30.5 | 96.1 | 1.21 | 3.4 | 12.6 |
YOLOv8s | 640 | 44.6 | 36.8 | 155.7 | 1.47 | 11.8 | 42.6 |
YOLOv8m | 640 | 49.9 | 40.8 | 317.0 | 2.18 | 27.3 | 110.2 |
YOLOv8l | 640 | 52.3 | 42.6 | 572.4 | 2.79 | 46.0 | 220.5 |
YOLOv8x | 640 | 53.4 | 43.4 | 712.1 | 4.02 | 71.8 | 344.1 |
- mAPval values are for single-model single-scale on the COCO val2017 dataset.
  Reproduce by: yolo val segment data=coco.yaml device=0
- Speed is averaged over COCO val images using an Amazon EC2 P4d instance.
  Reproduce by: yolo val segment data=coco128-seg.yaml batch=1 device=0/cpu
- Connect a webcam to your computer.
- Run the application by executing the script.
- The webcam feed will open in a window, and the image processing effects will be applied in real-time.
- Press 'q' to exit the application.
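The usage steps above correspond to a capture-and-display loop along the following lines. This is a hedged sketch rather than the project's script: `is_quit_key` and `run` are hypothetical names, the window title is arbitrary, and `effect` stands in for the whole segmentation, hand tracking, and shuffling pipeline. OpenCV is imported inside `run` so that the key-handling helper can be used without it.

```python
def is_quit_key(key_code: int) -> bool:
    """Return True when the key code from cv2.waitKey corresponds to 'q'."""
    # waitKey returns -1 when no key was pressed; masking keeps the low byte.
    return (key_code & 0xFF) == ord("q")


def run(webcam_number: int = 0, effect=lambda frame: frame) -> None:
    """Show the webcam feed with `effect` applied until 'q' is pressed."""
    import cv2  # requires the opencv-python package

    cap = cv2.VideoCapture(webcam_number)
    if not cap.isOpened():
        raise RuntimeError(f"Could not open webcam {webcam_number}")
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break  # the camera stopped delivering frames
            cv2.imshow("Pixel Shuffle", effect(frame))
            if is_quit_key(cv2.waitKey(1)):
                break
    finally:
        # Release the camera and close the window even if an error occurred.
        cap.release()
        cv2.destroyAllWindows()
```

Passing the effect as a function keeps the loop independent of the processing stages, so each stage can be swapped or tested separately.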
This project is licensed under the MIT License. See the LICENSE file for details.