Scuba diving gesture recognition using Mediapipe, cv2 and PyTorch
I have always been a huge fan of Minority Report and awaited the day when we could use gestures in our day-to-day lives. Then came Pranav Mistry with his Sixth Sense technology, which blew my mind. However, it was too hardware-focused.
Google came out with Mediapipe in 2019. I had just completed my Open Water and Advanced Open Water scuba certifications when I came across some cool animations on Facebook built with Mediapipe. A quick search led me to Nicholas Renotte's famous Sign Language video. I was super impressed by the processing of webcam images using cv2 and implemented the approach for scuba diving signals using PyTorch.
Repo - Github repo
Documentation - https://google.github.io/mediapipe/
To train a model that captures a simple video feed from the webcam and categorizes the gestures shown by the user into one of five actions:
- Ok
- Stop
- Descend
- Not Ok
- Ascend
- Test the camera and the Mediapipe library (to ensure the lighting / setup is adequate, and to get the camera's fps for calculating the sequence length); a sketch of this check follows right after this list
- Capture data from the webcam for the various actions, i.e. for the 5 actions, gather 20 samples, each of which is a 1-second video (30 frames)
- Convert the 1500 numpy files of gestures, each containing 63 values (21 hand landmarks x 3 coordinates), into a 150 x 30 x 63 tensor
- One-hot encode the labels and save both files
- Import the model architecture and train the model on a single batch of 142 samples (after the train / test split)
- Test on the held-out test clips
- Test on the live data feed
- Process and store the renders
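
As a rough illustration of the first step, here is a minimal sketch of the camera / Mediapipe check. It assumes the standard cv2 and Mediapipe Hands APIs; the 90-frame sampling window and the window title are arbitrary choices for illustration, not values from the repo.

```python
import time

import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands
mp_drawing = mp.solutions.drawing_utils

cap = cv2.VideoCapture(0)
n_frames, start = 0, time.time()

with mp_hands.Hands(max_num_hands=1, min_detection_confidence=0.5) as hands:
    while n_frames < 90:  # sample ~3 seconds of video to estimate fps
        ok, frame = cap.read()
        if not ok:
            break
        # Mediapipe expects RGB, cv2 captures BGR
        results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.multi_hand_landmarks:
            for hand in results.multi_hand_landmarks:
                mp_drawing.draw_landmarks(frame, hand, mp_hands.HAND_CONNECTIONS)
        cv2.imshow('Camera check', frame)
        n_frames += 1
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

fps = n_frames / (time.time() - start)
print(f'Approximate fps: {fps:.1f}')  # used to decide how many frames make up a 1-second sequence
cap.release()
cv2.destroyAllWindows()
```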
- Mediapipe landmarks can be significantly impacted by the lighting - both at the time of data collection and at the time of inference.
- If you're not careful while consolidating the various frames for your input dataset, the order of labels can get scattered. After completing your one-hot encoding, run a sample check on each class to confirm its index in the encoding.
- More samples! I could record only 150 samples across 5 different action classes.
- Stability over precision. Video processing has the annoying property of rapidly changing the predicted class as frames change. To avoid this, the model takes about 3-4 frames to stabilize its prediction, so you may see a bit of jitter in the displayed result before it settles on a class (a small sketch of this smoothing idea follows below). I had the same issue in my [object classification project](https://github.com/SwamiKannan/Formula1-car-detection).
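
A minimal sketch of that smoothing idea, assuming the model outputs per-class probabilities every frame; the `ACTIONS` list order, the 4-frame window and the `update_display` helper are illustrative names, not the repo's actual code.

```python
from collections import deque

import numpy as np

# Order must match the index each class received during one-hot encoding
ACTIONS = ['Ok', 'Stop', 'Descend', 'Not Ok', 'Ascend']
recent = deque(maxlen=4)  # roughly the 3-4 frames mentioned above
displayed = None

def update_display(probs):
    """Switch the displayed label only when the same class wins a few frames in a row."""
    global displayed
    recent.append(int(np.argmax(probs)))
    if len(recent) == recent.maxlen and len(set(recent)) == 1:
        displayed = ACTIONS[recent[0]]
    return displayed
```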
I. Data capture
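
A hedged sketch of what the capture loop could look like: Mediapipe Hands turns each frame into a 63-value landmark vector, and one `.npy` file is saved per frame. The folder layout (`data/<action>/<sample>/<frame>.npy`), the `SAMPLES_PER_ACTION` count and the `extract_keypoints` helper are assumptions for illustration, not necessarily the repo's exact structure.

```python
import os

import cv2
import mediapipe as mp
import numpy as np

mp_hands = mp.solutions.hands
ACTIONS = ['Ok', 'Stop', 'Descend', 'Not Ok', 'Ascend']
SEQ_LEN = 30              # frames per sample, i.e. roughly 1 second of video
SAMPLES_PER_ACTION = 30   # adjust to however many clips you want per class
DATA_DIR = 'data'         # hypothetical output folder

def extract_keypoints(results):
    """Flatten the 21 hand landmarks (x, y, z) into a 63-value vector; zeros if no hand is seen."""
    if results.multi_hand_landmarks:
        hand = results.multi_hand_landmarks[0]
        return np.array([[lm.x, lm.y, lm.z] for lm in hand.landmark]).flatten()
    return np.zeros(21 * 3)

cap = cv2.VideoCapture(0)
with mp_hands.Hands(max_num_hands=1, min_detection_confidence=0.5) as hands:
    for action in ACTIONS:
        for sample in range(SAMPLES_PER_ACTION):
            out_dir = os.path.join(DATA_DIR, action, str(sample))
            os.makedirs(out_dir, exist_ok=True)
            for frame_idx in range(SEQ_LEN):
                ok, frame = cap.read()
                if not ok:
                    continue
                results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
                # One .npy file per frame, each holding the 63 landmark values
                np.save(os.path.join(out_dir, f'{frame_idx}.npy'), extract_keypoints(results))
                cv2.imshow('Collecting', frame)
                cv2.waitKey(1)
cap.release()
cv2.destroyAllWindows()
```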
II. Data processing
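
A minimal sketch of the consolidation step, under the same assumed folder layout as above: stack the per-frame `.npy` files into a (samples, 30, 63) tensor and one-hot encode the labels. The output file names `X.pt` and `y.pt` are placeholders.

```python
import os

import numpy as np
import torch
import torch.nn.functional as F

ACTIONS = ['Ok', 'Stop', 'Descend', 'Not Ok', 'Ascend']
SEQ_LEN = 30
DATA_DIR = 'data'  # same hypothetical folder as in the capture sketch

sequences, labels = [], []
for class_idx, action in enumerate(ACTIONS):
    action_dir = os.path.join(DATA_DIR, action)
    for sample in sorted(os.listdir(action_dir)):
        # Re-assemble one sample from its 30 per-frame .npy files (63 values each)
        frames = [np.load(os.path.join(action_dir, sample, f'{i}.npy')) for i in range(SEQ_LEN)]
        sequences.append(frames)
        labels.append(class_idx)

X = torch.tensor(np.array(sequences), dtype=torch.float32)             # (samples, 30, 63)
y = F.one_hot(torch.tensor(labels), num_classes=len(ACTIONS)).float()  # (samples, 5)

# Sanity check: for a given sample, the 1 in its row of y should sit at that action's index
torch.save(X, 'X.pt')
torch.save(y, 'y.pt')
print(X.shape, y.shape)
```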
III. Model training
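
An illustrative training sketch, not the repo's exact architecture: it assumes a small LSTM classifier (`GestureLSTM`) trained on the whole training set as a single batch, with 8 samples held out so that roughly 142 of the 150 samples land in the training batch, as described above.

```python
import torch
import torch.nn as nn

class GestureLSTM(nn.Module):
    """A small LSTM classifier over the (30, 63) landmark sequences (illustrative architecture)."""
    def __init__(self, n_features=63, hidden=64, n_classes=5):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, num_layers=2, batch_first=True)
        self.fc = nn.Linear(hidden, n_classes)

    def forward(self, x):
        out, _ = self.lstm(x)        # (batch, seq, hidden)
        return self.fc(out[:, -1])   # classify from the last time step

X, y = torch.load('X.pt'), torch.load('y.pt')

# Simple random split: 8 test samples leaves ~142 of 150 for the single training batch
perm = torch.randperm(X.shape[0])
n_test = 8
train_idx, test_idx = perm[n_test:], perm[:n_test]
X_train, y_train = X[train_idx], y[train_idx]
X_test, y_test = X[test_idx], y[test_idx]

model = GestureLSTM()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

for epoch in range(200):
    optimizer.zero_grad()
    logits = model(X_train)                          # the whole training set as one batch
    loss = criterion(logits, y_train.argmax(dim=1))  # recover class indices from one-hot labels
    loss.backward()
    optimizer.step()

model.eval()
with torch.no_grad():
    acc = (model(X_test).argmax(dim=1) == y_test.argmax(dim=1)).float().mean()
print(f'Test accuracy: {acc:.2f}')
torch.save(model.state_dict(), 'gesture_lstm.pt')  # placeholder filename for the live demo
```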
Gestures will take a second to align to the correct label
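
For context, here is a hypothetical live-demo loop tying the earlier sketches together; it reuses the `extract_keypoints`, `GestureLSTM` and `update_display` helpers defined above, which are illustrative names rather than the repo's actual code. Because a prediction is only made once a full 30-frame window is available and is then smoothed over a few frames, the displayed label lags the gesture by roughly a second.

```python
from collections import deque

import cv2
import mediapipe as mp
import numpy as np
import torch

# Assumes GestureLSTM, extract_keypoints and update_display from the earlier sketches
# are defined in the same script or importable.

mp_hands = mp.solutions.hands
window = deque(maxlen=30)      # rolling ~1-second window of landmark vectors

model = GestureLSTM()          # hypothetical class from the training sketch
model.load_state_dict(torch.load('gesture_lstm.pt'))
model.eval()

cap = cv2.VideoCapture(0)
with mp_hands.Hands(max_num_hands=1, min_detection_confidence=0.5) as hands:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        window.append(extract_keypoints(results))   # helper from the data-capture sketch
        label = None
        if len(window) == window.maxlen:
            seq = torch.tensor(np.array(list(window)), dtype=torch.float32).unsqueeze(0)  # (1, 30, 63)
            with torch.no_grad():
                probs = torch.softmax(model(seq), dim=1)[0].numpy()
            label = update_display(probs)            # smoothing helper from the notes above
        cv2.putText(frame, label or '...', (10, 40), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
        cv2.imshow('Scuba gestures', frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
cap.release()
cv2.destroyAllWindows()
```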
Image credit for cover image: Rooster Teeth