Computer Vision is a project that uses computer vision techniques to detect and interpret hand gestures for controlling aspects of a system. It includes four modules and two example applications that work together to provide hand gesture-based control.
The following are the four modules included in this project:
The FaceDetectionModule is responsible for detecting faces in a video stream or image. It uses the Mediapipe library to perform face detection with a pre-trained model. The module provides functions to detect faces and retrieve the face landmarks.
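Mediapipe's face detector reports each face as a relative bounding box (xmin, ymin, width, height, all as fractions of the frame size). The sketch below shows one way to turn such a box into pixel coordinates for drawing with OpenCV. The helper name `to_pixel_bbox` and the tuple layout are assumptions made for illustration, not the module's actual API.

```python
def to_pixel_bbox(rel_box, frame_width, frame_height):
    """Convert a relative (xmin, ymin, width, height) box to integer pixels.

    rel_box values are fractions in [0, 1] of the frame size, which is how
    Mediapipe reports a face's relative bounding box.
    """
    xmin, ymin, width, height = rel_box
    return (
        int(xmin * frame_width),
        int(ymin * frame_height),
        int(width * frame_width),
        int(height * frame_height),
    )


if __name__ == "__main__":
    # A face covering the centre-bottom of a 640x480 frame.
    print(to_pixel_bbox((0.25, 0.5, 0.5, 0.25), 640, 480))
```

The resulting tuple is in the (x, y, w, h) form that rectangle-drawing code typically expects.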
The FaceMeshModule is used for recognizing specific face gestures based on the face landmarks detected by the FaceMeshDetector. It implements algorithms or machine learning models to classify the face gestures and provides functions to retrieve the recognized gestures.
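As a rough illustration of classifying a face gesture from landmarks, the sketch below computes a mouth-open ratio: the vertical gap between the lips divided by the mouth width. It is a simple geometric heuristic rather than the module's own classifier. The function names, the choice of four lip points, and the 0.3 threshold are all assumptions.

```python
import math


def mouth_open_ratio(upper_lip, lower_lip, left_corner, right_corner):
    """Return lip gap divided by mouth width, using (x, y) points."""
    gap = math.dist(upper_lip, lower_lip)
    width = math.dist(left_corner, right_corner)
    if width == 0:
        return 0.0  # degenerate input: avoid division by zero
    return gap / width


def is_mouth_open(upper_lip, lower_lip, left_corner, right_corner, threshold=0.3):
    """Classify the mouth as open when the ratio exceeds a hypothetical threshold."""
    return mouth_open_ratio(upper_lip, lower_lip, left_corner, right_corner) > threshold


if __name__ == "__main__":
    # Pixel coordinates of four lip landmarks (made-up values).
    print(is_mouth_open((100, 100), (100, 120), (80, 110), (120, 110)))
```

Dividing by the mouth width keeps the measure roughly independent of how close the face is to the camera.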
The HandTrackingModule tracks the movement of the hand in real-time by continuously detecting and tracking the hand landmarks. It can be used to estimate the hand's position, track gestures over time, or perform more advanced hand motion analysis.
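Mediapipe returns hand landmarks as coordinates normalized to the frame size, so tracking usually starts by converting them to pixels and comparing positions between frames. The following is a minimal sketch of that idea. The function names and the `(id, x, y)` tuple layout are assumptions, not the module's actual interface.

```python
def landmarks_to_pixels(normalized, frame_width, frame_height):
    """Convert normalized (x, y) landmarks to a list of (id, px, py) tuples."""
    return [
        (i, int(x * frame_width), int(y * frame_height))
        for i, (x, y) in enumerate(normalized)
    ]


def displacement(previous, current, landmark_id):
    """Return (dx, dy) in pixels for one landmark between two frames."""
    _, px, py = previous[landmark_id]
    _, cx, cy = current[landmark_id]
    return cx - px, cy - py


if __name__ == "__main__":
    prev = landmarks_to_pixels([(0.5, 0.5)], 640, 480)
    curr = landmarks_to_pixels([(0.75, 0.25)], 640, 480)
    # Positive dx means the landmark moved right; negative dy means it moved up.
    print(displacement(prev, curr, 0))
```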
The PoseEstimationModule is responsible for estimating the pose or body posture of a person based on the detected landmarks. It uses the Mediapipe library to detect and track various body landmarks, allowing for applications such as body posture analysis or exercise tracking.
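For posture analysis or exercise tracking, a common building block is the angle at a joint formed by three landmarks, for example shoulder, elbow, and wrist. The sketch below computes that angle from 2D points. The function name and the point format are assumptions for illustration.

```python
import math


def joint_angle(a, b, c):
    """Return the angle in degrees (0 to 180) at point b, formed by a-b-c.

    Each argument is an (x, y) pair, e.g. pixel coordinates of the
    shoulder (a), elbow (b), and wrist (c).
    """
    radians = math.atan2(c[1] - b[1], c[0] - b[0]) - math.atan2(a[1] - b[1], a[0] - b[0])
    degrees = abs(math.degrees(radians))
    # Fold reflex angles back into the 0..180 range.
    return 360.0 - degrees if degrees > 180.0 else degrees


if __name__ == "__main__":
    print(joint_angle((0, 1), (0, 0), (1, 0)))   # a right angle at the origin
    print(joint_angle((1, 0), (0, 0), (-1, 0)))  # a fully extended joint
```

Counting repetitions of an exercise could then be sketched as watching this angle cross an upper and a lower threshold.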
The HandVoumeControler application is an example application that showcases the use of the HandTrackingModule to control the system's volume using hand gestures. It detects the hand, tracks the hand movement, recognizes specific gestures, and adjusts the system volume accordingly.
The FuckDetector application is another example application that showcases the use of the HandTrackingModule to detect fingers and raise an alert if someone shows the camera a middle finger. It detects the hand and tracks the hand movement.
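One common way to implement this kind of control is to measure the distance between the thumb tip and the index fingertip (landmarks 4 and 8 in Mediapipe's hand model) and map it linearly onto a volume percentage. The sketch below shows only that mapping. The function names and the 30 to 250 pixel range are assumptions, and the application itself may use a different gesture or range.

```python
import math

THUMB_TIP, INDEX_TIP = 4, 8  # Mediapipe hand landmark indices


def pinch_distance(pixel_landmarks):
    """Distance in pixels between the thumb tip and the index fingertip.

    pixel_landmarks maps landmark index -> (x, y) in pixels.
    """
    return math.dist(pixel_landmarks[THUMB_TIP], pixel_landmarks[INDEX_TIP])


def distance_to_volume(distance, min_dist=30.0, max_dist=250.0):
    """Map a pinch distance to a volume percentage, clamped to 0..100.

    min_dist and max_dist are hypothetical calibration values.
    """
    clamped = max(min_dist, min(max_dist, distance))
    return (clamped - min_dist) / (max_dist - min_dist) * 100.0


if __name__ == "__main__":
    hand = {THUMB_TIP: (200, 300), INDEX_TIP: (284, 412)}
    print(round(distance_to_volume(pinch_distance(hand))))
```

Applying the resulting percentage to the system volume is platform-specific (pycaw on Windows) and is left out of this sketch.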
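Finger detection of this kind can be sketched by comparing each fingertip with the joint below it: in image coordinates y grows downward, so a raised finger has its tip above (smaller y than) its PIP joint. The example below uses Mediapipe's hand landmark indices and assumes an upright hand. The function names are hypothetical, and the thumb is ignored for simplicity.

```python
# (tip, pip) landmark indices for each finger in Mediapipe's 21-point hand model.
FINGERS = {
    "index": (8, 6),
    "middle": (12, 10),
    "ring": (16, 14),
    "pinky": (20, 18),
}


def fingers_up(landmarks):
    """Return {finger: bool} for a list of 21 (x, y) landmarks.

    A finger counts as raised when its tip is above its PIP joint,
    which assumes the hand is roughly upright in the frame.
    """
    return {
        name: landmarks[tip][1] < landmarks[pip][1]
        for name, (tip, pip) in FINGERS.items()
    }


def is_middle_finger_only(landmarks):
    """True when the middle finger is raised and the other three are folded."""
    up = fingers_up(landmarks)
    return up["middle"] and not (up["index"] or up["ring"] or up["pinky"])


if __name__ == "__main__":
    hand = [(0.5, 0.5) for _ in range(21)]
    for tip in (8, 16, 20):
        hand[tip] = (0.5, 0.6)   # folded: tip below the PIP joint
    hand[12] = (0.5, 0.3)        # raised: tip above the PIP joint
    print(is_middle_finger_only(hand))
```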
To get started with this project, follow these steps:
- Clone the repository to your local machine:
    git clone https://github.com/drunkleen/ComputerVision
- Install the required dependencies by running:
    pip install -r requirements.txt
- Explore the Mediapipe documentation and the examples provided for each module (will be added soon).
- Customize and extend the modules, or develop your own application based on the modules provided.
The project relies on the following dependencies:
- OpenCV: for capturing and processing video frames.
- Mediapipe: for face, hand, and pose detection, tracking, and landmark estimation.
- NumPy: for numerical computations and array manipulation.
- pycaw: for controlling the system volume on Windows.
Refer to the individual module files for more specific dependencies and installation instructions if needed.
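A quick way to confirm that the dependencies listed above are installed is to check whether their import names can be found. The sketch below uses only the standard library. The helper name `missing_dependencies` is an assumption; the import names (`cv2`, `mediapipe`, `numpy`, `pycaw`) are the ones these packages normally expose.

```python
from importlib.util import find_spec

# Display name -> import name.
REQUIRED_MODULES = {
    "OpenCV": "cv2",
    "Mediapipe": "mediapipe",
    "NumPy": "numpy",
    "pycaw": "pycaw",
}


def missing_dependencies(modules=REQUIRED_MODULES):
    """Return the display names of modules that cannot be located."""
    return [name for name, module in modules.items() if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_dependencies()
    print("Missing:", ", ".join(missing) if missing else "none")
```

Since pycaw is Windows-only, seeing it reported as missing on other platforms is expected.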
This project is licensed under the MIT License. See the LICENSE file for more details.
Contributions to this project are welcome. Feel free to open issues or submit pull requests to suggest improvements, report bugs, or add new features.
This project makes use of the open-source libraries listed under the dependencies above.
I express my gratitude to the developers and contributors of these libraries for their valuable work.