
VirtualPainter API Documentation

Overview

VirtualPainter is a computer vision application that allows users to draw on their screen using hand gestures captured by a webcam. The project consists of two main modules:

  • app.py - Main application with virtual painting interface
  • detectlib.py - Detection library with hand, face, and pose detection capabilities

Table of Contents

  1. Main Application (app.py)
  2. Detection Library (detectlib.py)
  3. Hand Landmarks Constants
  4. Usage Examples

Main Application (app.py)

The main application creates a virtual painting interface where users can draw using hand gestures.

Global Configuration Variables

| Variable | Type | Default | Description |
|---|---|---|---|
| `cam_width` | int | 1280 | Camera capture width in pixels |
| `cam_height` | int | 720 | Camera capture height in pixels |
| `brush_color` | tuple | (0, 0, 255) | Current brush color in BGR format (red by default) |
| `brush_thickness` | int | 20 | Thickness of the drawing brush |
| `eraser_thickness` | int | 150 | Thickness of the eraser |

Key Features

  • Real-time hand tracking using MediaPipe
  • Gesture-based drawing with index finger
  • Color selection using two-finger gesture on header
  • Eraser functionality with black color selection
  • Canvas overlay system for persistent drawing

Drawing Modes

  1. Selection Mode: Two fingers up (index + middle)

    • Navigate header to select colors/tools
    • Color options: Red, Green, Blue, Magenta, Black (eraser)
  2. Drawing Mode: Index finger up only

    • Draw on canvas with selected color and thickness
    • Automatic line drawing between finger positions
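The two modes above reduce to a simple check on the `fingers_up()` result. A minimal sketch, where `fingers` is the 5-element list returned by `fingers_up()` (the helper name `current_mode` is illustrative, not part of app.py):

```python
def current_mode(fingers):
    """Map a fingers_up() result to a painter mode.

    fingers = [thumb, index, middle, ring, pinky], 1 = up, 0 = down.
    """
    if fingers[1] and fingers[2]:
        return "selection"   # index + middle up: pick a color/tool in the header
    if fingers[1] and not fingers[2]:
        return "drawing"     # index finger only: draw on the canvas
    return "idle"            # anything else: lift the virtual pen

print(current_mode([0, 1, 1, 0, 0]))  # selection
print(current_mode([0, 1, 0, 0, 0]))  # drawing
```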

Detection Library (detectlib.py)

Hand Landmarks Constants

WRIST = 0
THUMB_CMC = 1
THUMB_MCP = 2
THUMB_IP = 3
THUMB_TIP = 4
INDEX_FINGER_MCP = 5
INDEX_FINGER_PIP = 6
INDEX_FINGER_DIP = 7
INDEX_FINGER_TIP = 8
MIDDLE_FINGER_MCP = 9
MIDDLE_FINGER_PIP = 10
MIDDLE_FINGER_DIP = 11
MIDDLE_FINGER_TIP = 12
RING_FINGER_MCP = 13
RING_FINGER_PIP = 14
RING_FINGER_DIP = 15
RING_FINGER_TIP = 16
PINKY_FINGER_MCP = 17
PINKY_FINGER_PIP = 18
PINKY_FINGER_DIP = 19
PINKY_FINGER_TIP = 20
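These constants are plain integer indices into the landmark list returned by `find_position()`, where each entry has the form `[landmark_id, x, y]`. For example, the pixel distance between two fingertips can be computed directly (the landmark values below are made-up sample data):

```python
import math

THUMB_TIP = 4
INDEX_FINGER_TIP = 8

def landmark_distance(landmarks, a, b):
    """Euclidean pixel distance between landmarks a and b.

    Each entry of `landmarks` is [landmark_id, x, y].
    """
    _, xa, ya = landmarks[a]
    _, xb, yb = landmarks[b]
    return math.hypot(xa - xb, ya - yb)

# Sample data: only the two entries we index need realistic values.
landmarks = [[i, 0, 0] for i in range(21)]
landmarks[THUMB_TIP] = [THUMB_TIP, 100, 200]
landmarks[INDEX_FINGER_TIP] = [INDEX_FINGER_TIP, 130, 240]
print(landmark_distance(landmarks, THUMB_TIP, INDEX_FINGER_TIP))  # 50.0
```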

HandDetector Class

Constructor

HandDetector(static_image_mode=False, max_num_hands=2, model_complexity=1, 
             min_detection_confidence=0.5, min_tracking_confidence=0.5)

Parameters:

  • static_image_mode (bool): Whether to treat input as static images
  • max_num_hands (int): Maximum number of hands to detect
  • model_complexity (int): Model complexity (0, 1, or 2)
  • min_detection_confidence (float): Minimum confidence for hand detection
  • min_tracking_confidence (float): Minimum confidence for hand tracking

Methods

detect_hands(img, show_hlabel=True, draw_landmarks=False, draw_rect=True, angle_thickness=5, angle_length=20, draw_color=(0, 255, 0), draw_thickness=1)

Detects hands in the input image and optionally draws bounding boxes and landmarks.

Parameters:

  • img (numpy.ndarray): Input image
  • show_hlabel (bool): Show hand labels
  • draw_landmarks (bool): Draw hand landmarks
  • draw_rect (bool): Draw bounding rectangles
  • angle_thickness (int): Thickness of corner angles
  • angle_length (int): Length of corner angles
  • draw_color (tuple): Color for drawing (BGR format)
  • draw_thickness (int): Thickness of drawn lines

Returns:

  • numpy.ndarray: Image with drawn detections

hands_count()

Returns the number of detected hands.

Returns:

  • int: Number of detected hands (0 if none detected)

find_position(img, hand_index=0, draw=True, draw_color=(255, 0, 0), draw_thickness=15, draw_size=15)

Finds and returns landmark positions for a specific hand.

Parameters:

  • img (numpy.ndarray): Input image
  • hand_index (int): Index of the hand to analyze
  • draw (bool): Whether to draw landmarks on image
  • draw_color (tuple): Color for drawing landmarks
  • draw_thickness (int): Thickness of drawn circles
  • draw_size (int): Size of drawn circles

Returns:

  • list: List of landmarks in format [landmark_id, x, y]
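Because each entry is `[landmark_id, x, y]`, the result is easy to turn into an id-to-position lookup. A small sketch with made-up sample values:

```python
def positions_by_id(landmarks):
    """Convert find_position()-style output into an id -> (x, y) dict."""
    return {lid: (x, y) for lid, x, y in landmarks}

# Sample find_position() output (fabricated coordinates).
sample = [[0, 640, 700], [4, 500, 400], [8, 520, 300]]
lookup = positions_by_id(sample)
print(lookup[8])  # (520, 300) -- index finger tip
```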

detected_hands_count()

Returns the count of detected hands (alternative to hands_count()).

Returns:

  • int or None: Number of detected hands

draw_circle(img, landmark_index, hand_index=0, color=(0, 255, 0), thickness=cv2.FILLED, size=15)

Draws a circle at a specific landmark position.

Parameters:

  • img (numpy.ndarray): Input image
  • landmark_index (int): Index of the landmark to mark
  • hand_index (int): Index of the hand
  • color (tuple): Circle color (BGR format)
  • thickness (int): Circle thickness
  • size (int): Circle size

fingers_up()

Determines which fingers are extended upward.

Returns:

  • list: List of 5 integer flags [thumb, index, middle, ring, pinky]
    • 1 = finger is up
    • 0 = finger is down
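The usual heuristic behind this kind of check compares each fingertip's y coordinate against its PIP joint (a tip above the joint counts as up, since y grows downward), with the thumb compared on x instead. A pure-Python sketch of that idea, assuming the `[landmark_id, x, y]` format above; this is illustrative, not detectlib's exact implementation:

```python
TIP_IDS = [4, 8, 12, 16, 20]  # thumb, index, middle, ring, pinky tips

def fingers_up_sketch(landmarks, right_hand=True):
    """Heuristic finger-state check over [id, x, y] landmarks."""
    fingers = []
    # Thumb: compare tip x against the IP joint x (depends on handedness).
    if right_hand:
        fingers.append(1 if landmarks[4][1] < landmarks[3][1] else 0)
    else:
        fingers.append(1 if landmarks[4][1] > landmarks[3][1] else 0)
    # Other fingers: tip above the PIP joint (smaller y) means "up".
    for tip in TIP_IDS[1:]:
        fingers.append(1 if landmarks[tip][2] < landmarks[tip - 2][2] else 0)
    return fingers
```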

FaceDetector Class

Constructor

FaceDetector(min_detection_confidence=0.5, model_selection=0)

Parameters:

  • min_detection_confidence (float): Minimum confidence for face detection
  • model_selection (int): Model selection (0 for close-range, 1 for full-range)

Methods

detect_faces(img, draw=True)

Detects faces in the input image.

Parameters:

  • img (numpy.ndarray): Input image
  • draw (bool): Whether to draw detection results

Returns:

  • tuple: (processed_image, bounding_boxes_list)
    • processed_image (numpy.ndarray): Image with drawn detections
    • bounding_boxes_list (list): List of [detection_id, bbox, confidence_score]

fancy_draw(img, bbox, draw_color=(0, 255, 0), rect_thickness=1, angle_size=30, angle_thickness=5) [Static Method]

Draws a stylized bounding box with corner angles.

Parameters:

  • img (numpy.ndarray): Input image
  • bbox (tuple): Bounding box coordinates (x, y, width, height)
  • draw_color (tuple): Drawing color (BGR format)
  • rect_thickness (int): Rectangle line thickness
  • angle_size (int): Size of corner angles
  • angle_thickness (int): Thickness of corner angles

Returns:

  • numpy.ndarray: Image with drawn bounding box
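The corner angles are just short line segments extending from each rectangle corner. The geometry behind that effect can be sketched in pure Python; `corner_segments` is a hypothetical helper that mirrors the idea, not fancy_draw's actual code:

```python
def corner_segments(bbox, angle_size=30):
    """Endpoints of the corner-angle segments for a stylized bounding box.

    bbox = (x, y, width, height). Returns ((x1, y1), (x2, y2)) pairs,
    two per corner, which could each be passed to cv2.line().
    """
    x, y, w, h = bbox
    x2, y2 = x + w, y + h
    s = angle_size
    return [
        ((x, y), (x + s, y)), ((x, y), (x, y + s)),           # top-left
        ((x2, y), (x2 - s, y)), ((x2, y), (x2, y + s)),       # top-right
        ((x, y2), (x + s, y2)), ((x, y2), (x, y2 - s)),       # bottom-left
        ((x2, y2), (x2 - s, y2)), ((x2, y2), (x2, y2 - s)),   # bottom-right
    ]

print(len(corner_segments((10, 20, 100, 50))))  # 8
```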

PoseDetector Class

Constructor

PoseDetector(static_image_mode=False, model_complexity=1, smooth_landmarks=True,
             enable_segmentation=False, smooth_segmentation=True,
             min_detection_confidence=0.5, min_tracking_confidence=0.5)

Parameters:

  • static_image_mode (bool): Whether to treat input as static images
  • model_complexity (int): Model complexity (0, 1, or 2)
  • smooth_landmarks (bool): Whether to smooth landmarks
  • enable_segmentation (bool): Whether to enable pose segmentation
  • smooth_segmentation (bool): Whether to smooth segmentation
  • min_detection_confidence (float): Minimum confidence for pose detection
  • min_tracking_confidence (float): Minimum confidence for pose tracking

Methods

detect_landmarks(img, draw=True)

Detects pose landmarks in the input image.

Parameters:

  • img (numpy.ndarray): Input image
  • draw (bool): Whether to draw pose landmarks and connections

Returns:

  • numpy.ndarray: Image with drawn pose landmarks

Usage Examples

Complete VirtualPainter Setup

import cv2
import numpy as np
import detectlib as dlib

# Initialize camera and detector
cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1280)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 720)

detector = dlib.HandDetector(min_detection_confidence=0.5)
canvas = np.zeros((720, 1280, 3), np.uint8)

# Drawing parameters
brush_color = (0, 0, 255)  # Red
brush_thickness = 5
prev_x, prev_y = 0, 0

while True:
    ret, img = cap.read()
    if not ret:
        break
    img = cv2.flip(img, 1)  # Mirror the frame so movement feels natural
    
    # Detect hands
    img = detector.detect_hands(img, draw_color=brush_color)
    landmarks = detector.find_position(img, draw=False)
    
    if len(landmarks) != 0:
        # Get finger positions
        x1, y1 = landmarks[dlib.INDEX_FINGER_TIP][1:]
        x2, y2 = landmarks[dlib.MIDDLE_FINGER_TIP][1:]
        
        # Check finger states
        fingers = detector.fingers_up()
        
        # Drawing mode (index finger only)
        if fingers[1] and not fingers[2]:
            if prev_x == 0 and prev_y == 0:
                prev_x, prev_y = x1, y1
            
            cv2.line(canvas, (prev_x, prev_y), (x1, y1), brush_color, brush_thickness)
            prev_x, prev_y = x1, y1
        else:
            prev_x, prev_y = 0, 0
    
    # Combine camera feed with canvas
    img = cv2.bitwise_or(img, canvas)
    cv2.imshow('Virtual Painter', img)
    
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

Multi-Modal Detection

import cv2
import detectlib as dlib

# Initialize all detectors
hand_detector = dlib.HandDetector()
face_detector = dlib.FaceDetector()
pose_detector = dlib.PoseDetector()

cap = cv2.VideoCapture(0)

while True:
    ret, frame = cap.read()
    if ret:
        # Detect hands, faces, and pose
        frame = hand_detector.detect_hands(frame, draw_color=(0, 255, 0))
        frame, faces = face_detector.detect_faces(frame)
        frame = pose_detector.detect_landmarks(frame)
        
        # Display counts
        hand_count = hand_detector.hands_count()
        face_count = len(faces)
        
        cv2.putText(frame, f'Hands: {hand_count}', (10, 30), 
                   cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2)
        cv2.putText(frame, f'Faces: {face_count}', (10, 70), 
                   cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2)
        
        cv2.imshow('Multi-Modal Detection', frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

cap.release()
cv2.destroyAllWindows()

Gesture Recognition

import cv2
import detectlib as dlib

detector = dlib.HandDetector()
cap = cv2.VideoCapture(0)

def recognize_gesture(fingers):
    """Recognize hand gestures based on finger positions"""
    if fingers == [0, 1, 0, 0, 0]:
        return "Pointing"
    elif fingers == [0, 1, 1, 0, 0]:
        return "Peace Sign"
    elif fingers == [1, 1, 1, 1, 1]:
        return "Open Hand"
    elif fingers == [0, 0, 0, 0, 0]:
        return "Fist"
    elif fingers == [1, 0, 0, 0, 0]:
        return "Thumbs Up"
    else:
        return "Unknown"

while True:
    ret, frame = cap.read()
    if ret:
        frame = detector.detect_hands(frame)
        landmarks = detector.find_position(frame, draw=False)
        
        if len(landmarks) != 0:
            fingers = detector.fingers_up()
            gesture = recognize_gesture(fingers)
            
            cv2.putText(frame, f'Gesture: {gesture}', (10, 50), 
                       cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
        
        cv2.imshow('Gesture Recognition', frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

cap.release()
cv2.destroyAllWindows()

Hand Tracking Example

import cv2
import detectlib as dlib

detector = dlib.HandDetector(min_detection_confidence=0.7)
cap = cv2.VideoCapture(0)

while True:
    ret, frame = cap.read()
    if ret:
        frame = detector.detect_hands(frame, draw_rect=True, draw_color=(0, 255, 0))
        landmarks = detector.find_position(frame, draw=False)
        
        if len(landmarks) != 0:
            # Get index finger tip position
            x, y = landmarks[dlib.INDEX_FINGER_TIP][1:]
            print(f"Index finger tip at: ({x}, {y})")
            
            # Draw circle at index finger tip
            detector.draw_circle(frame, dlib.INDEX_FINGER_TIP, color=(0, 0, 255))
        
        cv2.imshow('Hand Detection', frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

cap.release()
cv2.destroyAllWindows()

Face Detection Example

import cv2
import detectlib as dlib

face_detector = dlib.FaceDetector(min_detection_confidence=0.7)
cap = cv2.VideoCapture(0)

while True:
    ret, frame = cap.read()
    if ret:
        frame, faces = face_detector.detect_faces(frame)
        print(f"Detected {len(faces)} faces")
        
        cv2.imshow('Face Detection', frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

cap.release()
cv2.destroyAllWindows()

Pose Detection Example

import cv2
import detectlib as dlib

pose_detector = dlib.PoseDetector(min_detection_confidence=0.6)
cap = cv2.VideoCapture(0)

while True:
    ret, frame = cap.read()
    if ret:
        frame = pose_detector.detect_landmarks(frame, draw=True)
        cv2.imshow('Pose Detection', frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

cap.release()
cv2.destroyAllWindows()

Dependencies

The project requires the following Python packages:

  • opencv-python (cv2)
  • mediapipe
  • numpy
  • requests (for app.py)

Installation

pip install opencv-python mediapipe numpy requests

Notes

  • All color values use BGR format (Blue, Green, Red)
  • Coordinate system origin (0,0) is at the top-left corner
  • Hand landmarks are numbered 0-20 following MediaPipe convention
  • The application is optimized for real-time performance with webcam input
  • For best results, ensure good lighting and clear hand visibility