Gesture Control

Real-time webcam computer vision app that tracks faces and hands, draws landmarks, and maps pinch gestures to desktop mouse control.

Vision Stack: OpenCV + MediaPipe Tasks
Tracking: Multi-face + up to 4 hands
Interaction: Cursor move + pinch-to-click
Models: YuNet ONNX + hand_landmarker.task

Project Snapshot

Gesture Control processes webcam frames in real time, renders CV overlays, and turns hand gestures into practical desktop input. The pipeline stays modular even though the whole app runs from a single Python entry point.

Core Components

  • Continuous OpenCV capture loop with safe exit and frame-failure handling
  • YuNet face detector via cv2.FaceDetectorYN with per-face bounding boxes
  • MediaPipe HandLandmarker in LIVE_STREAM running mode with an asynchronous result callback
  • Landmark-to-pixel utility helpers for clean drawing and geometry logic
  • Index fingertip cursor mapping with mirror behavior for natural control
  • Pinch click detection using adaptive threshold + click cooldown
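
As a rough illustration of the landmark-to-pixel helpers listed above, the sketch below converts normalized landmark coordinates (MediaPipe reports x and y relative to the image, nominally between 0 and 1) into clamped pixel coordinates. The names Landmark, to_pixel, and pixel_distance are hypothetical and not taken from the project.

```python
import math
from typing import NamedTuple, Tuple


class Landmark(NamedTuple):
    """Hypothetical stand-in for a normalized landmark with x and y fields."""
    x: float
    y: float


def to_pixel(lm: Landmark, width: int, height: int) -> Tuple[int, int]:
    """Scale a normalized landmark to pixel coordinates inside the frame.

    Landmarks near the frame border can fall slightly outside [0, 1],
    so the result is clamped to valid pixel indices.
    """
    px = min(max(int(lm.x * width), 0), width - 1)
    py = min(max(int(lm.y * height), 0), height - 1)
    return px, py


def pixel_distance(a: Tuple[int, int], b: Tuple[int, int]) -> float:
    """Euclidean distance between two pixel points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])
```

Keeping the conversion in one place means drawing code and gesture geometry can both work in pixel space without repeating the scaling and clamping.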

Tech Stack

Computer Vision

  • Python
  • OpenCV (cv2)
  • MediaPipe Tasks Vision API

Interaction & Assets

  • PyAutoGUI
  • face_detection_yunet_2023mar.onnx
  • hand_landmarker.task

Implementation Highlights

Frame-by-Frame Pipeline

Capture, detect faces, run async hand inference, draw overlays, apply gesture controls, and render output in one tight loop.
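
The loop described above might be structured roughly as follows. This is a sketch under assumptions, not the project's code: run_pipeline, its parameters, and the policy of giving up after a number of consecutive failed reads are all hypothetical. The capture argument is any object whose read() returns an (ok, frame) pair, which is the shape cv2.VideoCapture.read() uses.

```python
from typing import Any, Callable, Iterable


def run_pipeline(
    capture: Any,
    stages: Iterable[Callable[[Any], Any]],
    show: Callable[[Any], None],
    should_quit: Callable[[], bool],
    max_failures: int = 30,  # assumed limit, not from the project
) -> int:
    """Run stages on each frame until quit is requested or reads keep failing.

    Returns the number of frames that were processed and shown.
    """
    stages = list(stages)
    processed = 0
    failures = 0
    while True:
        ok, frame = capture.read()
        if not ok:
            # Tolerate a dropped frame, but stop after too many in a row.
            failures += 1
            if failures >= max_failures:
                break
            continue
        failures = 0
        # Each stage takes a frame and returns a (possibly annotated) frame,
        # e.g. face detection, hand inference, overlay drawing.
        for stage in stages:
            frame = stage(frame)
        show(frame)
        processed += 1
        if should_quit():
            break
    return processed
```

With OpenCV, capture could be cv2.VideoCapture(0), show could wrap cv2.imshow("Gesture Control", frame), and should_quit could test cv2.waitKey(1) & 0xFF == ord("q"). Calling capture.release() and cv2.destroyAllWindows() afterwards gives the safe exit.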

Robust Gesture Click Logic

The pinch threshold scales with hand size, and a click cooldown debounces the gesture to reduce accidental repeated clicks.
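
One way to implement this kind of size-relative pinch test with a cooldown is sketched below. The landmark indices (0 wrist, 4 thumb tip, 8 index fingertip, 9 middle-finger MCP) follow MediaPipe's hand model. Everything else is an assumption for illustration: the PinchClicker class, the wrist-to-MCP distance as the hand-size measure, the 0.35 ratio, and the 0.4 second cooldown.

```python
import math
import time
from typing import Callable, Optional, Sequence, Tuple

# MediaPipe hand landmark indices.
WRIST, THUMB_TIP, INDEX_TIP, MIDDLE_MCP = 0, 4, 8, 9

Point = Tuple[float, float]


def _dist(a: Point, b: Point) -> float:
    return math.hypot(a[0] - b[0], a[1] - b[1])


class PinchClicker:
    """Hypothetical helper: reports a click when a pinch is seen and the
    cooldown since the previous click has elapsed."""

    def __init__(
        self,
        ratio: float = 0.35,       # assumed pinch/hand-size ratio
        cooldown_s: float = 0.4,   # assumed debounce interval
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.ratio = ratio
        self.cooldown_s = cooldown_s
        self.clock = clock
        self._last_click: Optional[float] = None

    def update(self, points: Sequence[Point]) -> bool:
        """Take the 21 hand landmarks in pixel space; return True to click."""
        hand_size = _dist(points[WRIST], points[MIDDLE_MCP])
        if hand_size == 0:
            return False
        pinch = _dist(points[THUMB_TIP], points[INDEX_TIP])
        # Threshold adapts to apparent hand size (distance from the camera).
        if pinch > self.ratio * hand_size:
            return False
        now = self.clock()
        if self._last_click is not None and now - self._last_click < self.cooldown_s:
            return False  # still cooling down: suppress the repeat click
        self._last_click = now
        return True
```

In the main loop, a True result from update() would be followed by pyautogui.click(). Injecting the clock keeps the cooldown logic testable without real delays.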

Usability-Tuned Cursor Mapping

Only cursor motion is mirrored, preserving webcam intuition while keeping landmark calculations stable.
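
A minimal sketch of such a mapping, assuming the fingertip arrives as normalized coordinates from the unmirrored frame and that only the x axis is flipped when converting to screen space. The function name map_to_screen and its mirror flag are hypothetical.

```python
from typing import Tuple


def map_to_screen(
    norm_x: float,
    norm_y: float,
    screen_w: int,
    screen_h: int,
    mirror: bool = True,
) -> Tuple[int, int]:
    """Map a normalized fingertip position to screen pixel coordinates.

    With mirror=True the x axis is flipped, so moving the hand to the
    user's right moves the cursor right, as in a mirror. The landmark
    itself is left untouched; only the mapped output is mirrored.
    """
    x = 1.0 - norm_x if mirror else norm_x
    # Clamp so a fingertip slightly outside the frame stays on screen.
    x = min(max(x, 0.0), 1.0)
    y = min(max(norm_y, 0.0), 1.0)
    return int(x * (screen_w - 1)), int(y * (screen_h - 1))
```

With PyAutoGUI, the screen size could come from pyautogui.size() and the result could be passed to pyautogui.moveTo(x, y).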

Portfolio-Ready Demo Value

Combines visible CV overlays with real OS-level input, which makes the project well suited to live demos and technical interviews.