A collection of JavaScript examples showcasing Google's MediaPipe API.
The examples focus on hand and body tracking for fun and experimentation.
We also demonstrate how to combine body tracking with hand tracking, associating hands with tracked people, using Google's Tasks Vision API.
Try them in your browser now! MediaPipe Playground
Or follow the instructions below to build locally.
- Node (latest LTS) - downloadable from: https://nodejs.org/en/download
- A webcam (laptop integrated or USB)
- `npm install` to initialise the project
- `npm start` to run a development web server
- `npm run build` to package the files into a `dist` directory
- `npm run preview` to serve the built files for testing
All examples are designed to run in the browser and assume you have granted access to a webcam.
If you don't see a camera image as expected, click the Settings button in the top-left,
then select the device from the dropdown.
This example shows how to explore a ThreeJS 3D Model with your hand. Try it now: Spin The Shark
- Hold one hand in front of your chest and a Tiny Hand should appear!
- Pinch and drag to spin the shark!
This example shows how to warp through space, using five finger poses. Try it now: Warp Fingers
- Hold one hand in front of your chest and a Tiny Hand should appear!
- Form a finger pose by holding out 1-5 fingers.
- Hold the pose until you warp through space!
This example shows how to explore a 2D World Map using your hands. Try it now: World in your Hands
- Hold one hand in front of your chest and a Tiny Hand should appear!
- Point up with your finger and move the tiny hand to highlight a country.
- Bring your index finger and thumb together to form a 'pinch', and you will be able to drag the map.
- Raise a second hand and pinch, then move your hands together/apart to zoom the map.
Demo scripts and use cases for controlling a UI with MediaPipe are described below:
`MediaPipeMultiHandInput.js` and `MouseLikePointerInput` interface with `MediaPipeServiceProvider.js` to receive a stream of tracking data for multiple people. The event data contains body, head and hand information for up to two people. The active person generates pointer events for up to two touches.
`MediaPipeWarpFingerPoses.js` interfaces with `MediaPipeServiceProvider.js` to receive a stream of tracking data for multiple people and detected gestures for the active user. The gesture detection is connected directly to screen changes and icon graphics updates in this script.
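As a rough illustration of this pattern, the sketch below subscribes to a tracking stream and reacts to data for the active person. The `subscribe` method and the callback's data shape are hypothetical stand-ins, not the repository's actual API; see the scripts above for the real usage.

```js
// Hypothetical sketch of a script consuming the service provider's
// tracking stream; method and event names are illustrative only.
import { MediaPipeServiceProvider } from "./scripts/MediaPipeServiceProvider.js";

const provider = new MediaPipeServiceProvider();

// Assumed callback shape: one tracking record per detected person,
// each carrying body, head and hand information.
provider.subscribe((people) => {
  for (const person of people) {
    if (person.isActive) {
      // e.g. drive pointer events or gesture-based UI updates here
      console.log("active hands:", person.hands.length);
    }
  }
});
```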
To add MediaPipe Tracking to your HTML example, source this script: `scripts/MediaPipeTracking.js`
Then add the `<mediapipe-tracking>` component to your HTML.
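A minimal page might look like the sketch below; attribute-free usage of the component is an assumption, so check the examples for any supported options.

```html
<!-- Minimal sketch: load the tracking script, then place the custom
     element anywhere in the page body. Attribute-free usage is an
     assumption; see the Examples Overview for real usage. -->
<script type="module" src="scripts/MediaPipeTracking.js"></script>
<mediapipe-tracking></mediapipe-tracking>
```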
See the code in Examples Overview for usage examples.
Management of the MediaPipe system is handled by a set of Service Provider wrapper classes that facilitate reuse of image frames from the webcam and manage multi-body detection.
Updates to the tracking data alternate between body and hand tracking to reduce system load.
Extrapolation is used so that hand tracking updates every frame, even when no new data is available.
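A minimal sketch of this alternating update loop, assuming hypothetical `bodyService`/`handService` objects with `update()` and `extrapolate()` methods (not the repository's actual names):

```js
// Sketch: alternate heavy model inference between body and hand
// tracking on successive frames, and extrapolate hand landmarks on
// frames where the hand model is not run. Names are illustrative.
let frameCount = 0;

function onVideoFrame(video, timestampMs) {
  if (frameCount % 2 === 0) {
    bodyService.update(video, timestampMs);  // run body model this frame
    handService.extrapolate(timestampMs);    // predict hands from prior motion
  } else {
    handService.update(video, timestampMs);  // run hand model this frame
  }
  frameCount++;
  requestAnimationFrame((t) => onVideoFrame(video, t));
}
```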
`MediaPipeServiceProvider.js` is the entry point to the system. It manages access to the webcam images, starts the individual body pose and hands services, and controls the flow of updates to the tracking data.

`MediaPipeTasksVisionBodyPoseService.js` configures the body tracking system to work with up to two bodies. The service uses the body pose processor to generate a data structure for each person in the camera image.

`MediaPipeTasksVisionHandsService.js` configures the hand tracking system to work with up to four hands. The hands are associated with tracked bodies to enable an active person to be assigned using the wake gesture.
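For reference, configuring the underlying Tasks Vision landmarkers for two bodies and four hands looks roughly like this; the model asset paths are placeholders, not the paths this project uses:

```js
// Assumes an ES module context (for top-level await).
import { FilesetResolver, PoseLandmarker, HandLandmarker } from "@mediapipe/tasks-vision";

// Resolve the WASM assets once and share them across both services.
const vision = await FilesetResolver.forVisionTasks(
  "https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision@latest/wasm"
);

// Body tracking for up to two people.
const poseLandmarker = await PoseLandmarker.createFromOptions(vision, {
  baseOptions: { modelAssetPath: "models/pose_landmarker_lite.task" }, // placeholder path
  runningMode: "VIDEO",
  numPoses: 2,
});

// Hand tracking for up to four hands.
const handLandmarker = await HandLandmarker.createFromOptions(vision, {
  baseOptions: { modelAssetPath: "models/hand_landmarker.task" }, // placeholder path
  runningMode: "VIDEO",
  numHands: 4,
});

// Per-frame detection, e.g. inside a requestAnimationFrame loop:
// const poses = poseLandmarker.detectForVideo(videoElement, performance.now());
// const hands = handLandmarker.detectForVideo(videoElement, performance.now());
```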
`MediaPipeBodyPoseProcessor.js` takes raw tracking landmark data and processes the relevant data for use by the system. For each tracked body we generate an interaction rect around the shoulders that is used to detect active hands and convert cursor positions to screen space.

`MediaPipeHandsProcessor.js` takes raw tracking landmark data and processes the relevant data for use by the system. For each tracked hand we generate a stable pinch position for pointer interactions, which is smoothed and deadzoned to remove noise from the tracking data. Hand data is organised into a hierarchy of fingers to make the information easier to parse. The hand processor also manages pinch and gesture detection for each hand.
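A minimal sketch of the smoothing-plus-deadzone idea, assuming an exponential moving average and a simple radius threshold; the constants and helper are illustrative, not the processor's actual implementation:

```js
// Illustrative smoothing + deadzone filter for a 2D pinch position.
const SMOOTHING = 0.3;   // EMA weight for new samples (assumed value)
const DEADZONE = 0.005;  // minimum movement in normalised units (assumed)

let filtered = null;

function filterPinchPosition(raw) {
  if (!filtered) {
    filtered = { ...raw };
    return filtered;
  }
  // Exponential moving average damps high-frequency jitter.
  const smoothed = {
    x: filtered.x + SMOOTHING * (raw.x - filtered.x),
    y: filtered.y + SMOOTHING * (raw.y - filtered.y),
  };
  // Deadzone: ignore movements smaller than the threshold so the
  // cursor holds still while the hand is (almost) stationary.
  const dist = Math.hypot(smoothed.x - filtered.x, smoothed.y - filtered.y);
  if (dist > DEADZONE) filtered = smoothed;
  return filtered;
}
```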
`MediaPipeFingerPoseDetector.js` uses hand data to determine how many fingers are extended. Pose detection requires the palm to be facing the camera, to filter out unwanted activations.

`MediaPipeGrabDetector.js` uses hand data to determine if the hand is in a closed grab (fist) pose.

`MediaPipeHeadDirectionDetector.js` uses body tracking data to determine the direction the head is facing for a given person.
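As a rough illustration of the finger-counting idea, the sketch below compares fingertip and knuckle landmarks from the MediaPipe hand model's 21-point layout; the distance heuristic is a simplification, not the detector's actual logic:

```js
// Illustrative extended-finger count using MediaPipe's 21 hand
// landmarks (0 = wrist; 4/8/12/16/20 = finger tips). Comparing each
// tip to a lower joint is a simplification of the real detector.
const FINGERS = [
  { tip: 8, pip: 6 },   // index
  { tip: 12, pip: 10 }, // middle
  { tip: 16, pip: 14 }, // ring
  { tip: 20, pip: 18 }, // pinky
];

function countExtendedFingers(landmarks) {
  // A finger counts as "extended" if its tip is further from the
  // wrist than its PIP joint (a simple palm-facing-camera heuristic).
  const wrist = landmarks[0];
  const dist = (a, b) => Math.hypot(a.x - b.x, a.y - b.y);
  let count = 0;
  for (const { tip, pip } of FINGERS) {
    if (dist(landmarks[tip], wrist) > dist(landmarks[pip], wrist)) count++;
  }
  // Thumb: compare tip (4) to its IP joint (3) the same way.
  if (dist(landmarks[4], wrist) > dist(landmarks[3], wrist)) count++;
  return count;
}
```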