Added wrappers/dlib/face example
* BUILD_DLIB_EXAMPLES boolean in CMake to enable
* Requires installing dlib separately and pointing to it with DLIB_DIR
* wrappers/dlib/face example is listed in examples/readme.md
* Fixed typo in rs-measure readme
maloel committed Oct 2, 2019
1 parent 5096aab commit 9c80e6e
Showing 14 changed files with 700 additions and 15 deletions.
1 change: 1 addition & 0 deletions CMake/lrs_options.cmake
@@ -22,6 +22,7 @@
option(BUILD_CSHARP_BINDINGS "Build C# bindings" OFF)
option(BUILD_MATLAB_BINDINGS "Build Matlab bindings" OFF)
option(BUILD_UNITY_BINDINGS "Copy the unity project to the build folder with the required dependencies" OFF)
option(BUILD_CV_EXAMPLES "Build OpenCV examples" OFF)
option(BUILD_DLIB_EXAMPLES "Build DLIB examples - requires DLIB_DIR" OFF)
option(BUILD_PCL_EXAMPLES "Build PCL examples" OFF)
option(BUILD_NODEJS_BINDINGS "Build Node.js bindings" OFF)
option(BUILD_OPENNI2_BINDINGS "Build OpenNI bindings" OFF)
27 changes: 13 additions & 14 deletions examples/measure/readme.md
@@ -2,8 +2,8 @@

## Overview

This tutorial shows a simple method for measuring real-world distances using depth data.
> **Note:** Measuring dimensions of real-world objects is one of the obvious applications of a depth camera. This sample is not intended to be a proper measurement tool, but rather to showcase critical concepts.
> With better algorithms measurement results can be improved considerably.
In this tutorial you will learn how to:
@@ -20,7 +20,7 @@
In this tutorial you will learn how to:

This demo lets the user measure distance between two points in the physical world.

## Code Overview

### Depth Processing Pipeline

@@ -43,7 +43,7 @@
```cpp
rs2::disparity_transform disparity2depth(false);
rs2::spatial_filter spat;
// Enable hole-filling
// Hole filling is an aggressive heuristic and it gets the depth wrong many times
// However, this demo is not built to handle holes
// (the shortest-path will always prefer to "cut" through the holes since they have zero 3D distance)
spat.set_option(RS2_OPTION_HOLES_FILL, 5); // 5 = fill all the zero pixels
// Define temporal filter
```

@@ -72,13 +72,13 @@
The best way to reduce the number of missing pixels is by letting the hardware do it.
The D400 cameras have a **High Accuracy** preset we can take advantage of.
```cpp
// Set the device to High Accuracy preset
auto sensor = profile.get_device().first<rs2::depth_sensor>();
sensor.set_option(RS2_OPTION_VISUAL_PRESET, RS2_RS400_VISUAL_PRESET_HIGH_ACCURACY);
```

Given a frame-set, we are going to apply all the processing blocks in order.
First we apply the `align` processing block to align color frame to depth viewport:
```cpp
// First make the frames spatially aligned
```

@@ -90,7 +90,7 @@
Next, we invoke the depth post-processing flow:
```cpp
rs2::frame depth = data.get_depth_frame();
// Decimation will reduce the resolution of the depth image,
// closing small holes and speeding-up the algorithm
depth = dec.process(depth);
// To make sure far-away objects are filtered proportionally
// we try to switch to disparity domain
depth = depth2disparity.process(depth);
```

@@ -122,7 +122,7 @@
```cpp
auto stream = profile.get_stream(RS2_STREAM_DEPTH).as<rs2::video_stream_profile>();
auto intrinsics = stream.get_intrinsics(); // Calibration data
```

Distance in meters can be acquired using the `get_distance` function of the `depth_frame` class.

> Calling `get_distance` excessively can result in bad performance, since the compiler can't optimize across module boundaries, so it could be beneficial to read the `DEPTH_UNITS` option directly from the `depth_sensor` and use it to convert raw depth pixels to meters.
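A minimal sketch of that conversion (the `RS2_OPTION_DEPTH_UNITS` name is from the SDK; the helper function is ours, for illustration):

```cpp
#include <cstdint>

// Hypothetical helper: convert a raw 16-bit depth pixel to meters.
// In the sample, 'units' would be read once from the sensor, e.g.:
//   float units = sensor.get_option(RS2_OPTION_DEPTH_UNITS); // typically 0.001 (1mm)
inline float raw_depth_to_meters(uint16_t raw, float units)
{
    return raw * units;
}
```

Reading the scale once and multiplying per pixel avoids the per-call overhead of `get_distance`.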
@@ -163,7 +163,7 @@
```cpp
float dist_3d(const rs2_intrinsics& intr, const rs2::depth_frame& frame, pixel u, pixel v)
```
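The body of `dist_3d` is collapsed in this view; conceptually it deprojects each pixel to a 3D point (e.g. with `rs2_deproject_pixel_to_point`) and returns the Euclidean distance between the two points. A sketch of just that final step, with a hypothetical helper name:

```cpp
#include <cmath>

// Euclidean distance (in meters) between two 3D points, such as those
// produced by rs2_deproject_pixel_to_point() from two depth pixels and the
// stream intrinsics. The helper name is ours, not part of the sample.
inline float point_distance(const float a[3], const float b[3])
{
    float dx = a[0] - b[0];
    float dy = a[1] - b[1];
    float dz = a[2] - b[2];
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}
```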
### Running Processing on a Background Thread
Post-processing calculations in this example can be relatively slow. To avoid blocking the main (UI) thread, we run post-processing on a dedicated thread.
#### Video-Processing Thread
@@ -173,11 +173,11 @@
```cpp
while (alive)
{
    // Fetch frames from the pipeline and send them for processing
    rs2::frameset fs;
    if (pipe.poll_for_frames(&fs))
    {
        // Apply post processing
        // ...
        // Send resulting frames for visualization in the main thread
        postprocessed_frames.enqueue(data);
    }
```

@@ -219,7 +219,7 @@
```cpp
while(app) // Application still alive?
// Render the ruler
app_state.ruler_start.render(app);
app_state.ruler_end.render(app);

glDisable(GL_BLEND);
}
}
```

@@ -228,6 +228,5 @@
We use `glBlendFunc` to overlay aligned-color on top of depth using the color's alpha channel.

----------------------------------

This example demonstrates a short yet complex processing flow. Each thread runs at a somewhat different rate, and they all need to synchronize without blocking one another.
This is achieved using thread-safe `frame_queue`s as synchronization primitives and `rs2::frame` reference counting for object lifetime management across threads.
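The enqueue/poll pattern can be illustrated with a minimal generic stand-in for `rs2::frame_queue` (an analogue for explanation only, not the librealsense implementation):

```cpp
#include <mutex>
#include <queue>
#include <utility>

// Minimal thread-safe queue illustrating the pattern: the processing thread
// calls enqueue(), the main (UI) thread calls poll() without blocking.
template <typename T>
class safe_queue
{
    std::queue<T> _q;
    std::mutex _m;
public:
    void enqueue(T item)
    {
        std::lock_guard<std::mutex> lock(_m);
        _q.push(std::move(item));
    }
    // Non-blocking, in the spirit of rs2::frame_queue::poll_for_frame()
    bool poll(T& out)
    {
        std::lock_guard<std::mutex> lock(_m);
        if (_q.empty())
            return false;
        out = std::move(_q.front());
        _q.pop();
        return true;
    }
};
```

The real `rs2::frame_queue` adds what this sketch lacks: `rs2::frame` reference counting, which keeps a frame alive while any thread still holds it.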

2 changes: 1 addition & 1 deletion examples/readme.md
@@ -1,5 +1,4 @@


# Sample Code for Intel® RealSense™ cameras
**Code Examples to start prototyping quickly:** These simple examples demonstrate how to easily include SDK code snippets that access the camera into your applications.

@@ -20,6 +19,7 @@
For detailed explanations and API documentation see our [Documentation](../doc).
|[Pose](./pose)|C++|Demonstrates how to obtain data from pose frames| :star: |[![Motion Tracking - T260 and SLAM](https://img.shields.io/badge/-Tracking-0e2356.svg)](../doc/t265.md#examples-and-tools)|
|[ImShow](../wrappers/opencv/imshow) | C++ & [OpenCV](https://github.com/IntelRealSense/librealsense/tree/master/wrappers/opencv#getting-started) | Minimal OpenCV application for visualizing depth data | :star: | [![Depth Sensing - Structured Light, Stereo and L500](https://img.shields.io/badge/-Depth-5bc3ff.svg)](./depth.md)|
|[Multicam](./multicam)| C++ | Present multiple cameras depth streams simultaneously, in separate windows | :star: | [![Depth Sensing - Structured Light, Stereo and L500](https://img.shields.io/badge/-Depth-5bc3ff.svg)](./depth.md) [![Motion Tracking - T260 and SLAM](https://img.shields.io/badge/-Tracking-0e2356.svg)](../doc/t265.md#examples-and-tools) |
|[Face](../wrappers/dlib/face)| C++ & [Dlib](https://github.com/IntelRealSense/librealsense/tree/master/wrappers/dlib) | Facial recognition with simple anti-spoofing | :star: | [![Depth Sensing - Structured Light, Stereo and L500](https://img.shields.io/badge/-Depth-5bc3ff.svg)](./depth.md) |
|[Depth](./C/depth) | C | Demonstrates how to stream depth data and prints a simple text-based representation of the depth image | :star::star: | [![Depth Sensing - Structured Light, Stereo and L500](https://img.shields.io/badge/-Depth-5bc3ff.svg)](./depth.md)|
|[Spatial Alignment](./align)| C++ | Introduces the concept of spatial stream alignment, using depth-color mapping | :star::star: | [![Depth Sensing - Structured Light, Stereo and L500](https://img.shields.io/badge/-Depth-5bc3ff.svg)](./depth.md) |
|[Advanced Alignment](./align-advanced)| C++ | Show a simple method for dynamic background removal from video | :star::star: | [![Depth Sensing - Structured Light, Stereo and L500](https://img.shields.io/badge/-Depth-5bc3ff.svg)](./depth.md) |
4 changes: 4 additions & 0 deletions wrappers/CMakeLists.txt
@@ -17,6 +17,10 @@
if (BUILD_CV_EXAMPLES)
    add_subdirectory(opencv)
endif()

if (BUILD_DLIB_EXAMPLES)
    add_subdirectory(dlib)
endif()

if(BUILD_MATLAB_BINDINGS)
    add_subdirectory(matlab)
endif()
18 changes: 18 additions & 0 deletions wrappers/dlib/CMakeLists.txt
@@ -0,0 +1,18 @@
# minimum required cmake version: 3.1.0
cmake_minimum_required(VERSION 3.1.0)

project(RealsenseDlibSamples)

# Add DLIB_DIR as an option -- this needs to be configured in CMake
set(DLIB_DIR "" CACHE PATH "The path to the DLIB installation")

# Actually make it part of the compilation
add_subdirectory(${DLIB_DIR} build)

# Add it as the source of includes??
set(DEPENDENCIES realsense2 dlib::dlib)
include_directories( ../../include )
include_directories( ${DLIB_DIR} )

# List all the specific examples for dlib
add_subdirectory(face)
26 changes: 26 additions & 0 deletions wrappers/dlib/face/CMakeLists.txt
@@ -0,0 +1,26 @@
# minimum required cmake version: 3.1.0
cmake_minimum_required(VERSION 3.1.0)

project(RealSenseDlibFaceExample)

add_executable(rs-face-dlib
    rs-face-dlib.cpp
    ../rs_frame_image.h
    markup_68.h
    validate_face.h
    render_face.h
)
set_property(TARGET rs-face-dlib PROPERTY CXX_STANDARD 11)
target_link_libraries(rs-face-dlib ${DEPENDENCIES})
set_target_properties(rs-face-dlib PROPERTIES
    FOLDER "Examples/dlib"
)

install(
    TARGETS rs-face-dlib
    RUNTIME DESTINATION ${CMAKE_INSTALL_PREFIX}/bin
)
43 changes: 43 additions & 0 deletions wrappers/dlib/face/markup_68.h
@@ -0,0 +1,43 @@
// License: Apache 2.0. See LICENSE file in root directory.
// Copyright(c) 2019 Intel Corporation. All Rights Reserved.

#pragma once


/*
The 68-point annotations for the iBUG 300-W face landmark dataset.
See this picture:
https://ibug.doc.ic.ac.uk/media/uploads/images/annotpics/figure_68_markup.jpg
NOTE: the indexes in the picture are 1-based, so the actual C++ indexes are less 1.
NOTE: "Right" and "left" refer to the face being described, so are the mirror of the
side that an onlooker (from the front) would see.
*/
enum markup_68
{
    // Starting with right ear, the jaw [1-17]
    RIGHT_EAR, JAW_FROM = RIGHT_EAR, RIGHT_JAW_FROM = RIGHT_EAR,
    RIGHT_1, RIGHT_2, RIGHT_3, RIGHT_4, RIGHT_5, RIGHT_6, RIGHT_7, RIGHT_JAW_TO = RIGHT_7,
    CHIN, CHIN_FROM = CHIN - 1, CHIN_TO = CHIN + 1,
    LEFT_7 = CHIN + 1, LEFT_JAW_FROM = LEFT_7, LEFT_6, LEFT_5, LEFT_4, LEFT_3, LEFT_2, LEFT_1,
    LEFT_EAR, LEFT_JAW_TO = LEFT_EAR, JAW_TO = LEFT_EAR,

    // Eyebrows [18-22] and [23-27]
    RIGHT_EYEBROW_R, RIGHT_EYEBROW_FROM = RIGHT_EYEBROW_R, RIGHT_EYEBROW_1, RIGHT_EYEBROW_2, RIGHT_EYEBROW_3, RIGHT_EYEBROW_L, RIGHT_EYEBROW_TO = RIGHT_EYEBROW_L,
    LEFT_EYEBROW_R, LEFT_EYEBROW_FROM = LEFT_EYEBROW_R, LEFT_EYEBROW_1, LEFT_EYEBROW_2, LEFT_EYEBROW_3, LEFT_EYEBROW_L, LEFT_EYEBROW_TO = LEFT_EYEBROW_L,

    // Nose [28-36]
    NOSE_RIDGE_TOP, NOSE_RIDGE_FROM = NOSE_RIDGE_TOP, NOSE_RIDGE_1, NOSE_RIDGE_2, NOSE_TIP, NOSE_RIDGE_TO = NOSE_TIP,
    NOSE_BOTTOM_R, NOSE_BOTTOM_FROM = NOSE_BOTTOM_R, NOSE_BOTTOM_1, NOSE_BOTTOM_2, NOSE_BOTTOM_3, NOSE_BOTTOM_L, NOSE_BOTTOM_TO = NOSE_BOTTOM_L,

    // Eyes [37-42] and [43-48]
    RIGHT_EYE_R, RIGHT_EYE_FROM = RIGHT_EYE_R, RIGHT_EYE_1, RIGHT_EYE_2, RIGHT_EYE_L, RIGHT_EYE_4, RIGHT_EYE_5, RIGHT_EYE_TO = RIGHT_EYE_5,
    LEFT_EYE_R, LEFT_EYE_FROM = LEFT_EYE_R, LEFT_EYE_1, LEFT_EYE_2, LEFT_EYE_L, LEFT_EYE_4, LEFT_EYE_5, LEFT_EYE_TO = LEFT_EYE_5,

    // Mouth [49-68]
    MOUTH_R, MOUTH_OUTER_R = MOUTH_R, MOUTH_OUTER_FROM = MOUTH_OUTER_R, MOUTH_OUTER_1, MOUTH_OUTER_2, MOUTH_OUTER_TOP, MOUTH_OUTER_4, MOUTH_OUTER_5,
    MOUTH_L, MOUTH_OUTER_L = MOUTH_L, MOUTH_OUTER_7, MOUTH_OUTER_8, MOUTH_OUTER_BOTTOM, MOUTH_OUTER_10, MOUTH_OUTER_11, MOUTH_OUTER_TO = MOUTH_OUTER_11,
    MOUTH_INNER_R, MOUTH_INNER_FROM = MOUTH_INNER_R, MOUTH_INNER_1, MOUTH_INNER_TOP, MOUTH_INNER_3,
    MOUTH_INNER_L, MOUTH_INNER_5, MOUTH_INNER_BOTTOM, MOUTH_INNER_7, MOUTH_INNER_TO = MOUTH_INNER_7,

    N_POINTS
};
99 changes: 99 additions & 0 deletions wrappers/dlib/face/readme.md
@@ -0,0 +1,99 @@
# rs-face-dlib Sample

## Overview
This example demonstrates a very simple facial landmark detection using
[Dlib](http://dlib.net/)'s machine learning algorithms, using depth data to
implement basic anti-spoofing.

> **Note:** This is just an example intended to showcase possible applications.
> The heuristics used here are very simple and basic, and can be greatly
> improved on.

<p align="center"><img src="rs-face-dlib.jpg" alt="screenshot"/></p>

Faces detected by the camera will show landmarks in either green or red:

* Green landmarks indicate a "real" face with depth data corroborating
expectations. E.g., the distance to the eyes should be greater than to the tip
of the nose, etc.

* Red landmarks indicate a "fake" face. E.g., a *picture* of a face, where the
distance to each facial feature does not meet expectations.

> Note: faces should be forward-facing to be detectable.

## Implementation

To enable usage of librealsense frame data as a dlib image, a `rs_frame_image`
class is introduced. No copying of frame data takes place.

```cpp
rs_frame_image< dlib::rgb_pixel, RS2_FORMAT_RGB8 > image( color_frame );
```
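The zero-copy idea behind `rs_frame_image` can be sketched as a non-owning view over the frame's pixel buffer (a simplified stand-in, not the actual class):

```cpp
#include <cstddef>

// Simplified stand-in for rs_frame_image: a non-owning view over an existing
// pixel buffer. Constructing it copies no pixels -- only a pointer and shape.
template <typename pixel>
class frame_view
{
    pixel const * _data;   // points into the frame's own memory
    size_t _rows, _cols;
public:
    frame_view(pixel const * data, size_t rows, size_t cols)
        : _data(data), _rows(rows), _cols(cols) {}

    size_t nr() const { return _rows; }  // dlib-style accessors
    size_t nc() const { return _cols; }
    pixel const & operator()(size_t r, size_t c) const
    {
        return _data[r * _cols + c];
    }
};
```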
Faces are detected in two steps:
1. Facial boundary rectangles are detected:
```cpp
dlib::frontal_face_detector face_bbox_detector = dlib::get_frontal_face_detector();
...
std::vector< dlib::rectangle > face_bboxes = face_bbox_detector( image );
```

2. Each one is then annotated to find its landmarks:
```cpp
dlib::shape_predictor face_landmark_annotator;
dlib::deserialize( "shape_predictor_68_face_landmarks.dat" ) >> face_landmark_annotator;
...
std::vector< dlib::full_object_detection > faces;
for( auto const & bbox : face_bboxes )
    faces.push_back( face_landmark_annotator( image, bbox ));
```
> **Note:** A dataset (a trained model file) is required to run this sample.
> You can use a dataset of your own choosing, or download the
> [68-point trained model file from dlib's site](http://dlib.net/files/shape_predictor_68_face_landmarks.dat.bz2).

The landmarks calculated are 68-point 2D coordinates in the color frame. See
[this picture](https://ibug.doc.ic.ac.uk/media/uploads/images/annotpics/figure_68_markup.jpg)
for (1-based) indexes for each of the points.
Once available, landmarks are used to get average eye, nose, ear, mouth, and
chin depths. The distances are compared with expected relationships:
```cpp
if( nose_depth >= eye_depth )
    return false;
if( eye_depth - nose_depth > 0.07f )
    return false;
if( ear_depth <= eye_depth )
    return false;
if( mouth_depth <= nose_depth )
    return false;
if( mouth_depth > chin_depth )
    return false;
// All the distances, collectively, should not span a range that makes no sense. I.e.,
// if the face accounts for more than 20cm of depth, or less than 2cm, then something's
// not kosher!
float x = std::max( { nose_depth, eye_depth, ear_depth, mouth_depth, chin_depth } );
float n = std::min( { nose_depth, eye_depth, ear_depth, mouth_depth, chin_depth } );
if( x - n > 0.20f )
    return false;
if( x - n < 0.02f )
    return false;
```
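Collected into a single predicate, the checks above might look like this (the function name and parameter order are ours; the thresholds are the sample's):

```cpp
#include <algorithm>

// The anti-spoofing checks above, restated as one predicate.
// All arguments are average depths in meters.
inline bool depths_look_like_a_face( float nose, float eyes, float ears,
                                     float mouth, float chin )
{
    if( nose >= eyes )
        return false;   // the nose should be closest to the camera...
    if( eyes - nose > 0.07f )
        return false;   // ...but not by an absurd margin
    if( ears <= eyes )
        return false;   // ears should be farther than the eyes
    if( mouth <= nose )
        return false;   // mouth farther than the nose
    if( mouth > chin )
        return false;   // but not farther than the chin
    float x = std::max( { nose, eyes, ears, mouth, chin } );
    float n = std::min( { nose, eyes, ears, mouth, chin } );
    return x - n <= 0.20f && x - n >= 0.02f;  // overall span sanity check
}
```

A flat photograph fails immediately: every landmark sits at roughly the same depth, so `nose >= eyes` (and the 2cm minimum span) reject it.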

Because depth data is used to correspond with landmark pixels from the
color image, it is important that the depth and color frames from the camera
are aligned:

```cpp
rs2::align align_to_color( RS2_STREAM_COLOR );
...
rs2::frameset data = pipe.wait_for_frames();
data = align_to_color.process( data ); // Replace with aligned frames
```
The depth data is accessed directly, for speed, instead of relying on
`depth_frame.get_distance()`, which may incur overhead.
60 changes: 60 additions & 0 deletions wrappers/dlib/face/render_face.h
@@ -0,0 +1,60 @@
// License: Apache 2.0. See LICENSE file in root directory.
// Copyright(c) 2019 Intel Corporation. All Rights Reserved.

#pragma once

#include <dlib/image_processing/full_object_detection.h>
#include <dlib/gui_widgets.h>
#include <vector>
#include "markup_68.h"


/*
Return lines rendering facial landmarks for a 68-point detection, that can
easily be used with dlib::image_window::add_overlay().
Based on dlib::render_face_detections(), made a little simpler.
*/
std::vector< dlib::image_window::overlay_line > render_face(
    dlib::full_object_detection const & face,
    dlib::rgb_pixel const & color
)
{
    typedef dlib::image_window::overlay_line overlay_line;
    std::vector< overlay_line > lines;

    // Around Chin. Ear to Ear
    for( unsigned long i = markup_68::JAW_FROM; i < markup_68::JAW_TO; ++i )
        lines.push_back( overlay_line( face.part( i ), face.part( i + 1 ), color ) );

    // Nose
    for( unsigned long i = markup_68::NOSE_RIDGE_FROM; i < markup_68::NOSE_RIDGE_TO; ++i )
        lines.push_back( overlay_line( face.part( i ), face.part( i + 1 ), color ) );
    lines.push_back( overlay_line( face.part( markup_68::NOSE_TIP ), face.part( markup_68::NOSE_BOTTOM_FROM ), color ) );
    lines.push_back( overlay_line( face.part( markup_68::NOSE_TIP ), face.part( markup_68::NOSE_BOTTOM_TO ), color ) );

    // Left eyebrow
    for( unsigned long i = markup_68::RIGHT_EYEBROW_FROM; i < markup_68::RIGHT_EYEBROW_TO; ++i )
        lines.push_back( overlay_line( face.part( i ), face.part( i + 1 ), color ) );

    // Right eyebrow
    for( unsigned long i = markup_68::LEFT_EYEBROW_FROM; i < markup_68::LEFT_EYEBROW_TO; ++i )
        lines.push_back( overlay_line( face.part( i ), face.part( i + 1 ), color ) );

    // Right eye
    for( unsigned long i = markup_68::RIGHT_EYE_FROM; i < markup_68::RIGHT_EYE_TO; ++i )
        lines.push_back( overlay_line( face.part( i ), face.part( i + 1 ), color ) );
    lines.push_back( overlay_line( face.part( markup_68::RIGHT_EYE_FROM ), face.part( markup_68::RIGHT_EYE_TO ), color ) );

    // Left eye
    for( unsigned long i = markup_68::LEFT_EYE_FROM; i < markup_68::LEFT_EYE_TO; ++i )
        lines.push_back( overlay_line( face.part( i ), face.part( i + 1 ), color ) );
    lines.push_back( overlay_line( face.part( markup_68::LEFT_EYE_FROM ), face.part( markup_68::LEFT_EYE_TO ), color ) );

    // Lips inside part
    for( unsigned long i = markup_68::MOUTH_INNER_FROM; i < markup_68::MOUTH_INNER_TO; ++i )
        lines.push_back( overlay_line( face.part( i ), face.part( i + 1 ), color ) );
    lines.push_back( overlay_line( face.part( markup_68::MOUTH_INNER_FROM ), face.part( markup_68::MOUTH_INNER_TO ), color ) );

    return lines;
}