Computer Vision (CV)

Let us start with a simple definition: object detection means predicting the location of an object along with its class. Instead of predicting only the class of the object in an image, we now have to predict the class as well as a rectangle (called a bounding box) containing that object. It takes 4 variables to uniquely identify a rectangle, for example the coordinates of its top-left corner plus its width and height. A classic way to model object detection is as a classification problem: we take windows of a fixed size from the input image at all possible locations and feed each patch to an image classifier, which predicts the class of the object in the window (or background if none is present). There are various methods for object detection, such as RCNN, Faster-RCNN, SSD and YOLO.
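The sliding-window idea can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names, the window size and the stride are hypothetical, each box is written as (x, y, w, h), and the "classifier" is a toy stand-in that looks at the box position instead of real pixels.

```python
def sliding_windows(image_width, image_height, window_size, stride):
    """Yield (x, y, w, h) boxes of a fixed square window over the image."""
    for y in range(0, image_height - window_size + 1, stride):
        for x in range(0, image_width - window_size + 1, stride):
            yield (x, y, window_size, window_size)


def detect(image_width, image_height, classify, window_size=64, stride=32):
    """Run a patch classifier on every window and keep non-background hits."""
    detections = []
    for box in sliding_windows(image_width, image_height, window_size, stride):
        label = classify(box)  # hypothetical classifier: box -> class name
        if label != "background":
            detections.append((label, box))
    return detections


# Toy classifier (an assumption for illustration): it reports a "face" only
# for the window whose top-left corner is at (32, 32).
def toy_classifier(box):
    return "face" if box[:2] == (32, 32) else "background"


boxes = list(sliding_windows(128, 128, window_size=64, stride=32))
hits = detect(128, 128, toy_classifier)
```

Even on this tiny 128×128 example the classifier is called once per window, which hints at why exhaustive sliding windows become expensive and why detectors such as SSD share computation across locations.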

Module 1: Face detection with OpenCV

(a) - The Viola-Jones algorithm

This is one of the most influential computer vision algorithms to date, developed by P. Viola and M. Jones. It underlies the Haar cascade classifiers in the OpenCV library. To check the installed OpenCV version:

pkg-config --modversion opencv

3.2.0

If it is not found, try sudo apt-get install libopencv-dev. This virtual environment uses Python 3.6. The versions of all libraries and dependencies can be found in environment.yml. The main libraries were installed with:

$ pip install torchvision==0.1.6
$ pip3 install torch==0.3.1
$ conda install -c menpo opencv3
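The Viola-Jones detector evaluates simple rectangle (Haar-like) features very quickly by means of an integral image, in which any rectangle sum costs four lookups. The following is a minimal pure-Python sketch of that one ingredient, not the OpenCV implementation; the function names, the toy image and the sign convention of the feature are assumptions made for illustration.

```python
def integral_image(img):
    """Return the summed-area table with an extra zero row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the rectangle with top-left (x, y), using 4 lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Haar-like edge feature: left half minus right half of the window."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)


# Toy 4x4 "image" (an assumption): bright left half, dark right half.
image = [
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
]
ii = integral_image(image)
total = rect_sum(ii, 0, 0, 4, 4)
edge = two_rect_feature(ii, 0, 0, 4, 4)
```

A large feature value signals a strong vertical edge inside the window. The full algorithm combines many such features with boosting and a cascade of classifiers, which this sketch does not cover.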

(b) - Emotion detection

Many companies today use CV in their core business to detect emotions. For example, Apple bought Emotient, a startup that builds CV tools to recognize people's feelings. Building an AI that recognizes human emotions can be highly valuable in some markets, such as recommender systems or self-driving cars. Here is an example that detects one emotion: Happiness :)

Additional reading:

Paul Viola & Michael Jones, 2001 Rapid Object Detection using a Boosted Cascade of Simple Features
Kinh Tieu & Paul Viola, 2000 Boosting Image Retrieval

Module 2: Object Detection With SSD

Single Shot Detector (SSD):

The Single Shot Detector achieves a good balance between speed and accuracy. SSD runs a convolutional network on the input image only once and computes a feature map. A small 3×3 convolutional kernel is then run over this feature map to predict the bounding boxes and the classification probabilities. Like Faster-RCNN, SSD uses anchor boxes at various aspect ratios and learns offsets relative to those anchors rather than learning the boxes directly. To handle scale, SSD predicts bounding boxes after multiple convolutional layers. Since each of these layers operates at a different scale, the network is able to detect objects of various sizes.
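As a rough illustration of anchor boxes and learned offsets, the sketch below builds anchors of several aspect ratios for one feature-map cell and decodes predicted offsets back into a box. It follows the centre/size parameterisation described in the SSD paper, but the function names and the numbers are assumptions, and the variance scaling that many implementations apply to the offsets is left out.

```python
import math


def anchors_for_cell(cx, cy, scale, aspect_ratios):
    """Anchor boxes (cx, cy, w, h) of one scale and several aspect ratios."""
    return [
        (cx, cy, scale * math.sqrt(r), scale / math.sqrt(r))
        for r in aspect_ratios
    ]


def decode_box(anchor, offsets):
    """Turn predicted offsets into a box, both in (cx, cy, w, h) form."""
    a_cx, a_cy, a_w, a_h = anchor
    t_cx, t_cy, t_w, t_h = offsets
    cx = a_cx + t_cx * a_w      # centre shift is relative to the anchor size
    cy = a_cy + t_cy * a_h
    w = a_w * math.exp(t_w)     # size change is predicted in log space
    h = a_h * math.exp(t_h)
    return (cx, cy, w, h)


# Hypothetical cell centre and scale, in coordinates relative to the image.
anchors = anchors_for_cell(0.5, 0.5, scale=0.2, aspect_ratios=[1.0, 2.0, 0.5])

# Zero offsets reproduce the anchor; non-zero offsets move and resize it.
unchanged = decode_box(anchors[0], (0.0, 0.0, 0.0, 0.0))
shifted = decode_box(anchors[0], (0.5, 0.0, math.log(2.0), 0.0))
```

Predicting offsets keeps the regression targets small and centred around zero, since each anchor only has to be nudged towards a nearby object instead of describing the box from scratch.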

That’s a lot of algorithms. Which one should you use? Currently, Faster-RCNN is the choice if you are fanatical about accuracy numbers. However, if you are strapped for computation (for example, running on an Nvidia Jetson), SSD is a better recommendation. Finally, if accuracy is not too much of a concern but you want to go super fast, YOLO is the way to go. First of all, a visual understanding of the speed vs. accuracy trade-off:

Install the imageio library:

pip install imageio
pip install imageio-ffmpeg
2.9.0

The dataset used for pre-training is available at: VOC Dataset, PASCAL Visual Object Classes

Reference: Wei Liu et al., 2015 SSD: Single Shot MultiBox Detector

Module 3: Image Creation with DCGANs

A Generative Adversarial Network (GAN) can generate images from a learned latent space. A GAN is one of the simplest neural-network-based models that implements adversarial learning, and was initially conceived in a bar in Montreal by Ian Goodfellow and collaborators (Goodfellow, I., et al. (2014)). It is based on a min-max optimization problem in which a generator and a discriminator are trained against each other. Here is an example of a deep convolutional GAN: https://towardsdatascience.com/understanding-generative-adversarial-networks-gans-cd6e4651a29
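To make the min-max problem a little more concrete, the sketch below estimates the value function from Goodfellow et al. (2014), V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))], from a handful of discriminator outputs. The discriminator tries to maximize this value while the generator tries to minimize it. The function name and the sample probabilities are assumptions for illustration; no network is trained here.

```python
import math


def gan_value(d_real, d_fake):
    """Sample estimate of V(D, G) from discriminator outputs in (0, 1).

    d_real: D(x) for real images, d_fake: D(G(z)) for generated images.
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


# A discriminator that cannot tell real from fake outputs 0.5 everywhere.
confused = gan_value([0.5, 0.5], [0.5, 0.5])

# A confident, correct discriminator pushes the value towards 0 from below.
confident = gan_value([0.99, 0.99], [0.01, 0.01])
```

The confused case equals -2·log 2, which is the value the paper derives for the point where the generated distribution matches the data distribution.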

GANs can be used for:

generating images
image modification
super resolution
assisting artists
speech generation
face ageing

Additional reading:

Chanchana Sornsoontorn, 2017 How do GANs intuitively work?
Ian Goodfellow et al., 2014 Generative Adversarial Nets
Matthew D. Zeiler et al., 2011 Adaptive Deconvolutional Networks for Mid and High Level Feature Learning

Foroozani/ComputerVision

Folders and files

Latest commit

History

Repository files navigation

Computer Vision (CV)

Module 1: Face detection with OpenCV

(a) - The Viola-Jones algorithm

(b) - Emotion detection

Module 2: Object Detection With SSD

Single Shot Detector(SSD):

Module 3: Image Creation with DCGANs

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages