____
| _ \ __ _ _ __ ___ _ __ __ _ _ __ ___ __ _
| |_) / _` | '_ \ / _ \| '__/ _` | '_ ` _ \ / _` |
| __/ (_| | | | | (_) | | | (_| | | | | | | (_| |
|_| \__,_|_| |_|\___/|_| \__,_|_| |_| |_|\__,_|
Panoramic image stitching of overlapping images using the SIFT detector, homography, the RANSAC algorithm, and weighted blending.
- Clone the repository:
  git clone https://github.com/stanleyedward/panorama-image-stitching.git
  cd panorama-image-stitching
- Create and activate the conda environment:
  conda env create -f environment.yml
  conda activate panorama
- Move your images into inputs/:
  mv left.jpg middle.jpg right.jpg inputs/
  Don't have any images? Try the preloaded ones located in inputs/.
- Run the stitcher on your images:
  python panorama.py inputs/front/front_01.jpeg inputs/front/front_02.jpeg inputs/front/front_03.jpeg
  Caution: The sequence of images should be ordered left to right from the viewing point.
- If your results are unsatisfactory, consider increasing or decreasing the smoothing_window_percent value in line 13 of image_stitching/image_stitching.py.
- The output should be exported at outputs/panorama_image.jpg.
This is the output of the following command:
python panorama.py inputs/back/back_01.jpeg inputs/back/back_02.jpeg inputs/back/back_03.jpeg
____
| _ \ __ _ _ __ ___ _ __ __ _ _ __ ___ __ _
| |_) / _` | '_ \ / _ \| '__/ _` | '_ ` _ \ / _` |
| __/ (_| | | | | (_) | | | (_| | | | | | | (_| |
|_| \__,_|_| |_|\___/|_| \__,_|_| |_| |_|\__,_|
Initializing...
Panoramic image saved at: outputs/panorama_image.jpg
- Distinctive Image Features from Scale-Invariant Keypoints (SIFT)
- https://github.com/Yunyung/Automatic-Panoramic-Image-Stitching
The scale-invariant feature transform (SIFT) is a computer vision algorithm that detects interest points and describes and matches local features in images. [David Lowe 1999]
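The image is convolved with a series of Gaussian filters at different scales to create a scale-space representation. Local extrema in this scale space are identified as potential keypoints. The scale space of an image is defined as a function,

$$L(x, y, \sigma) = G(x, y, \sigma) * I(x, y),$$

where $*$ is the convolution operation in $x$ and $y$, $I(x, y)$ is the input image, and

$$G(x, y, \sigma) = \frac{1}{2\pi\sigma^2} e^{-(x^2 + y^2) / 2\sigma^2}.$$

To efficiently detect stable keypoint locations in scale space, SIFT uses scale-space extrema in the difference-of-Gaussian function convolved with the image, computed from two nearby scales separated by a constant factor $k$:

$$D(x, y, \sigma) = (G(x, y, k\sigma) - G(x, y, \sigma)) * I(x, y) = L(x, y, k\sigma) - L(x, y, \sigma).$$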
In addition, the difference-of-Gaussian function provides a close approximation to the scale-normalized Laplacian of Gaussian, $\sigma^2 \nabla^2 G$, as studied by Lindeberg (1994), and therefore, for two nearby scales $\sigma$ and $k\sigma$,

$$G(x, y, k\sigma) - G(x, y, \sigma) \approx (k - 1)\sigma^2 \nabla^2 G.$$

The Laplacian of Gaussian is used for feature detection because it highlights regions of rapid intensity change in an image; it is often applied to identify keypoints or interest points.
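The scale-space and difference-of-Gaussian definitions can be sketched in a few lines of NumPy. This is a minimal illustration, not the code this repository runs: the function names are hypothetical, the blur is a plain separable convolution with edge replication, and the defaults (`sigma=1.6`, `k=sqrt(2)`, five scales) are assumptions chosen for the example.

```python
import numpy as np

def gaussian_kernel_1d(sigma):
    """Sampled, normalized 1D Gaussian truncated at about 3 sigma."""
    radius = max(1, int(np.ceil(3 * sigma)))
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return kernel / kernel.sum()

def gaussian_blur(image, sigma):
    """L(x, y, sigma): separable Gaussian blur with edge replication."""
    kernel = gaussian_kernel_1d(sigma)
    radius = len(kernel) // 2
    padded = np.pad(image.astype(np.float64), radius, mode="edge")
    # Convolve along rows first, then along columns.
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)

def difference_of_gaussians(image, sigma=1.6, k=2 ** 0.5, num_scales=5):
    """Stack of D = L(k * s) - L(s) for consecutive scales s = sigma * k**i."""
    sigmas = [sigma * k ** i for i in range(num_scales)]
    blurred = [gaussian_blur(image, s) for s in sigmas]
    return np.stack([blurred[i + 1] - blurred[i]
                     for i in range(num_scales - 1)])

# A single bright pixel: the DoG response at its location is negative,
# because the more strongly blurred image has the lower peak.
image = np.zeros((32, 32))
image[16, 16] = 1.0
dog = difference_of_gaussians(image)
print(dog.shape, dog[0, 16, 16])
```

Candidate keypoints would then be the pixels that are larger or smaller than all of their neighbours in the current and adjacent DoG layers; that search is left out here to keep the sketch short.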
Fit a 3D quadratic function to the nearby DoG extrema to achieve subpixel precision. Eliminate low-contrast keypoints and poorly localized keypoints along edges.
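The refinement and rejection steps above can be written out as follows. The formulas and the threshold values follow Lowe's paper; whether this repository applies the same values is an assumption. To avoid a clash with the homography matrix $H$ used later, the 2x2 Hessian of $D$ is written $\mathcal{H}$ here.

```latex
% Taylor expansion of D around a candidate extremum, with offset x = (x, y, sigma)^T
D(\mathbf{x}) = D + \frac{\partial D}{\partial \mathbf{x}}^{T} \mathbf{x}
              + \frac{1}{2}\, \mathbf{x}^{T} \frac{\partial^{2} D}{\partial \mathbf{x}^{2}}\, \mathbf{x}

% Subpixel location of the extremum
\hat{\mathbf{x}} = -\left( \frac{\partial^{2} D}{\partial \mathbf{x}^{2}} \right)^{-1}
                   \frac{\partial D}{\partial \mathbf{x}}

% Contrast at the refined location; Lowe discards keypoints with |D(\hat{x})| < 0.03
% (pixel values scaled to [0, 1])
D(\hat{\mathbf{x}}) = D + \frac{1}{2} \frac{\partial D}{\partial \mathbf{x}}^{T} \hat{\mathbf{x}}

% Edge test on the 2x2 Hessian; Lowe keeps the keypoint only if this holds, with r = 10
\frac{\operatorname{Tr}(\mathcal{H})^{2}}{\operatorname{Det}(\mathcal{H})} < \frac{(r + 1)^{2}}{r},
\qquad
\mathcal{H} = \begin{bmatrix} D_{xx} & D_{xy} \\ D_{xy} & D_{yy} \end{bmatrix}
```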
Compute gradient magnitude and orientation around each keypoint. Construct histograms to determine the dominant orientation. Keypoints are assigned orientations based on the histogram peaks.
By analyzing the gradient orientation around a keypoint, SIFT ensures that the descriptor is invariant to rotation. The gradient information is used to construct a descriptor that captures the local structure around the keypoint.
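A rough sketch of the orientation assignment described above, using NumPy only. The function name is hypothetical, and the sketch simplifies the full method: it uses 36 bins of 10 degrees as in Lowe's paper, but skips the Gaussian weighting of the window and the handling of secondary peaks.

```python
import numpy as np

def dominant_orientation(patch, num_bins=36):
    """Return (dominant angle in degrees, histogram) for a 2D patch."""
    patch = patch.astype(np.float64)
    # np.gradient returns the derivative along rows (y) first, then columns (x).
    dy, dx = np.gradient(patch)
    magnitude = np.hypot(dx, dy)
    orientation = np.degrees(np.arctan2(dy, dx)) % 360.0
    # Each pixel votes for its orientation bin, weighted by gradient magnitude.
    hist, edges = np.histogram(
        orientation, bins=num_bins, range=(0.0, 360.0), weights=magnitude)
    peak = int(np.argmax(hist))
    return 0.5 * (edges[peak] + edges[peak + 1]), hist

# Intensity increases equally along x and y, so the gradient points at 45 degrees.
ys, xs = np.mgrid[0:16, 0:16]
angle, hist = dominant_orientation(xs + ys)
print(angle)
```

Rotating the descriptor's sampling grid by this angle before building the descriptor is what makes the result independent of the image's rotation.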
In feature matching, the primary objective is to establish correspondences between keypoints detected in different images. This process is fundamental in tasks such as image stitching, object recognition, and 3D reconstruction.
Keypoints are characterized by descriptors, which are feature vectors representing the local image information around each keypoint. A common approach is to use a distance metric, such as Euclidean distance, to measure the similarity or dissimilarity between the descriptors of two keypoints. Smaller distances indicate higher similarity.
The L2 norm is calculated using the Euclidean distance formula, which is the square root of the sum of squared differences between corresponding elements of two vectors.
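Let's denote the descriptor vectors of two keypoints as $\mathbf{d}_1$ and $\mathbf{d}_2$, each with $n$ elements. The L2 distance between them is:

$$\lVert \mathbf{d}_1 - \mathbf{d}_2 \rVert_2 = \sqrt{\sum_{i=1}^{n} (d_{1,i} - d_{2,i})^2}$$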
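As an illustration of distance-based matching, the sketch below compares every descriptor in one image with every descriptor in another and keeps the nearest neighbour. The function name is hypothetical. The ratio test (accept a match only if the best distance is clearly smaller than the second best) is an addition not described above, and the `ratio=0.75` default is an assumption.

```python
import numpy as np

def match_descriptors(desc_a, desc_b, ratio=0.75):
    """Match rows of desc_a to rows of desc_b (desc_b needs >= 2 rows).

    Returns a list of (index_a, index_b, distance) tuples.
    """
    # Pairwise L2 distances, shape (len(desc_a), len(desc_b)).
    diff = desc_a[:, None, :] - desc_b[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    order = np.argsort(dist, axis=1)
    matches = []
    for i in range(dist.shape[0]):
        best, second = order[i, 0], order[i, 1]
        # Keep the match only if it is unambiguous.
        if dist[i, best] < ratio * dist[i, second]:
            matches.append((i, int(best), float(dist[i, best])))
    return matches

desc_a = np.array([[0.0, 0.0], [10.0, 10.0]])
desc_b = np.array([[10.0, 10.5], [0.0, 0.1], [50.0, 50.0]])
for i, j, d in match_descriptors(desc_a, desc_b):
    print(f"a[{i}] <-> b[{j}]  distance {d:.2f}")
```

Real SIFT descriptors have 128 elements rather than 2; the computation is the same.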
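A homography matrix, often denoted as $H$, is a projective transformation that maps points in one image to the corresponding points in another image.
The homography matrix is a 3x3 matrix and can be represented as:

$$H = \begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{bmatrix}$$

Homography Transformation Equation:

$$\begin{bmatrix} x' \\ y' \\ w' \end{bmatrix} = H \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$

where $(x, y)$ is a point in the source image and $(x'/w', y'/w')$ is the corresponding point in the destination image.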
- It helps align and transform the images correctly, ensuring that corresponding points in different images are mapped to the same coordinates in the final panorama.
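To make the transformation concrete, here is a small sketch that maps points through a given 3x3 homography by converting them to homogeneous coordinates and dividing by the third component afterwards. The function name and the example matrix are made up for illustration.

```python
import numpy as np

def apply_homography(H, points):
    """Map an (N, 2) array of (x, y) points through a 3x3 homography H."""
    points = np.asarray(points, dtype=np.float64)
    # Append a 1 to each point: (x, y) -> (x, y, 1).
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    mapped = homogeneous @ H.T
    # Divide by the third coordinate to get back to pixel coordinates.
    return mapped[:, :2] / mapped[:, 2:3]

# A pure translation by (+5, -2) expressed as a homography.
H = np.array([[1.0, 0.0, 5.0],
              [0.0, 1.0, -2.0],
              [0.0, 0.0, 1.0]])
print(apply_homography(H, [[1.0, 1.0], [10.0, 20.0]]))
```

Because of the final division, $H$ is only defined up to scale: multiplying every entry by the same non-zero constant gives the same mapping.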
How exactly do you find the values for your Homography Matrix? RANSAC comes to the rescue! RANSAC (Random Sample Consensus) is an iterative algorithm commonly used in computer vision to estimate a model's parameters from a set of data points containing outliers. In the context of estimating a homography matrix, RANSAC is often used when dealing with correspondences between points in two images that may include outliers or mismatches.
The steps of the RANSAC algorithm are as follows:
1. Randomly sample the number of point correspondences required to fit the model (homography); for our purpose the number is 4.
2. Fit the model to the randomly chosen samples.
3. Count the number M of data points (inliers) that fit the model within an error threshold E, i.e. the acceptable alignment error in pixels.
4. Repeat steps 1-3 N times.
5. Choose the model that has the largest number M of inliers.
Note:
As a rule of thumb, fewer than 50% of the matches should be outliers for RANSAC to work reliably.
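The steps above can be sketched as follows. This is an illustrative NumPy version under stated assumptions, not the repository's implementation: the function names are hypothetical, the homography is fitted with a basic direct linear transform (DLT) via SVD, and the defaults (`num_iters=500`, `threshold=3.0` pixels) are arbitrary. The final refit on all inliers is an extra step that is not in the list.

```python
import numpy as np

def fit_homography(src, dst):
    """DLT: H such that dst ~ H @ src, from >= 4 point pairs."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=np.float64))
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2] if abs(H[2, 2]) > 1e-12 else H

def project(H, pts):
    """Apply H to (N, 2) points and divide by the homogeneous coordinate."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = pts_h @ H.T
    with np.errstate(divide="ignore", invalid="ignore"):
        return mapped[:, :2] / mapped[:, 2:3]

def ransac_homography(src, dst, num_iters=500, threshold=3.0, seed=0):
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(num_iters):                                 # step 4
        sample = rng.choice(len(src), size=4, replace=False)   # step 1
        H = fit_homography(src[sample], dst[sample])           # step 2
        error = np.linalg.norm(project(H, src) - dst, axis=1)
        inliers = error < threshold                            # step 3
        if inliers.sum() > best_inliers.sum():                 # step 5
            best_inliers = inliers
    if best_inliers.sum() < 4:
        return None, best_inliers
    # Extra step, not in the list above: refit on all inliers of the best model.
    return fit_homography(src[best_inliers], dst[best_inliers]), best_inliers

# Synthetic matches: 22 follow a known homography, the last 8 are outliers.
rng = np.random.default_rng(1)
H_true = np.array([[1.0, 0.02, 30.0], [-0.01, 1.0, 5.0], [1e-4, 0.0, 1.0]])
src = rng.uniform(0, 200, size=(30, 2))
dst = project(H_true, src)
dst[22:] += rng.uniform(40, 80, size=(8, 2))
H_est, inliers = ransac_homography(src, dst)
print(int(inliers.sum()), "inliers out of", len(src))
```

The number of iterations N needed grows quickly with the outlier ratio, since every one of the 4 sampled matches has to be an inlier for a sample to produce a good model.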
Hard seams may arise due to vignetting, exposure differences, and illumination differences. Simply averaging the images doesn't solve the issue, and seams might still be visible. This is where weighted blending comes in.
Weighted blending using masks blends images based on per-pixel weights assigned via masks. Masks are single-channel images where each pixel value determines the contribution or weight of the corresponding image pixel in the blending process.
The mask should have the same dimensions as the image, and each pixel in the mask is assigned a weight value between 0 and 1. The masks are then normalized so that the weights at each pixel sum to 1, which maintains the overall color intensity and brightness during blending.
The purpose of weighted blending is to create a smooth, controlled transition in the overlapping regions of the images by taking into account the relative contribution of each pixel.
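A minimal sketch of mask-based blending for two already aligned images of the same size. The function names and the linear ramp are assumptions for illustration. The `window` parameter (half-width of the transition, in pixels) plays a role similar to a smoothing window, but how the repository derives its window from `smoothing_window_percent` is not shown here.

```python
import numpy as np

def linear_masks(width, height, seam_x, window):
    """Left/right masks that ramp linearly across 2 * window pixels at seam_x."""
    x = np.arange(width)
    # 1 left of the window, 0 right of it, linear in between.
    ramp = np.clip((seam_x + window - x) / (2.0 * window), 0.0, 1.0)
    left = np.tile(ramp, (height, 1))
    return left, 1.0 - left

def blend(left_img, right_img, left_mask, right_mask):
    """Weighted sum of two (H, W, 3) images using normalized (H, W) masks."""
    total = left_mask + right_mask
    total[total == 0] = 1.0  # avoid dividing by zero where neither image contributes
    w_left = (left_mask / total)[..., None]
    w_right = (right_mask / total)[..., None]
    return w_left * left_img + w_right * right_img

# Two flat images with different brightness, blended around column 5.
left_img = np.full((4, 10, 3), 100.0)
right_img = np.full((4, 10, 3), 200.0)
left_mask, right_mask = linear_masks(width=10, height=4, seam_x=5, window=2)
panorama = blend(left_img, right_img, left_mask, right_mask)
print(panorama[0, :, 0])
```

A wider window spreads the brightness change over more columns, which hides exposure differences better but can cause ghosting if the images are not perfectly aligned; a narrower window does the opposite.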