This repository serves as a record of my academic journey in ENPM673 during the Spring of 2023. It includes my solutions and code submissions for all projects. Each project has its dedicated folder with accompanying documentation and resources.
The Perception course covers classical computer vision principles and fundamental deep learning techniques, with an emphasis on perception for autonomous systems such as robots, self-driving cars, and smart cameras. Hands-on projects target practical applications such as lane detection and constructing 3D models from 2D images, blending theoretical knowledge with practical skills.
The course covered the following key areas:
- Curve Fitting and Trend Analysis: Identifying the optimal trend line for a set of data points through curve fitting.
- Image Feature Recognition: Recognizing key features in images, including corners, edges, and straight lines.
- 3D Object Estimation: Estimating 3D information about objects from their 2D images.
- Object Motion Metrics: Calculating motion metrics for objects, such as speed and direction, from camera feeds.
- Camera Pose Estimation: Estimating the camera pose for spatial understanding.
- Basic Image-based Machine Learning: Applying fundamental machine learning techniques to image-related tasks.
The course structure includes four distinct projects, each outlined below.
- Click here to access the ENPM-673 Final Project.
Project 1: Object Tracking and Covariance Matrix, LS, TLS, and RANSAC Implementations for a 3D Point Cloud
- Object Tracking: Implemented ball tracking to follow the trajectory of a red ball thrown against a wall.
- Video captured using `cv2.VideoCapture`, with frames processed in a loop.
- Color channels converted from BGR to HSV using `cv2.cvtColor`.
- Red color channel isolated using `cv2.inRange` with specified upper and lower thresholds.
- Pixel coordinates of the ball's center calculated as the mean of the masked x and y coordinates.
- Best-fit curve determined for the pixel coordinates using the least squares method.
- Least Squares Method: Utilized the least squares method to find the best-fit curve (a parabola) by minimizing the mean square error, as sketched below.
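A minimal sketch of that least squares fit, assuming the ball-center pixel coordinates have already been collected into arrays `x` and `y` (names are illustrative, not the original code):

```python
import numpy as np

def fit_parabola(x, y):
    """Least squares fit of y = a*x^2 + b*x + c to the ball-center coordinates."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([x**2, x, np.ones_like(x)])  # design matrix
    # lstsq minimizes ||A @ p - y||^2, i.e., the mean square error up to scale
    p, *_ = np.linalg.lstsq(A, y, rcond=None)
    return p  # coefficients [a, b, c]
```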
- Covariance Matrix, LS, TLS, and RANSAC for 3D Point Cloud: Explored methods for fitting surfaces to 3D point cloud data; sketches of each method follow this list.
- Covariance Matrix and Surface Normal: Calculated the covariance matrix and determined the surface normal's direction and magnitude using eigenvalues and eigenvectors.
- Standard Least Squares Method for 3D Point Cloud: Applied the standard least squares method to find the best-fit surface plane.
- Total Least Squares Method for 3D Point Cloud: Used the total least squares method to find the best-fit plane by minimizing the error orthogonal to the plane.
- RANSAC Method: Implemented RANSAC for robust surface fitting, handling outliers in the data.
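A minimal sketch of the covariance-based normal estimate, assuming the point cloud is an (N, 3) NumPy array; the function name is illustrative:

```python
import numpy as np

def surface_normal(points):
    """Direction and magnitude of the surface normal from the covariance matrix."""
    pts = np.asarray(points, dtype=float)
    centered = pts - pts.mean(axis=0)            # remove the centroid
    cov = centered.T @ centered / len(pts)       # 3x3 covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigenvalues in ascending order
    normal = eigvecs[:, 0]                       # smallest-eigenvalue eigenvector
    magnitude = eigvals[0]                       # variance along the normal
    return normal, magnitude
```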
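The two plane fits differ in which error they minimize: standard least squares penalizes only the vertical (z) residual, while total least squares penalizes the orthogonal distance. A hedged sketch of both:

```python
import numpy as np

def fit_plane_ls(points):
    """Standard least squares: fit z = a*x + b*y + c, minimizing error in z only."""
    pts = np.asarray(points, dtype=float)
    A = np.column_stack([pts[:, 0], pts[:, 1], np.ones(len(pts))])
    (a, b, c), *_ = np.linalg.lstsq(A, pts[:, 2], rcond=None)
    return a, b, c

def fit_plane_tls(points):
    """Total least squares: minimize orthogonal distance to the plane via SVD."""
    pts = np.asarray(points, dtype=float)
    centroid = pts.mean(axis=0)
    _, _, vt = np.linalg.svd(pts - centroid)
    normal = vt[-1]               # right singular vector of the smallest singular value
    d = -normal @ centroid        # plane: normal . p + d = 0
    return normal, d
```

The TLS normal is the same smallest-eigenvalue direction found in the covariance sketch above, since the SVD of the centered points diagonalizes their covariance.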
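A sketch of RANSAC plane fitting with placeholder threshold and iteration values; a production version would derive the iteration count from the desired inlier probability:

```python
import numpy as np

def ransac_plane(points, threshold=0.1, iterations=1000, seed=None):
    """Fit a plane robustly by sampling 3-point candidates and counting inliers."""
    rng = np.random.default_rng(seed)
    pts = np.asarray(points, dtype=float)
    best_inliers = np.array([], dtype=int)
    for _ in range(iterations):
        sample = pts[rng.choice(len(pts), size=3, replace=False)]
        normal = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(normal)
        if norm < 1e-12:                          # skip degenerate (collinear) samples
            continue
        normal /= norm
        d = -normal @ sample[0]
        dist = np.abs(pts @ normal + d)           # orthogonal point-to-plane distance
        inliers = np.flatnonzero(dist < threshold)
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refit on the inliers (e.g., with fit_plane_tls above) for the final model
    return best_inliers
```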
- Observations and Interpretation of Results:
- The total least squares method outperformed standard least squares, especially on noisy data.
- RANSAC produced the most accurate models, owing largely to its outlier rejection.
- Problems Encountered:
- Challenges in determining the threshold limits for ball tracking.
- Issues with eigenvector assignment in the total least squares method.
- A division-by-zero error in RANSAC when the probability values drove the denominator of the iteration-count formula to zero.
- The complexity of the RANSAC algorithm required referencing multiple examples and increasing the iteration count to reduce fluctuations in the results.
Project 2: Camera Pose Estimation and Image Stitching
- Camera Pose Estimation using Homography
- In this task, camera pose estimation was performed on a video using homography, involving the following steps (a decomposition sketch follows the list):
- Image Processing Pipeline:
- Read the video frame by frame.
- Convert each frame to grayscale.
- Blur the image to suppress noise.
- Apply thresholding to isolate the white regions.
- Perform Canny edge detection.
- Apply the Hough transform to the frame.
- Find peaks in the Hough space.
- Draw the lines corresponding to the Hough peaks.
- Find the intersections between the detected lines.
- Compute the homography matrix between the camera and the ground plane.
- Decompose the homography matrix to obtain the rotation and translation.
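A minimal sketch of the last two steps. The corner correspondences, world rectangle, and intrinsic matrix `K` below are placeholders, not values from the project; the classic decomposition scales the columns of `K^-1 @ H` to recover two rotation columns and the translation:

```python
import cv2
import numpy as np

# Placeholder values: the four line-intersection points found in the frame (pixels)
# and the known ground-plane rectangle they correspond to (world units).
img_pts = np.array([[405, 210], [860, 235], [890, 610], [380, 590]], dtype=np.float64)
world_pts = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=np.float64)

H, _ = cv2.findHomography(world_pts, img_pts)

# K is the camera intrinsic matrix (placeholder values).
K = np.array([[1380.0, 0, 946.0], [0, 1380.0, 527.0], [0, 0, 1.0]])

# Decompose H = K [r1 r2 t] (up to scale) into rotation and translation.
B = np.linalg.inv(K) @ H
lam = 1.0 / np.linalg.norm(B[:, 0])   # scale from the first rotation column
r1 = lam * B[:, 0]
r2 = lam * B[:, 1]
r3 = np.cross(r1, r2)                 # planar homography only constrains two columns
R = np.column_stack([r1, r2, r3])
t = lam * B[:, 2]
```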
- Explanation and Results:
- Image Stitching for Panoramic View
- This task focused on stitching four images together to create a panoramic view (a pairwise-stitch sketch follows the pipeline):
- Pipeline:
- Load the four input images.
- Convert images to grayscale.
- Extract features using ORB or SIFT.
- Match features using a brute-force matcher (`cv2.BFMatcher`).
- Visualize matched features.
- Compute homographies between pairs of images.
- Combine images using computed homographies.
- Warp the second image onto the first using OpenCV.
- Repeat for the next pair until all four images are stitched.
- Save the final panoramic image.
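A sketch of one pairwise stitch following these steps, using ORB and a brute-force Hamming matcher; the function name and oversized canvas are illustrative rather than the submitted implementation:

```python
import cv2
import numpy as np

def stitch_pair(img1, img2):
    """Warp img2 onto img1's image plane and paste img1 on top."""
    orb = cv2.ORB_create(4000)
    kp1, des1 = orb.detectAndCompute(cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY), None)
    kp2, des2 = orb.detectAndCompute(cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY), None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des2, des1), key=lambda m: m.distance)
    src = np.float32([kp2[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp1[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    # Oversized canvas: wide enough for both images side by side.
    canvas = cv2.warpPerspective(img2, H, (img1.shape[1] + img2.shape[1], img1.shape[0]))
    canvas[: img1.shape[0], : img1.shape[1]] = img1
    return canvas
```

Chaining `stitch_pair` across consecutive image pairs yields the final panorama.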
- Explanation and Results:
Problems Encountered:
- Determining suitable Canny edge-detection threshold values.
- Difficulty detecting edges without using built-in functions.
- Tricky steps in recovering the camera rotation and translation from the homography.
- Challenges in stitching due to dimension mismatches when applying the homographies.
Project 3: Camera Calibration
- Camera Calibration: Mathematical Approach
- Pipeline (a DLT sketch follows the steps):
- Capture checkerboard images for calibration.
- Determine the world coordinates of the checkerboard corners and find the corresponding image coordinates.
- Calculate the camera parameters via the projection matrix P.
- Extract the rotation matrix and translation vector from the P matrix.
- Compute the reprojection error for each point.
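A hedged sketch of the mathematical calibration via the Direct Linear Transform (DLT). The helper name is illustrative, and `scipy.linalg.rq` is used here for the RQ step that splits P into K and R; the actual submission may have implemented this differently:

```python
import numpy as np
from scipy.linalg import rq

def calibrate_dlt(world_pts, img_pts):
    """Estimate P from >= 6 correspondences, then decompose it into K, R, T."""
    A = []
    for (X, Y, Z), (u, v) in zip(world_pts, img_pts):
        A.append([X, Y, Z, 1, 0, 0, 0, 0, -u * X, -u * Y, -u * Z, -u])
        A.append([0, 0, 0, 0, X, Y, Z, 1, -v * X, -v * Y, -v * Z, -v])
    _, _, vt = np.linalg.svd(np.asarray(A, dtype=float))
    P = vt[-1].reshape(3, 4)          # null-space solution, defined up to scale
    K, R = rq(P[:, :3])               # RQ decomposition of the left 3x3 block
    S = np.diag(np.sign(np.diag(K)))  # fix signs so K has a positive diagonal
    K, R = K @ S, S @ R
    T = np.linalg.inv(K) @ P[:, 3]    # translation (in the camera frame, up to scale)
    return P, K / K[2, 2], R, T
```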
- Results:
- A minimum of 6 matching points is needed for the mathematical calibration.
- The mathematical formulation decomposes the P matrix into the intrinsic matrix K, rotation matrix R, and translation vector T.
- Intrinsic Matrix K:
  [-6.7912331e+01, -7.9392768e-02, 3.3562042e+01; 0, 6.7619034e+01, 2.5845427e+01; 0, 0, 4.1946620e-02]
- Projection matrix P:
  [28.7364445, -1.75735415, -70.0687538, 756.890519; -20.1369011, 65.889012, -22.2140404, 213.263797; -0.0277042391, -0.00259559759, -0.0313888009, 1.00000000]
- Rotation matrix R:
  [-0.74948643, 0.11452983, -0.65203758; 0.0453559, 0.99149078, 0.12202001; 0.66046418, 0.06187859, -0.74830349]
- Translation vector T:
  [0.64862355; 0.30183152; 0.69751919; 0.04064735]
- Reprojection errors:
  [0.2856, 0.9726, 1.0361, 0.4541, 0.1909, 0.3190, 0.1959, 0.3083]
- Camera Calibration: Practical Approach
- The objective is to calibrate the camera using real-world images.
- Pipeline (an OpenCV-based sketch follows the steps):
- Read the calibration images.
- Grayscale and resize the images.
- Find corners using `cv2.findChessboardCorners()`.
- Draw the detected corners on the images.
- Calibrate using `cv2.calibrateCamera()` to obtain the intrinsic parameters.
- Compute the reprojection error for each image.
- Extract the camera matrix.
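A sketch of this pipeline using the standard OpenCV calls; the checkerboard size and image folder are placeholder assumptions:

```python
import glob
import cv2
import numpy as np

pattern = (9, 6)  # inner corners per row/column (placeholder)
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2)

obj_points, img_points = [], []
for path in glob.glob("calibration_images/*.jpg"):  # placeholder folder
    gray = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

ret, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)

# Per-image reprojection error
for i in range(len(obj_points)):
    proj, _ = cv2.projectPoints(obj_points[i], rvecs[i], tvecs[i], K, dist)
    err = cv2.norm(img_points[i], proj, cv2.NORM_L2) / len(proj)
    print(f"image {i}: reprojection error = {err:.4f}")
```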
- Results:
- Corners were detected in each image; the per-image reprojection errors are:
  [0.1198, 0.2610, 0.4094, 0.5418, 0.2219, 0.3537, 0.0520, 0.2247, 0.4810, 0.4042, 0.4810, 0.5137, 0.4297]
- Intrinsic Matrix K:
  [2.2317e+03, 0, 7.7812e+02; 0, 2.4542e+03, 1.3235e+03; 0, 0, 1.0000]
- Problems Encountered:
- Determining the correct K matrix in the mathematical approach.
- Handling the very small values in the K matrix.
Project 4: Pipeline for Stereo Vision and Depth Perception
- The fourth project in my perception course involved addressing four sub-tasks, each contributing to the overall goal of stereo vision:
- Calibration Pipeline (a sketch follows the steps):
- Utilized ORB feature extraction to find matching features in stereo images.
- Estimated the Fundamental matrix and Essential matrix, considering camera intrinsics.
- Decomposed Essential matrix into translation and rotation.
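A sketch of these calibration steps, assuming the matched keypoints are already available as Nx2 arrays `pts1`/`pts2` and that `K1`/`K2` are the dataset intrinsics (all names illustrative):

```python
import cv2
import numpy as np

def stereo_extrinsics(pts1, pts2, K1, K2):
    """Fundamental matrix via RANSAC, essential matrix via intrinsics, then R, t."""
    F, mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, 1.0, 0.99)
    E = K2.T @ F @ K1                  # essential matrix from the intrinsics
    R1, R2, t = cv2.decomposeEssentialMat(E)
    # decomposeEssentialMat yields four candidate (R, t) poses; the valid one
    # places triangulated points in front of both cameras (cheirality check).
    return F, E, R1, R2, t
```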
- Rectification Pipeline (a sketch follows the steps):
- Applied perspective transformation to rectify stereo images for easier comparison.
- Computed homography matrices to map original to rectified image coordinates.
- Visualized the rectification by overlaying epipolar lines and feature points.
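A sketch of the rectification step using OpenCV's uncalibrated method, reusing `pts1`, `pts2`, and `F` from the previous sketch and assuming `w`, `h`, `img_left`, and `img_right` are defined:

```python
import cv2

# H1, H2 map original image coordinates to the rectified frames.
ok, H1, H2 = cv2.stereoRectifyUncalibrated(
    pts1.reshape(-1, 2), pts2.reshape(-1, 2), F, (w, h))
if ok:
    rect_left = cv2.warpPerspective(img_left, H1, (w, h))
    rect_right = cv2.warpPerspective(img_right, H2, (w, h))
```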
- Correspondence Pipeline (a sketch follows the steps):
- Implemented a correspondence pipeline involving matching windows and disparity calculation.
- Generated grayscale and color heat maps for visualizing disparity.
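A deliberately brute-force sketch of the matching-window disparity search using sum-of-squared-differences; the window size and search range are placeholders:

```python
import numpy as np

def disparity_map(left, right, window=11, max_disp=64):
    """SSD block matching on rectified grayscale images (left camera reference)."""
    h, w = left.shape
    half = window // 2
    disp = np.zeros((h, w), dtype=np.float32)
    for y in range(half, h - half):
        for x in range(half, w - half):
            patch = left[y - half:y + half + 1, x - half:x + half + 1].astype(np.float32)
            best_ssd, best_d = np.inf, 0
            for d in range(min(max_disp, x - half) + 1):
                cand = right[y - half:y + half + 1,
                             x - d - half:x - d + half + 1].astype(np.float32)
                ssd = float(np.sum((patch - cand) ** 2))
                if ssd < best_ssd:
                    best_ssd, best_d = ssd, d
            disp[y, x] = best_d
    return disp
```

OpenCV's `cv2.StereoBM_create` offers a far faster block matcher; the explicit loops here simply mirror the matching-window description.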
- Image Depth Computation Pipeline (a sketch follows the steps):
- Calculated depth values from the disparity map using the camera calibration parameters.
- Produced grayscale and color heat maps for depth visualization.
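A sketch of the depth conversion, where `baseline` and `focal_length` would come from each dataset's calibration file:

```python
import numpy as np

def depth_from_disparity(disp, baseline, focal_length):
    """depth = baseline * f / disparity, guarding against division by zero."""
    depth = np.zeros_like(disp, dtype=np.float32)
    valid = disp > 0
    depth[valid] = (baseline * focal_length) / disp[valid]
    return depth
```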
The pipelines were applied to three datasets, yielding specific outcomes for each room:
- Chess Room:
- Ladder Room:
- Art Room:
Problems Encountered:
- Calibration Outliers:
- Difficulty in removing outliers during camera calibration.
- Tricky estimation of the Fundamental matrix.
- Rectification Issues:
- Inability to achieve horizontal epipolar lines during rectification.
- Warping difficulties.
- Correspondence Challenges:
- Issues cascading from problems in the earlier pipeline stages.
- Difficulty implementing the correspondence formulas.