This individual report aims to
- outline my experience with the project,
- clearly describe the tasks I undertook throughout the project,
- and identify the challenges I encountered and explain how I addressed them.
Implement object tracking using a pre-existing object detection algorithm and integrate the Kalman Filter for smooth and accurate tracking.
The main tasks I undertook for this part are the following:
- Implement a `KalmanFilter` class with `predict` and `update` methods.
- Implement an `ObjectTracker` class to manage object tracking.
- Implement a `Visualizer` class to visualize the tracking results.
The `KalmanFilter` class is responsible for estimating and predicting the state of a moving object based on noisy measurements.
Attributes
| Name | Description |
|---|---|
| `dt` | Time for one cycle used to estimate the state (sampling rate) |
| `u` | Accelerations in the x and y directions |
| `x` | State vector representing the object's position and velocity |
| `A` | State transition matrix |
| `B` | Control matrix |
| `H` | Measurement mapping matrix |
| `P` | State covariance matrix |
| `Q` | Process noise covariance matrix |
| `R` | Measurement noise covariance matrix |
Methods
- `predict`: projects the current state estimate forward in time to predict the next state.
- `update`: uses a new measurement to adjust the state estimate.
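To make the structure concrete, below is a minimal sketch of such a filter, assuming a constant-velocity model with acceleration as the control input; the dimensions and noise parameters are illustrative rather than the exact values used in the project.

```python
import numpy as np

class KalmanFilter:
    """Minimal sketch: 2D position + velocity state, acceleration control input.
    Attribute names follow the table above; parameter values are illustrative."""

    def __init__(self, dt=0.1, u=(0.0, 0.0), std_acc=1.0, std_meas=1.0):
        self.dt = dt
        self.u = np.array([[u[0]], [u[1]]])   # accelerations in x and y
        self.x = np.zeros((4, 1))             # state [x, y, vx, vy]^T (column vector!)
        self.A = np.array([[1, 0, dt, 0],
                           [0, 1, 0, dt],
                           [0, 0, 1, 0],
                           [0, 0, 0, 1]])     # state transition matrix
        self.B = np.array([[0.5 * dt**2, 0],
                           [0, 0.5 * dt**2],
                           [dt, 0],
                           [0, dt]])          # control matrix
        self.H = np.array([[1, 0, 0, 0],
                           [0, 1, 0, 0]])     # measurement mapping (position only)
        self.P = np.eye(4)                    # state covariance
        self.Q = np.eye(4) * std_acc**2      # process noise covariance
        self.R = np.eye(2) * std_meas**2     # measurement noise covariance

    def predict(self):
        # Project the state estimate and covariance forward in time.
        self.x = self.A @ self.x + self.B @ self.u
        self.P = self.A @ self.P @ self.A.T + self.Q
        return self.x

    def update(self, z):
        # Correct the prediction with a measurement z = [[zx], [zy]].
        S = self.H @ self.P @ self.H.T + self.R       # innovation covariance
        K = self.P @ self.H.T @ np.linalg.inv(S)      # Kalman gain
        self.x = self.x + K @ (z - self.H @ self.x)
        self.P = (np.eye(self.P.shape[0]) - K @ self.H) @ self.P
        return self.x
```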
The `ObjectTracker` class is responsible for managing detection information as well as Kalman Filter estimation and prediction for the tracked object.
Attributes
| Name | Description |
|---|---|
| `kalman_filter` | Instance of `KalmanFilter` to use at each time step |
| `detect` | Function to call to detect the object in a frame |
| `state` | Current object detection, estimation and prediction |
Methods
The class has a single `step` method, called at each frame, responsible for
- detecting the object in the given frame using `self.detect`,
- predicting the object's state using `self.kalman_filter`,
- and updating the `self.kalman_filter` state.
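A minimal sketch of this loop, assuming the `KalmanFilter` sketch above and a `detect(frame)` callable that returns an (x, y) position; the tuple layout of `state` is an illustrative assumption:

```python
import numpy as np

class ObjectTracker:
    """Sketch of the per-frame tracking loop described above."""

    def __init__(self, kalman_filter, detect):
        self.kalman_filter = kalman_filter
        self.detect = detect
        self.state = None  # (detection, prediction, estimation)

    def step(self, frame):
        detection = self.detect(frame)              # 1. detect the object
        prediction = self.kalman_filter.predict()   # 2. predict its next state
        z = np.array([[detection[0]], [detection[1]]])
        estimation = self.kalman_filter.update(z)   # 3. correct with the measurement
        self.state = (detection, prediction, estimation)
        return self.state
```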
The `Visualizer` class is responsible for drawing the object state, prediction and estimation on the target frame to visualize tracking results.
Attributes
The class has a single attribute `tracker`, an instance of `ObjectTracker` from which to extract state information.
Methods
The class has a single public method `show`, responsible for drawing
- the object detection as a green circle,
- the object state prediction as a blue bounding box,
- and the object state estimation as a red bounding box.
KalmanFilter
- Understand the shape of data to replicate the equations seen in class.
- Be careful to use a column vector instead of a row vector for the state vector `x`.
Visualizer
- Understand the image format OpenCV uses by default (i.e. BGR instead of RGB).
- Use OpenCV to read a video frame by frame.
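For reference, a minimal frame-by-frame reading loop with OpenCV could look as follows (the video path is hypothetical):

```python
import cv2

cap = cv2.VideoCapture("input.mp4")  # hypothetical path
while True:
    ok, frame = cap.read()           # frame is a BGR numpy array
    if not ok:
        break                        # end of video
    # Convert only when an RGB image is actually needed.
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
cap.release()
```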
Develop a simple IoU-based tracker and extend it for multiple object tracking.
The main tasks I undertook for this part are the following:
- Implement a `bbox` module to manipulate bounding boxes.
- Implement a `track` module to manage and export tracks.
- Implement a `visualize` module to handle visualization and video exports.
The `bbox` module is composed of the following:
- A `BoundingBox` dataclass representing a bounding box in the detection CSV.
- An `intersection_over_union` function to compute the IoU of two bounding boxes.
- Dataclass which represents a bounding box by its top-left and bottom-right points.
- The properties `width` and `height` return the bounding box's dimensions along each axis.
- The property `area` computes the area of the bounding box.
Function which takes two bounding boxes as input and returns the ratio of their intersection over their union.
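A minimal sketch of the module, assuming corner-point fields named `x1`, `y1`, `x2`, `y2` (the actual field names may differ); note the clamping to zero that the challenges section below comes back to:

```python
from dataclasses import dataclass

@dataclass
class BoundingBox:
    """Bounding box stored by its top-left and bottom-right corners."""
    x1: float
    y1: float
    x2: float
    y2: float

    @property
    def width(self) -> float:
        return self.x2 - self.x1

    @property
    def height(self) -> float:
        return self.y2 - self.y1

    @property
    def area(self) -> float:
        return self.width * self.height

def intersection_over_union(a: BoundingBox, b: BoundingBox) -> float:
    # max(0, ...) clamps negative intersection dimensions to zero
    # (the non-overlapping case mentioned in the challenges below).
    iw = max(0.0, min(a.x2, b.x2) - max(a.x1, b.x1))
    ih = max(0.0, min(a.y2, b.y2) - max(a.y1, b.y1))
    inter = iw * ih
    union = a.area + b.area - inter
    return inter / union if union > 0 else 0.0
```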
The `track` module is composed of the following:
- A `Track` dataclass representing a track in the CSV.
- A `TrackManager` class to manage the tracks across video frames.
- A `TrackHistory` class to handle exporting tracks to a file.
Dataclass responsible for storing track information: frame number, ID and bounding box.
Class responsible for creating, updating and deleting tracks across the frames of a video.
Attributes
| Name | Description |
|---|---|
| `tracks` | List of current tracks. |
| `next_track_id` | Object ID for the next upcoming track. |
Methods
The class has a single public method `step`, called at each frame, responsible for
- matching detections with tracks,
- updating matched tracks,
- creating new tracks for unmatched detections,
- and deleting unmatched tracks.
Algorithm
The manager algorithm can be described as follows:

1. Match the tracks with the detections.
   - Compute a similarity matrix where $c_{i,j}$ is the IoU between the $i$-th track and the $j$-th detection.
   - Compute associations using the Hungarian algorithm with `scipy.optimize.linear_sum_assignment`.
2. Update the matched tracks.
   - Get the track with the corresponding ID.
   - Replace its bounding box with the associated detection.
   - Include the track in the new tracks list.
3. Create new tracks for unmatched detections.
   - Create a new track based on the detection, with `self.next_track_id` as its ID.
   - Increment `self.next_track_id` to ensure unique IDs across following tracks.
4. Replace the `self.tracks` current tracks with the new tracks list.
   - Any previous track not included is lost (i.e. unmatched tracks are removed).
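A sketch of the matching step (1), assuming each track exposes a `bbox` field and reusing the `intersection_over_union` function from the `bbox` module; the IoU threshold is illustrative:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

from bbox import intersection_over_union  # the project's own module

def match(tracks, detections, iou_threshold=0.3):
    """Return (matches, unmatched_track_indices, unmatched_detection_indices)."""
    if not tracks or not detections:
        return [], list(range(len(tracks))), list(range(len(detections)))
    # Similarity matrix: c[i, j] = IoU between the i-th track and j-th detection.
    c = np.array([[intersection_over_union(t.bbox, d) for d in detections]
                  for t in tracks])
    # The Hungarian algorithm maximizes the total IoU (hence maximize=True).
    rows, cols = linear_sum_assignment(c, maximize=True)
    matches = [(i, j) for i, j in zip(rows, cols) if c[i, j] >= iou_threshold]
    matched_t = {i for i, _ in matches}
    matched_d = {j for _, j in matches}
    unmatched_tracks = [i for i in range(len(tracks)) if i not in matched_t]
    unmatched_detections = [j for j in range(len(detections)) if j not in matched_d]
    return matches, unmatched_tracks, unmatched_detections
```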
Class responsible for storing and exporting tracks.
Methods
- `dump`: Dumps the tracks to a CSV file.
- `extend`: Stores a list of tracks.
- `push`: Stores a track.
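A minimal sketch, assuming `Track` has `frame`, `track_id` and `bbox` fields (these names are assumptions based on the description above):

```python
import csv

class TrackHistory:
    """Sketch of track storage and CSV export; the column layout is illustrative."""

    def __init__(self):
        self.tracks = []

    def push(self, track):
        self.tracks.append(track)

    def extend(self, tracks):
        self.tracks.extend(tracks)

    def dump(self, path):
        with open(path, "w", newline="") as f:
            writer = csv.writer(f)
            for t in self.tracks:
                writer.writerow([t.frame, t.track_id,
                                 t.bbox.x1, t.bbox.y1, t.bbox.x2, t.bbox.y2])
```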
The `visualize` module is composed of the following:
- A `Canvas` class to draw on a given frame.
- A `Visualizer` class to visualize tracking results.
- A `Video` class responsible for tracking results video exports.
Class responsible for drawing on a given frame.
Attributes
| Name | Description |
|---|---|
| `data` | Frame on which to draw. |
Methods
- `draw_bbox`: Draws a given bounding box.
- `draw_text`: Draws some text.
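A minimal sketch of these helpers using OpenCV drawing primitives; the colors are illustrative BGR tuples, and the bounding box fields follow the earlier `BoundingBox` sketch:

```python
import cv2

class Canvas:
    """Sketch of the drawing helpers described above."""

    def __init__(self, data):
        self.data = data  # frame (BGR numpy array) on which to draw

    def draw_bbox(self, bbox, color=(0, 255, 0), thickness=2):
        cv2.rectangle(self.data,
                      (int(bbox.x1), int(bbox.y1)),
                      (int(bbox.x2), int(bbox.y2)),
                      color, thickness)

    def draw_text(self, text, position, color=(0, 255, 0)):
        cv2.putText(self.data, text, (int(position[0]), int(position[1])),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 1)
```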
Class responsible for drawing bounding boxes and IDs of tracks.
Methods
- `draw`: Draws the current tracks stored by a track manager on the given frame.
- `show`: Opens a window to visualize the current frame (possibly with track drawings).
Class responsible for simplifying the use of `cv2.VideoWriter`.
Methods
- `write`: Creates an instance of `cv2.VideoWriter` if necessary and writes the given frame.
- `release`: Releases the `cv2.VideoWriter` instance.
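A sketch of this lazy-initialization pattern; the codec and frame rate are illustrative defaults:

```python
import cv2

class Video:
    """Sketch of a lazy cv2.VideoWriter wrapper, sized from the first frame."""

    def __init__(self, path, fps=30.0):
        self.path = path
        self.fps = fps
        self.writer = None

    def write(self, frame):
        # Create the writer on first use, using the first frame's dimensions.
        if self.writer is None:
            h, w = frame.shape[:2]
            fourcc = cv2.VideoWriter_fourcc(*"mp4v")
            self.writer = cv2.VideoWriter(self.path, fourcc, self.fps, (w, h))
        self.writer.write(frame)

    def release(self):
        if self.writer is not None:
            self.writer.release()
            self.writer = None
```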
bbox
- Handle the case where bounding boxes have negative intersections.
  - The intersection dimensions are set to zero whenever they are negative.
TrackManager
- Understanding the `scipy.optimize.linear_sum_assignment` function.
- Understanding how to determine when detections or tracks are unmatched.
Extend the IoU-based MOT tracker with the Hungarian algorithm by adding a Kalman Filter.
The main tasks I undertook for this part are the following:
- Adapt the `BoundingBox` class to compute its centroid.
- Adapt the `KalmanFilter` class to handle bounding boxes through their centroids.
- Adapt the `TrackManager` class to use Kalman Filter predictions.
- Adapt the `visualize` module to handle visualization of predicted centroids.
Add a property `center` which computes the centroid of the bounding box based on its coordinates.
Change the `update` method to take in a `BoundingBox` and use its `center` property for predictions.
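A sketch of these two adaptations, reusing the corner-point `BoundingBox` fields and the Part 1 `KalmanFilter` sketch; each fragment is marked with the class it belongs to:

```python
import numpy as np

@property
def center(self):  # added to BoundingBox
    # Centroid is the midpoint of the top-left and bottom-right corners.
    return ((self.x1 + self.x2) / 2, (self.y1 + self.y2) / 2)

def update(self, bbox):  # replaces KalmanFilter.update from the Part 1 sketch
    # Use the centroid of the detected bounding box as the measurement;
    # the correction step itself is unchanged.
    cx, cy = bbox.center
    z = np.array([[cx], [cy]])
    S = self.H @ self.P @ self.H.T + self.R       # innovation covariance
    K = self.P @ self.H.T @ np.linalg.inv(S)      # Kalman gain
    self.x = self.x + K @ (z - self.H @ self.x)
    self.P = (np.eye(self.P.shape[0]) - K @ self.H) @ self.P
    return self.x
```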
- Add a `filter_params` argument to the constructor to initialize the Kalman filters of new tracks.
- Add a `filters` dictionary attribute which associates a track ID with its corresponding `KalmanFilter` instance.
- Change the `step` method to account for creating and updating the `KalmanFilter` instances of each track.
- Use predictions from the corresponding `KalmanFilter` instances, instead of the last known detection bounding boxes, when computing the similarity score between tracks and detections.
- Add a `draw_cross` function to the `Canvas` class in order to draw an X shape at a given point (see the sketch after this list).
- Adapt the `Visualizer` class to draw a cross at the corresponding centroids predicted by the `KalmanFilter` instances for the current tracks.
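A sketch of the `draw_cross` helper, written as a method of the `Canvas` class sketched earlier; the size and color defaults are illustrative:

```python
import cv2

def draw_cross(self, point, size=5, color=(255, 0, 0), thickness=2):  # Canvas method
    # Draw an X shape centered on the given (x, y) point.
    x, y = int(point[0]), int(point[1])
    cv2.line(self.data, (x - size, y - size), (x + size, y + size), color, thickness)
    cv2.line(self.data, (x - size, y + size), (x + size, y - size), color, thickness)
```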
- As this part consists mostly of adapting the existing pipeline, I did not encounter any major challenges apart from correctly translating coordinates between centroids and bounding box corner points.
Extend the IoU-Kalman tracker to include object re-identification (ReID).
The main tasks I undertook for this part are the following:
- Implement a `reident` module to compute appearance features of a patch.
- Adapt the `TrackManager` to account for appearance features of patches.
This module is composed of the following:
- An `ObjectIdentifier` class responsible for computing appearance features of a patch.
- An `extract_patch` function to extract a patch from a frame.
- A `normalize_patch` function to normalize a given patch.
Class responsible for computing appearance features using a lightweight pre-trained model.
Attributes
| Name | Description |
|---|---|
| `feature_extractor` | Sequential layer of MobileNet v2 excluding the classifier head. |
| `device` | Device on which to run the computation. |
Methods
The class has a single public method `__call__`, called on a frame and a bounding box, responsible for:
- Extracting the corresponding patch through `extract_patch`.
- Normalizing the corresponding patch through `normalize_patch`.
- Computing the appearance features using the `feature_extractor` attribute.
This function is responsible for cropping a frame to the given bounding box.
This function is responsible for preprocessing the patch by
- resizing the input patch to 224 x 224,
- converting the colors from BGR to RGB,
- and normalizing the values for the model using a Z-score.
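Putting the three pieces together, here is a sketch assuming torchvision's MobileNet v2 and ImageNet normalization statistics; the exact layer selection and preprocessing in the project may differ:

```python
import cv2
import numpy as np
import torch
from torch import nn
from torchvision import models

class ObjectIdentifier:
    """Sketch of appearance-feature extraction with a pre-trained MobileNet v2."""

    def __init__(self, device="cpu"):
        self.device = torch.device(device)
        mobilenet = models.mobilenet_v2(weights="DEFAULT")
        # All convolutional layers plus global pooling, dropping the classifier
        # head, yield a 1280-dimensional appearance vector.
        self.feature_extractor = nn.Sequential(
            mobilenet.features,
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        ).to(self.device).eval()

    @torch.no_grad()
    def __call__(self, frame, bbox):
        patch = extract_patch(frame, bbox)
        tensor = normalize_patch(patch).to(self.device)
        return self.feature_extractor(tensor).squeeze(0).cpu().numpy()

def extract_patch(frame, bbox):
    # Crop the frame to the bounding box (integer pixel coordinates).
    return frame[int(bbox.y1):int(bbox.y2), int(bbox.x1):int(bbox.x2)]

def normalize_patch(patch):
    # Resize to the model's expected input and convert BGR -> RGB.
    patch = cv2.resize(patch, (224, 224))
    patch = cv2.cvtColor(patch, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
    # Z-score normalization with the ImageNet statistics.
    mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
    std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
    patch = (patch - mean) / std
    # HWC -> NCHW tensor for PyTorch.
    return torch.from_numpy(patch.transpose(2, 0, 1)).unsqueeze(0)
```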
- Add an `identifier` attribute and constructor argument to store an `ObjectIdentifier` instance.
- Add a `track_features` dictionary attribute which associates a track ID with its feature vector.
- Add a `weights` attribute which provides the weights for the IoU and appearance feature scores.
- Change the `step` method to account for updating the feature vectors of each track.
- Use a weighted sum of the IoU score and the appearance score when computing the similarity score between tracks and detections.
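A sketch of the combined score, assuming `weights` is an (IoU, appearance) pair and using the cosine similarity mentioned in the challenges below:

```python
import numpy as np

from bbox import intersection_over_union  # the project's own module

def cosine_similarity(a, b):
    # Small epsilon guards against division by zero for degenerate vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def similarity(track_bbox, track_features, detection_bbox, detection_features,
               weights=(0.5, 0.5)):
    """Weighted sum of the IoU score and the appearance score; the default
    weights are illustrative."""
    w_iou, w_app = weights
    iou = intersection_over_union(track_bbox, detection_bbox)
    app = cosine_similarity(track_features, detection_features)
    return w_iou * iou + w_app * app
```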
- Correctly extracting the feature layers of MobileNet.
  - I initially extracted only the features of the last convolution layer, which led to poor results.
  - I got much better results by leveraging the features of the last dense layer before the classifier head.
- Finding the best distance function.
  - I tried different distances empirically, and cosine similarity seemed to perform best in my case.
Integrate a more efficient, lightweight deep learning-based object detector for pedestrian detection.
The main tasks I undertook for this part are the following:
- Adapt the `Detector` script to generate the detections CSV using a lightweight YOLO model.
The `LightweightDetector` class is responsible for loading a lightweight YOLO model and using it to infer bounding boxes of the "person" class for a given frame.
Attributes
| Name | Description |
|---|---|
| `frames_dir` | The root directory of input frames. |
| `model` | The YOLO model to use. |
Methods
The only method is `predict`, which takes in the index of a frame and does the following:
- Loads the corresponding frame from the `self.frames_dir` directory.
- Runs inference on the frame using the model stored in `self.model`.
- Converts the resulting bounding boxes to our own `BoundingBox` class and returns them.
The `predict` method can then be called for each frame to generate a detection CSV file.
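A sketch of such a detector using the Ultralytics API; the frame naming scheme and the `yolov8n.pt` weights are illustrative assumptions:

```python
from pathlib import Path

import cv2
from ultralytics import YOLO

from bbox import BoundingBox  # our own dataclass from the bbox module

class LightweightDetector:
    """Sketch of pedestrian detection with a lightweight YOLO model."""

    def __init__(self, frames_dir, weights="yolov8n.pt"):
        self.frames_dir = Path(frames_dir)
        self.model = YOLO(weights)

    def predict(self, frame_index):
        # Frame file naming is a hypothetical convention.
        frame = cv2.imread(str(self.frames_dir / f"frame_{frame_index:06d}.jpg"))
        # classes=[0] restricts detection to the COCO "person" class.
        results = self.model(frame, classes=[0], verbose=False)
        boxes = []
        for box in results[0].boxes:
            x1, y1, x2, y2 = box.xyxy[0].tolist()
            boxes.append(BoundingBox(x1, y1, x2, y2))
        return boxes
```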
- Understand how to import and use YOLO models.
- To do so, I learnt to use the Ultralytics framework through its official documentation.