This is an implementation of tracklet clustering for 2D tracking.
Note that this single-camera tracking (SCT) method has been superseded by the TrackletNet Tracker (TNT). The corresponding arXiv paper is here, and the source code (training + testing) is provided here.
In SCT, the loss function of our data association algorithm combines motion, temporal, and appearance attributes. In particular, a histogram-based adaptive appearance model is designed to encode long-term appearance change. The change in loss is incorporated into a bottom-up clustering strategy for the association of tracklets. Robust 2D-to-3D projection is achieved by applying EDA optimization to camera calibration for speed estimation.
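The bottom-up clustering strategy can be illustrated with a minimal sketch: greedily merge the pair of tracklet clusters with the lowest association loss until no pair falls below a threshold. The `loss_fn` below is a placeholder for the paper's combined motion/temporal/appearance loss; the function names and threshold are illustrative assumptions, not the repository's actual API.

```python
def cluster_tracklets(tracklets, loss_fn, max_loss=1.0):
    """Greedy bottom-up (agglomerative) clustering over tracklets.

    tracklets: list of tracklet objects
    loss_fn:   callable(cluster_a, cluster_b) -> float association loss
               (stand-in for the combined motion/temporal/appearance loss)
    max_loss:  merging stops once the cheapest pair exceeds this threshold
    """
    # Start with each tracklet in its own cluster.
    clusters = [[t] for t in tracklets]
    while len(clusters) > 1:
        # Find the cheapest pair of clusters to merge.
        i, j = min(
            ((a, b) for a in range(len(clusters))
                    for b in range(a + 1, len(clusters))),
            key=lambda ab: loss_fn(clusters[ab[0]], clusters[ab[1]]),
        )
        if loss_fn(clusters[i], clusters[j]) > max_loss:
            break  # no remaining pair is worth merging
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

With a toy loss (absolute difference of cluster means over scalar "tracklets"), nearby values collapse into the same identity while distant ones stay apart.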
Please refer to Gaoang Wang's GitHub repository, where detailed instructions for configuration and installation are provided.
For input detection results in text, the format of each line is as follows:
<frame_id>,-1,<xmin>,<ymin>,<width>,<height>,<confidence>,-1,-1,-1,<class>
This is similar to the format required by MOTChallenge. The frame ID is 0-based. The confidence is a percentage.
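A minimal parser for one detection line in the format above might look as follows. The field names and the dictionary layout are illustrative choices, not part of the repository; fields that are always -1 are skipped.

```python
def parse_detection(line):
    """Parse one input detection line:
    <frame_id>,-1,<xmin>,<ymin>,<width>,<height>,<confidence>,-1,-1,-1,<class>
    """
    fields = line.strip().split(',')
    return {
        'frame_id': int(fields[0]),                    # 0-based frame index
        'bbox': (float(fields[2]), float(fields[3]),   # xmin, ymin
                 float(fields[4]), float(fields[5])),  # width, height
        'confidence': float(fields[6]) / 100.0,        # percentage -> [0, 1]
        'class': fields[10],
    }
```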
For output 2D tracking results in text, the format of each line is as follows:
<frame_id>,<obj_id>,<xmin>,<ymin>,<width>,<height>,<confidence>,-1,-1,-1,<class>
This is similar to the format required by MOTChallenge. The frame ID and object ID are both 0-based. The confidence is a percentage.
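Conversely, a tracking result can be serialized back into this line format with a small helper. The function name and the convention of taking confidence in [0, 1] (written out as a percentage) are assumptions for illustration.

```python
def format_track(frame_id, obj_id, bbox, confidence, cls):
    """Format one output 2D tracking line:
    <frame_id>,<obj_id>,<xmin>,<ymin>,<width>,<height>,<confidence>,-1,-1,-1,<class>

    confidence is given in [0, 1] and written as a percentage.
    """
    xmin, ymin, width, height = bbox
    return (f"{frame_id},{obj_id},{xmin},{ymin},{width},{height},"
            f"{confidence * 100:g},-1,-1,-1,{cls}")
```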
For any question you can contact Zheng (Thomas) Tang.