The speed of tracking-by-detection (TBD) greatly depends on the number of running a detector because the detection is the most expensive operation in TBD. In many practical cases, multi-object tracking (MOT) can be, however, achieved based tracking-by-motion (TBM) only. This is a possible solution without much loss of MOT accuracy when the variations of object cardinality and motions are not much within consecutive frames. Therefore, the MOT problem can be transformed to find the best TBD and TBM mechanism. To achieve it, we propose a novel decision coordinator for MOT (Decode-MOT) which can determine the best TBD/TBM mechanism according to scene and tracking contexts. In specific, our Decode-MOT learns tracking and scene contextual similarities between frames. Because the contextual similarities can vary significantly according to the used trackers and tracking scenes, we learn the Decode-MOT via self-supervision. The evaluation results on MOT challenge datasets prove that our method can boost the tracking speed greatly while keeping the state-of-the-art MOT accuracy. Our code will be available at https://github.com/reussite-cv/Decode-MOT.

Download full-text PDF

Source
http://dx.doi.org/10.1109/TIP.2023.3298538DOI Listing

Publication Analysis

Top Keywords

mot accuracy
8
contextual similarities
8
mot
6
tracking
5
decode-mot
4
decode-mot hurdle
4
hurdle frames
4
frames tracking-by-detection?
4
tracking-by-detection? speed
4
speed tracking-by-detection
4

Similar Publications

Want AI Summaries of new PubMed Abstracts delivered to your In-box?

Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!