Affiliation:
1. Department of Computer Science, National University of Technology, Pakistan
2. Department of Computer Science and Software Engineering, University of Ha’il, Saudi Arabia
3. Department of Computer Science, Norwegian University of Science and Technology, Norway
Abstract
Head detection in real-world videos is a classical research problem in computer vision. Head detection in videos is more challenging than in a single image because of the many nuisances commonly observed in natural videos, including arbitrary poses, appearances, and scales. Generally, head detection is treated as a particular case of object detection in a single image. However, the performance of object detectors deteriorates in unconstrained videos. In this paper, we propose a temporal consistency model (TCM) that enhances the performance of a generic object detector by integrating the spatio-temporal information that exists among subsequent frames of a video. Our model takes detections from a generic detector as input and improves mean average precision (mAP) by recovering missed detections and suppressing false positives. We evaluate the proposed framework on four challenging datasets: HollywoodHeads, Casablanca, BOSS, and PAMELA. Experimental evaluation shows that employing the proposed TCM improves performance. We demonstrate both qualitatively and quantitatively that the proposed framework obtains significant improvements over other methods.
Subject
Electrical and Electronic Engineering, Instrumentation, Control and Systems Engineering