CAPNet: Context and Attribute Perception for Pedestrian Detection

Published: 2023-04-10 Volume: 12 Issue: 8 Page: 1781
ISSN: 2079-9292
Container-title: Electronics
Language: en

Author:

Zhu Yueyan¹, Huang Hai¹,², Yu Huayan¹, Chen Aoran¹, Zhao Guanliang¹

Affiliation:

1. School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China

2. Key Laboratory of Interactive Technology and Experience System, Ministry of Culture and Tourism, Beijing University of Posts and Telecommunications, Beijing 100876, China

Abstract

Several challenges impede the progress of pedestrian detection in practical, real-world applications. Scale variance, cluttered backgrounds and ambiguous pedestrian features are the main causes of detection failures. Existing studies suggest that consistent feature fusion, semantic context mining and inherent pedestrian attributes are feasible remedies. In this paper, we address these problems with an anchor-free pedestrian detector named CAPNet (context and attribute perception). First, a feature extraction module with a multi-stage, parallel-stream structure generates features that combine consistent, well-defined semantics with local details. Then, a global feature mining and aggregation (GFMA) network implicitly reconfigures, reassigns and aggregates features to suppress irrelevant background features. Finally, to bring more heuristic rules into the network, we improve the detection head with an attribute-guided multiple receptive field (AMRF) module that uses pedestrian shape as an attribute to guide learning. Experimental results demonstrate that context and attribute perception greatly facilitates detection, and CAPNet achieves new state-of-the-art performance on the Caltech and CityPersons datasets.

Funder

National Key R&D Program of China

BUPT innovation and entrepreneurship support program

Publisher

MDPI AG

Subject

Electrical and Electronic Engineering, Computer Networks and Communications, Hardware and Architecture, Signal Processing, Control and Systems Engineering

Link

https://www.mdpi.com/2079-9292/12/8/1781/pdf

References: 52 articles.

1. Li, P., Chen, X., and Shen, S. (2019, June 15–20). Stereo R-CNN based 3D Object Detection for Autonomous Driving. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.

2. Tseng, B.L., Lin, C.Y., and Smith, J.R. (2002, August 26–29). Real-time video surveillance for traffic monitoring using virtual line analysis. Proceedings of the IEEE International Conference on Multimedia and Expo, Lausanne, Switzerland.

3. Bergmann, P., Meinhardt, T., and Leal-Taixe, L. (2019, October 27–November 2). Tracking without bells and whistles. Proceedings of the IEEE International Conference on Computer Vision, Seoul, Republic of Korea.

4. Zhang, L., Lin, L., Liang, X., and He, K. (2016, October 11–14). Is Faster R-CNN Doing Well for Pedestrian Detection? Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands.

5. Zhang, S., and Li, S.Z. (2018, September 8–14). Occlusion-aware R-CNN: Detecting Pedestrians in a Crowd. Proceedings of the European Conference on Computer Vision, Munich, Germany.

Cited by 1 article.

1. Reparameterized dilated architecture: A wider field of view for pedestrian detection. Applied Intelligence, 2024-01.