Affiliation:
1. University of Electronic Science and Technology of China, Chengdu, P. R. China
Abstract
Because of large variations in scale, counting in scenes of different densities is an extremely difficult task. In this paper, we propose a new attention-based, self-weighted multi-scale fusion network named SMFNet that addresses multi-scale variation and significantly improves crowd counting in surveillance scenes. The proposed SMFNet uses VGG as the backbone network to extract multi-scale features, uses an SMFNet as the neck to fuse the multi-scale features, and uses an atrous spatial pyramid pooling (ASPP) network and ordinary convolution as the head to generate both an attention map and a density map. The attention map highlights crowd regions in the image and contributes to a high-quality density map, while the density map records the crowd distribution. The number of people in the image is obtained by summing the pixel values of the density map. Experiments on three crowd counting datasets and one vehicle counting dataset show that the proposed SMFNet improves on state-of-the-art counting methods.
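The counting step described in the abstract, summing the pixel values of the density map, can be illustrated with a minimal sketch. It is not the authors' implementation: the toy density map is built by placing one unit-mass Gaussian per annotated head, a common convention in crowd counting that the abstract does not confirm, and the function names, kernel size, sigma, and head coordinates are all hypothetical.

```python
import math


def gaussian_kernel(size=7, sigma=1.5):
    """Return a size x size Gaussian kernel normalized to sum to 1."""
    half = size // 2
    kernel = [
        [math.exp(-((x - half) ** 2 + (y - half) ** 2) / (2 * sigma ** 2))
         for x in range(size)]
        for y in range(size)
    ]
    total = sum(sum(row) for row in kernel)
    return [[v / total for v in row] for row in kernel]


def make_density_map(height, width, heads, size=7, sigma=1.5):
    """Toy density map: one unit-mass Gaussian per head at (row, col).

    Assumption: this stands in for the density map a network would predict.
    """
    kernel = gaussian_kernel(size, sigma)
    half = size // 2
    density = [[0.0] * width for _ in range(height)]
    for r, c in heads:
        for dy in range(size):
            for dx in range(size):
                y, x = r + dy - half, c + dx - half
                if 0 <= y < height and 0 <= x < width:
                    density[y][x] += kernel[dy][dx]
    return density


def count_from_density(density):
    """Estimated count is the sum of all pixel values of the density map."""
    return sum(sum(row) for row in density)


# Hypothetical head annotations, kept away from the image borders
# so that no kernel mass is clipped.
heads = [(10, 10), (20, 35), (40, 22)]
density = make_density_map(64, 64, heads)
estimated_count = count_from_density(density)
print(round(estimated_count, 3))
```

Because each kernel is normalized to unit mass, the sum over the map matches the number of annotated heads (up to floating-point error), which is why a predicted density map can be integrated to give a count.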
Publisher
World Scientific Pub Co Pte Ltd
Subject
Artificial Intelligence, Computer Vision and Pattern Recognition, Software