Affiliation:
1. School of Information Engineering, East China Jiaotong University, Nanchang, China
Abstract
Small object detection remains a bottleneck because small objects carry little visual information, especially in the deep layers of a network. To improve detection performance on small objects, Swin Transformer is introduced as the backbone network to extract rich small-object features. A multilevel receptive field expansion network (MRFENet) is then proposed, based on the characteristics of the different stages of the Swin Transformer. Specifically, a receptive field expansion block (RFEB) is designed to acquire contextual cues and extract detailed information. Each RFEB is tailored to the receptive field required at its layer and further refines the features. Combined with the RFEBs, MRFENet retains the context cues of small objects and acquires receptive fields adapted to the detection task. Finally, a union loss function is designed to enhance localization ability. Experiments on the MS COCO dataset show that MRFENet significantly outperforms other state-of-the-art methods, which further supports the claim that MRFENet makes effective use of small-object information.
Funder
Natural Science Foundation of Jiangxi Province
Education Department of Jiangxi Province
Publisher
Institution of Engineering and Technology (IET)
Subject
Electrical and Electronic Engineering, Computer Vision and Pattern Recognition, Signal Processing, Software
Cited by
5 articles.