Triple critical feature capture network: A triple critical feature capture network for weakly supervised object detection

Published:2023-05-15 Issue:8 Volume:17 Page:895-912
ISSN:1751-9632
Container-title:IET Computer Vision
language:en
Short-container-title:IET Computer Vision

Author:

Liu Zhoufeng¹, Wang Kaihua¹, Li Chunlei¹, Ding Shunmin², Xi Jiangtao³

Affiliation:

1. School of Electronic and Information Engineering, Zhongyuan University of Technology, Zhengzhou, China

2. Department of Energy and Environment, Zhongyuan University of Technology, Zhengzhou, China

3. School of Electrical, Computer and Telecommunications Engineering, University of Wollongong, Wollongong, New South Wales, Australia

Abstract

Weakly supervised object detection (WSOD) is becoming increasingly important for computer vision tasks, as it alleviates the burden of manual annotation. Most WSOD techniques rely on multiple instance learning (MIL), which tends to localise the discriminative parts of salient objects instead of the whole object. In addition, network training is usually supervised with image‐level annotations only, without object counts or location information, which can make object instances ambiguous to distinguish in terms of both location and semantics. To address these issues, an end‐to‐end triple critical feature capture network (TCFCNet) for WSOD is proposed. Specifically, a multi‐task branch, which can perform fully supervised classification and regression tasks, is integrated with proposal cluster learning (PCL) in an end‐to‐end network to refine object locations online. A cyclic parametric dropblock module (CPDM) is then designed to help the detector focus on contextual information: it uses cyclic masking to remove as much of the discriminative components of an object instance as possible, alleviating the part domination problem. Finally, a feature decoupling module (FDM), which contains a feature enhancement module and task‐specific polarisation functions, is proposed to further reduce the ambiguity between object instances by adaptively constructing robust critical features suited to the classification and regression tasks of the multi‐task branch. Comprehensive experiments are carried out on the challenging Pascal VOC 2007 and VOC 2012 datasets. The proposed method achieves 54.6% mAP on Pascal VOC 2007 and 44.3% mAP on VOC 2012, outperforming existing mainstream techniques by a considerable margin.
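
As background for the MIL formulation mentioned in the abstract, the following is a minimal sketch of two-stream MIL scoring as commonly used in WSOD baselines (in the style of WSDDN). It is not the TCFCNet method. The function names, the list-of-lists layout, and the toy logits are assumptions made only for illustration. It shows how per-proposal scores are aggregated into image-level scores, so that training needs image-level labels only.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def mil_image_scores(cls_logits, det_logits):
    """Aggregate per-proposal logits into image-level class scores.

    cls_logits[r][c] and det_logits[r][c] are hypothetical outputs of two
    linear heads for proposal r and class c (names are illustrative).
    """
    num_props = len(cls_logits)
    num_classes = len(cls_logits[0])
    # Classification stream: softmax over classes, one row per proposal.
    cls_probs = [softmax(row) for row in cls_logits]
    # Detection stream: softmax over proposals, one column per class.
    det_probs = [
        softmax([det_logits[r][c] for r in range(num_props)])
        for c in range(num_classes)
    ]
    # Proposal score is the product of the two streams.
    prop_scores = [
        [cls_probs[r][c] * det_probs[c][r] for c in range(num_classes)]
        for r in range(num_props)
    ]
    # Image score is the sum of proposal scores, so it stays in [0, 1].
    image_scores = [
        sum(prop_scores[r][c] for r in range(num_props))
        for c in range(num_classes)
    ]
    return prop_scores, image_scores


def image_level_bce(image_scores, labels, eps=1e-7):
    """Binary cross-entropy against image-level labels (0 or 1 per class)."""
    loss = 0.0
    for score, label in zip(image_scores, labels):
        score = min(max(score, eps), 1.0 - eps)
        loss -= label * math.log(score) + (1 - label) * math.log(1.0 - score)
    return loss


# Toy input: 3 proposals, 2 classes; proposal 0 strongly supports class 0.
cls_logits = [[4.0, 0.0], [0.0, 0.0], [0.0, 1.0]]
det_logits = [[3.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
prop_scores, image_scores = mil_image_scores(cls_logits, det_logits)
loss = image_level_bce(image_scores, labels=[1, 0])
```

Because the loss only rewards a high image-level score, the proposal that covers the most discriminative part can dominate the sum. This is the part domination behaviour the abstract refers to.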

Publisher

Institution of Engineering and Technology (IET)

Subject

Computer Vision and Pattern Recognition,Software
