Affiliation:
1. The State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China
Abstract
Learning the full extent of pixel-level instance response in a weakly supervised manner remains unsatisfactory. Peak response maps (PRMs) localize the discriminative object regions but cannot provide complete instance information, so they suffer from incomplete segmentation and from unreliable mask prediction caused by noisy proposal retrieval. This work tackles this challenging problem by mining diverse class peak responses that cover more discriminative and complete object regions, and by retrieving more reliable proposals from noisy segment proposal galleries. First, the existing method is enhanced with two additional classification branches, which contribute more diverse and abundant instance regions to the peak response maps. The class peak responses mined from two of the branches are then merged, using a clustering approach in their deep feature space, to generate more complete peak response maps. Next, instance segmentation masks are retrieved from a noisy object segment proposal gallery using class confidence computed by a normal classifier, which yields cleaner mask predictions. Finally, the resulting pseudo-supervision can be used to train an instance segmentation network in a fully supervised manner. Experiments on the PASCAL VOC 2012 and COCO datasets show that the approach works effectively and outperforms its counterparts by margins of more than 6%, 4%, and 3% in mean average precision (mAP) at IoU thresholds of 0.25, 0.5, and 0.75, respectively.
Funder
Fundamental Research Funds for the Central Universities
Publisher
Institution of Engineering and Technology (IET)