Author:
Ge-Peng Ji, Deng-Ping Fan, Keren Fu, Zhe Wu, Jianbing Shen, Ling Shao
Abstract
Previous video object segmentation approaches mainly focus on simplex solutions linking appearance and motion, limiting effective feature collaboration between these two cues. In this work, we study a novel and efficient full-duplex strategy network (FSNet) to address this issue, by considering a better mutual restraint scheme linking motion and appearance, allowing exploitation of cross-modal features from the fusion and decoding stage. Specifically, we introduce a relational cross-attention module (RCAM) to achieve bidirectional message propagation across embedding sub-spaces. To improve the model's robustness and update inconsistent features from the spatiotemporal embeddings, we adopt a bidirectional purification module after the RCAM. Extensive experiments on five popular benchmarks show that our FSNet is robust to various challenging scenarios (e.g., motion blur and occlusion), and compares well to leading methods both for video object segmentation and video salient object detection. The project is publicly available at https://github.com/GewelsJI/FSNet.
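The core idea of the RCAM is that each modality (appearance and motion) queries the other, so messages flow in both directions rather than one cue merely gating the other. The following is a minimal NumPy sketch of such bidirectional cross-attention with residual fusion; it is an illustrative simplification, not the authors' implementation (the function names, the single-head attention without learned projections, and the residual-addition fusion are all assumptions for clarity).

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(query_feats, context_feats):
    # Each query token attends over all context tokens
    # (scaled dot-product attention, single head, no learned weights).
    scores = query_feats @ context_feats.T / np.sqrt(query_feats.shape[-1])
    return softmax(scores, axis=-1) @ context_feats

def bidirectional_fusion(appearance, motion):
    # "Full-duplex" message passing: appearance queries motion AND
    # motion queries appearance; each attended message is fused back
    # into its own stream via a residual addition.
    app_enhanced = appearance + cross_attend(appearance, motion)
    mot_enhanced = motion + cross_attend(motion, appearance)
    return app_enhanced, mot_enhanced

rng = np.random.default_rng(0)
appearance = rng.standard_normal((16, 32))  # 16 tokens, 32-dim appearance features
motion = rng.standard_normal((16, 32))      # matching motion (optical-flow) features
app_out, mot_out = bidirectional_fusion(appearance, motion)
print(app_out.shape, mot_out.shape)  # (16, 32) (16, 32)
```

The symmetry of the two `cross_attend` calls is what distinguishes this from the "simplex" designs the abstract criticizes, where only one direction of influence (typically motion conditioning appearance) is modeled.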
Publisher
Springer Science and Business Media LLC
Subject
Artificial Intelligence, Computer Graphics and Computer-Aided Design, Computer Vision and Pattern Recognition
Cited by
7 articles.