Affiliation:
1. School of Telecommunications Engineering Xidian University Xian Shaanxi China
2. Science and Technology on Electro‐optic Control Laboratory Xian Henan China
Abstract
AbstractAccording to the fact that single‐modal data usually contain limited information, a great deal of effort has been devoted to making use of the complementary information contained in the multi‐modal data on various patterns. Thus, this paper is concerned with an object detection method that can fully utilize multi‐modal data. First, the method introduces the transformer mechanism to realize the fusion of intra‐modal and inter‐modal features of different modal data. The aim is to take advantage of the complementarity of data between modalities, which helps to improve the performance of multi‐modal object detection. Second, a contrastive loss suitable for contrastive learning is applied. This enables the authors to effectively utilize label information. Extensive experiments are conducted on multiple object detection datasets to demonstrate the effectiveness of our proposed method.
Funder
Aeronautical Science Foundation of China
Publisher
Institution of Engineering and Technology (IET)
Subject
Electrical and Electronic Engineering,Computer Vision and Pattern Recognition,Signal Processing,Software
Cited by
1 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献