Abstract
2D image-based 3D shape retrieval has become a hot research topic since its wide industrial applications and academic significance. However, existing view-based 3D shape retrieval methods are restricted by two settings, 1) learn the common-class features while neglecting the instance visual characteristics, 2) narrow the global domain variations while ignoring the local semantic variations in each category. To overcome these problems, we propose a novel hierarchical instance feature alignment (HIFA) method for this task. HIFA consists of two modules, cross-modal instance feature learning and hierarchical instance feature alignment. Specifically, we first use CNN to extract both 2D image and multi-view features. Then, we maximize the mutual information between the input data and the high-level feature to preserve as much as visual characteristics of an individual instance. To mix up the features in two domains, we enforce feature alignment considering both global domain and local semantic levels. By narrowing the global domain variations we impose the identical large norm restriction on both 2D and 3D feature-norm expectations to facilitate more transferable possibility. By narrowing the local variations we propose to minimize the distance between two centroids of the same class from different domains to obtain semantic consistency. Extensive experiments on two popular and novel datasets, MI3DOR and MI3DOR-2, validate the superiority of HIFA for 2D image-based 3D shape retrieval task.
Publisher
International Joint Conferences on Artificial Intelligence Organization
Cited by
14 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献