Abstract
Image similarity measurement is a fundamental problem in computer vision. It is widely used in image classification, object detection, image retrieval, and related tasks, mostly through Siamese or triplet networks. These networks consist of two or three identical convolutional neural network (CNN) branches that share weights to produce high-level image feature representations, so that similar images are mapped close to each other in the feature space and dissimilar images are mapped far apart. The triplet network in particular is regarded as the state-of-the-art method for image similarity measurement. However, a basic CNN can only handle fixed-size images. If a fixed-size image is obtained by cropping or scaling, image information is lost and recognition accuracy is reduced. To solve this problem, this paper proposes the triplet spatial pyramid pooling network (TSPP-Net), which combines the triplet convolutional neural network with spatial pyramid pooling. Additionally, we propose an improved triplet loss function that lets the network model perform distance learning twice while taking only three samples as input at a time. Theoretical analysis and experiments show that the TSPP-Net model and the improved triplet loss function improve the generalization ability and accuracy of the image similarity measurement algorithm.
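The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the TSPP-Net implementation: the function names, the pyramid levels (1, 2, 4), the single-channel feature map, and the margin value are all assumptions made for illustration, and the loss shown is the standard triplet loss, not the improved loss the paper proposes (whose form the abstract does not give).

```python
def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a 2D feature map (list of rows) into a fixed-length vector.

    For each pyramid level n (assumed values), the map is split into an
    n x n grid of bins and the maximum of each bin is kept, so the output
    length is sum(n * n for n in levels) whatever the input size.
    """
    height, width = len(feature_map), len(feature_map[0])
    pooled = []
    for n in levels:
        for i in range(n):
            # Floor for the start and ceiling for the end of each bin,
            # so every bin contains at least one row and one column.
            r0, r1 = (i * height) // n, ((i + 1) * height + n - 1) // n
            for j in range(n):
                c0, c1 = (j * width) // n, ((j + 1) * width + n - 1) // n
                pooled.append(max(feature_map[r][c]
                                  for r in range(r0, r1)
                                  for c in range(c0, c1)))
    return pooled


def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss (not the paper's improved variant).

    It is zero once the negative is farther from the anchor than the
    positive by at least `margin` (an assumed hyperparameter).
    """
    return max(0.0, squared_distance(anchor, positive)
               - squared_distance(anchor, negative) + margin)


# Feature maps of different sizes pool to vectors of the same length,
# so the three branches can be compared without cropping or scaling.
small = [[float(r * 3 + c) for c in range(3)] for r in range(3)]
large = [[float(r * 7 + c) for c in range(7)] for r in range(5)]
vec_small = spatial_pyramid_pool(small)
vec_large = spatial_pyramid_pool(large)
print(len(vec_small), len(vec_large))
```

In a real network the pooling would be applied per channel to the last convolutional feature map, and the pooled vectors of the anchor, positive, and negative branches would feed the loss.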
Funder
National Natural Science Foundation of China
Fundamental Research Funds for the Central Universities of Central South University
Cited by
15 articles.