Author:
Pipanmekaporn Luepol, Kamonsantiroj Suwatchai, Ratanavilisagul Chiabwoot, Prasomphan Sathit
Abstract
Landmark retrieval, which aims to find landmark images similar to a query photo within a massive image database, has received considerable attention for many years. Even so, retrieving landmarks quickly and accurately remains challenging. To tackle this problem, we present a deep learning model called the Spatial-Pyramid Attention network (SPA). The network is an end-to-end convolutional network that incorporates a spatial-pyramid attention layer. This layer encodes the input image and uses the spatial pyramid structure to highlight regional features according to their relative spatial distinctiveness. An image descriptor is then generated by aggregating these regional features. In our experiments on the benchmark datasets Oxford5k, Paris6k, and Landmark-100, SPA achieves a mean Average Precision (mAP) of 85.3% on Oxford5k, 89.6% on Paris6k, and 80.4% on Landmark-100, outperforming existing state-of-the-art deep image retrieval models.
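For readers unfamiliar with the reported metric, the following is a minimal sketch of how mean Average Precision is commonly computed for a retrieval system. It is not the authors' evaluation code: the function names `average_precision` and `mean_average_precision`, the dictionary-based inputs, and the toy image identifiers are all illustrative assumptions, and the official benchmark evaluation protocols may differ in details such as how ambiguous images are treated.

```python
from typing import Dict, Hashable, Iterable, Sequence


def average_precision(ranked_ids: Sequence[Hashable],
                      relevant_ids: Iterable[Hashable]) -> float:
    """Average precision for one query (illustrative helper, not from the paper).

    ranked_ids   : database image ids, ordered from most to least similar.
    relevant_ids : ids of the images that truly show the query landmark.
    """
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, image_id in enumerate(ranked_ids, start=1):
        if image_id in relevant:
            hits += 1
            # Precision measured at the rank of each relevant image.
            precision_sum += hits / rank
    # Relevant images that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(rankings: Dict[Hashable, Sequence[Hashable]],
                           ground_truth: Dict[Hashable, Iterable[Hashable]]) -> float:
    """Mean of the per-query average precision over all queries."""
    scores = [average_precision(rankings[q], ground_truth[q]) for q in ground_truth]
    return sum(scores) / len(scores) if scores else 0.0


# Toy example with made-up ids (assumption, not data from the paper).
toy_rankings = {"q1": ["a", "b", "c", "d"], "q2": ["x", "a"]}
toy_truth = {"q1": {"a", "c"}, "q2": {"a"}}
toy_map = mean_average_precision(toy_rankings, toy_truth)
```

In this toy case the first query scores (1/1 + 2/3) / 2 and the second scores 1/2, so the mean rewards systems that place relevant images near the top of the ranking.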
Subject
Computer Graphics and Computer-Aided Design, Computer Science Applications, Computer Vision and Pattern Recognition
Cited by
1 article.