New Era for Geo-Parsing to Obtain Actual Locations: A Novel Toponym Correction Method Based on Remote Sensing Images

Author:

Wang ShuORCID,Yan XinrongORCID,Zhu Yunqiang,Song Jia,Sun KaiORCID,Li WeirongORCID,Hu Lei,Qi Yanmin,Xu Huiyao

Abstract

Geo-parsing, one of the key components of geographical information retrieval, is a process to recognize and geo-locate toponyms mentioned in texts. Such a process can obtain locations contained in toponyms successfully with consistent updating of neural network models and multiple contextual features. The significant offset distance between the geo-parsed locations and the actual occurrence locations still remains. This is because the geo-parsed locations sourced from toponyms in texts always point to the centers of cities, counties, or towns, and cannot directly represent the actual occurrence locations such as factories, farms, and activity areas. Consequently, The significant offset distances between the geo-parsed locations and the actual occurrence locations limit text mining applications in micro-scale geographic discoveries. This research aims at decreasing offset distances of geo-parsed locations by proposing a novel Toponym Correction Method based on satellite Remote Sensing Images (TC-RSI). The TC-RSI method uses satellite remote sensing images to provide extra detailed spatial information that can be associated with the sentence toponym by corresponding attributes. The TC-RSI method was validated in a case study of the forest ecological pattern dataset of An’hui province from visual, statistical, and robustness assessments. The correction results show that the TC-RSI method dramatically decreases the offset distances from about 50 km to about 1 km and promotes geographical discoveries on smaller scales. A series of analyses indicated that the TC-RSI is a valid, effective, and promising method to improve the accuracy of geo-parsed locations, which allows text mining to find more accurate geographical discoveries with lower offset distances. Moreover, toponym correction promotes the use of more diverse spatial data sources, such as Lidar, domain gazetteers, Wikimedia, and streetscapes, which are expected to usher in a new era of geo-parsing with toponym corrections.

Funder

National Natural Science Foundation of China

Strategic Priority Research Program of the Chinese Academy of Sciences

Publisher

MDPI AG

Subject

General Earth and Planetary Sciences

Reference50 articles.

1. Geographic Information Retrieval: Progress and Challenges in Spatial Search of Text

2. Are we there yet? evaluating state-of-the-art neural network based geoparsers using EUPEG as a benchmarking platform;Wang;Proceedings of the 3rd ACM SIGSPATIAL International Workshop on Geospatial Humanities,2019

3. Geo-semantic-parsing: AI-powered geoparsing by traversing semantic knowledge graphs

4. Unsupervised word embeddings capture latent knowledge from materials science literature

5. Text-mining tool seeks out ‘hidden data’

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3