Affiliation:
1. Institute of Geospatial Information, Information Engineering University, Zhengzhou 450001, China
Abstract
Non-standard Chinese address texts typically have coarse content, complex and varied forms of presentation, and inconsistent hierarchical granularity, which lowers the accuracy of Chinese address parsing. We therefore propose a method for parsing non-standard Chinese address text that integrates the Chinese Toponym Named Entity Recognition (CHTopoNER) model with a dynamic finite state machine (FSM). First, the CHTopoNER model performs named entity recognition. Sets of dynamic FSMs are then constructed from the hierarchical characteristics of addresses to sort and combine the Chinese address elements, thereby parsing Chinese addresses from internet text. The method parsed both standard and non-standard placename addresses accurately, and it handled addresses with disordered or missing hierarchical elements better than traditional FSM-based methods. Specifically, it achieved accuracies of 96.6% and 96.8% for standard and non-standard placenames, respectively. These accuracies were 8.0% and 57.1% higher, respectively, than those of the CHTopoNER model integrated with a traditional FSM, and 7.4% and 19.8% higher, respectively, than those of the CHTopoNER model integrated with a bidirectional FSM. Our analysis indicates that the method is scalable and adaptable and can be applied to various types of address-parsing tasks.
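The two-stage idea in the abstract (recognize address elements, then use an FSM built from the address hierarchy to sort and combine them) can be illustrated with a minimal sketch. This is not the authors' implementation: the level names in `LEVELS`, the functions `build_transitions` and `parse_address`, and the sample address are all hypothetical, and the input is assumed to be a list of `(level, text)` pairs standing in for the CHTopoNER output. The state machine here is "dynamic" only in the limited sense that its transition table is rebuilt for each input from the levels that are actually present, so missing levels are skipped and disordered elements are emitted in hierarchical order.

```python
# Hypothetical hierarchy, coarse to fine. The abstract does not list the
# actual tag set used by CHTopoNER, so these names are assumptions.
LEVELS = ["province", "city", "district", "road", "number"]


def build_transitions(present):
    """Build a transition table covering only the levels present in the input.

    The chain runs START -> coarsest present level -> ... -> END, so a
    missing level never blocks the machine.
    """
    chain = [level for level in LEVELS if level in present]
    states = ["START"] + chain + ["END"]
    return {states[i]: states[i + 1] for i in range(len(states) - 1)}


def parse_address(elements):
    """Sort and combine tagged address elements into hierarchical order.

    `elements` is a list of (level, text) pairs in arbitrary order,
    standing in for named entity recognition output.
    """
    by_level = {}
    for level, text in elements:
        if level not in LEVELS:
            raise ValueError(f"unknown address level: {level}")
        # Elements that share a level are kept in their input order.
        by_level.setdefault(level, []).append(text)

    transitions = build_transitions(by_level.keys())
    state = transitions["START"]
    parsed = []
    while state != "END":
        # Combine all elements of the current level into one string.
        parsed.append((state, "".join(by_level[state])))
        state = transitions[state]
    return parsed


if __name__ == "__main__":
    # Made-up input: disordered, with the province and district missing.
    tagged = [("road", "科学大道"), ("city", "郑州市"), ("number", "62号")]
    for level, text in parse_address(tagged):
        print(level, text)
```

Rebuilding the table per input keeps the sketch short; a fixed FSM over all levels would instead need explicit skip transitions for every missing-level combination.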
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science