Abstract
Address matching, which aims to match an input descriptive address with a standard address in an address database, is a key technology for achieving data spatialization. The construction of today’s smart cities depends heavily on the precise matching of Chinese addresses. Existing methods that rely on rules or text similarity struggle when dealing with nonstandard address data. Deep-learning-based methods often require extracting address semantics for embedded representation, which not only complicates the matching process, but also affects the understanding of address semantics. Inspired by deep transfer learning, we introduce an address matching approach based on a pretraining fine-tuning model to identify semantic similarities between various addresses. We first pretrain the address corpus to enable the address semantic model (abbreviated as ASM) to learn address contexts unsupervised. We then build a labelled address matching dataset using an address-specific geographical feature, allowing the matching problem to be converted into a binary classification prediction problem. Finally, we fine-tune the ASM using the address matching dataset and compare the output with several popular address matching methods. The results demonstrate that our model achieves the best performance, with precision, recall, and an F1 score above 0.98.
Funder
National Natural Science Foundation of China
Scientific Research Fund of Zhejiang Provincial Education Department
Natural Science Foundation of Zhejiang Province
Subject
Fluid Flow and Transfer Processes,Computer Science Applications,Process Chemistry and Technology,General Engineering,Instrumentation,General Materials Science
Cited by
6 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献