Abstract
This paper focuses mainly on the problem of Chinese medical question answer matching, which is arguably more challenging than open-domain question answer matching in English due to the combination of its domain-restricted nature and the language-specific features of Chinese. We present an end-to-end character-level multi-scale convolutional neural framework in which character embeddings instead of word embeddings are used to avoid Chinese word segmentation in text preprocessing, and multi-scale convolutional neural networks (CNNs) are then introduced to extract contextual information from either question or answer sentences over different scales. The proposed framework can be trained with minimal human supervision and does not require any handcrafted features, rule-based patterns, or external resources. To validate our framework, we create a new text corpus, named cMedQA, by harvesting questions and answers from an online Chinese health and wellness community. The experimental results on the cMedQA dataset show that our framework significantly outperforms several strong baselines, and achieves an improvement of top-1 accuracy by up to 19%.
Funder
National Natural Science Foundation of China
Subject
Fluid Flow and Transfer Processes,Computer Science Applications,Process Chemistry and Technology,General Engineering,Instrumentation,General Materials Science
Cited by
50 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献