Affiliation:
1. Harbin Institute of Technology, Harbin, China, P.C.
Abstract
Sequence labeling is a core task in natural language processing. The maximum entropy Markov model (MEMM) is a powerful tool in performing this task. This article enhances the traditional MEMM by exploiting the positional information of language elements. The stationary hypothesis is relaxed in MEMM, and the nonstationary MEMM (NS-MEMM) is proposed. Several related issues are discussed in detail, including the representation of positional information, NS-MEMM implementation, smoothing techniques, and the space complexity issue. Furthermore, the asymmetric NS-MEMM presents a more flexible way to exploit positional information. In the experiments, NS-MEMM is evaluated on both the Chinese and the English pos-tagging tasks. According to the experimental results, NS-MEMM yields effective improvements over MEMM by exploiting positional information. The smoothing techniques in this article effectively solve the NS-MEMM data-sparseness problem; the asymmetric NS-MEMM is also an improvement by exploiting positional information in a more flexible way.
Publisher
Association for Computing Machinery (ACM)
Cited by
3 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献