1. Long short-term memory;Hochreiter;Neural Computation,1997
2. Conditional random fields: Probabilistic models for segmenting and labeling sequence data;Lafferty,2001
3. Neural reranking for named entity recognition;Yang,2017
4. Neural models for sequence chunking;Zhai,2017
5. J. Devlin, M. W. Chang and K. Lee, “Bert: Pre-training of deep bidirectional transformers for language understanding,” arXiv preprint arXiv: 1810.04805, 2018.