1. Chen, S. (2013). Text extraction using regular expressions. Digitization in the Humanities Workshop. Rice University.
2. Chen, S. P., Che, Q., Ling, C., Schäfer, D., & Wang, H. (2017). Treating a genre as a database: the Chinese local gazetteers, the LG tools, and research based on this new digital methodology. In R. Lewis, C. Raynor, D. Forest, M. Sinatra, & S. Sinclair (Eds.), Digital Humanities 2017: conference abstracts (pp. 53–54). McGill University & Université de Montréal.
3. Cheng, N., Li, B., Xiao, L., Xu, C., Ge, S., Hao, X., & Feng, M. (2020). Integration of automatic sentence segmentation and lexical analysis of Ancient Chinese based on BiLSTM-CRF model. In Proceedings of LT4HALA 2020 - 1st Workshop on Language Technologies for Historical and Ancient Languages (pp. 52–58). European Language Resources Association (ELRA).
4. Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers) (pp. 4171–4186). Association for Computational Linguistics.
5. Fuller, M. A. (2020). The China biographical database user’s guide. https://projects.iq.harvard.edu/files/chinesecbdb/files/cbdb_users_guide.pdf