1. Devlin, J., Chang, M.W., Lee, K., et al.: BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
2. Radford, A., Narasimhan, K., Salimans, T., et al.: Improving language understanding by generative pre-training (2018)
3. Lewis, M., Liu, Y., Goyal, N., et al.: BART: denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. arXiv preprint arXiv:1910.13461 (2019)
4. Zeng, Q., Xiong, W., Du, J., et al.: Named entity recognition in electronic health records using BiLSTM-CRF with self-attention. J. Comput. Appl. Softw. 38(3), 159–162, 242 (2021)
5. Jiang, S., Zhao, S., Hou, K., et al.: A BERT-BiLSTM-CRF model for Chinese electronic medical records named entity recognition. In: 2019 12th International Conference on Intelligent Computation Technology and Automation (ICICTA), pp. 166–169. IEEE (2019)