1. Alsentzer, E., Murphy, J.R., Boag, W., Weng, W.H., Jin, D., Naumann, T., & McDermott, M. (2019). Publicly available clinical BERT embeddings. arXiv Preprint arXiv:1904.03323.
2. Argamon, S., Koppel, M., Pennebaker, J.W., & Schler, J. (2007). Mining the blogosphere: age, gender and the varieties of self-expression. First Monday.
3. Bajaj, P., Xiong, C., Ke, G., Liu, X., He, D., Tiwary, S., Liu, T.Y., Bennett, P., Song, X., & Gao, J. (2022). METRO: Efficient Denoising Pretraining of Large Scale Autoencoding Language Models with Model Generated Signals (arXiv:2204.06644). arXiv. https://doi.org/10.48550/arXiv.2204.06644.
4. Bengio, Y., Ducharme, R., Vincent, P., & Jauvin, C. (2003). A neural probabilistic language model. Journal of Machine Learning Research, 3, 1137–1155.
5. Berger, J. (2021). Using natural language processing to understand people and culture. American Psychologist.