1. Attention is all you need;Vaswani;Adv Neural Inf Process Syst,2017
2. Large pre-trained language models contain human-like biases of what is right and wrong to do;Schramowski;Nat Mach Intell,2022
3. BERT: Pre-training of deep bidirectional transformers for language understanding;Devlin;arXiv preprint,2018
4. A literature review on bidirectional encoder representations from transformers;Shreyashree,2022
5. Attention is not all you need: the complicated case of ethically using large language models in healthcare and medicine;Harrer;EBioMedicine,2023