1. A Pretrainer’s Guide to Training Data: Measuring the Effects of Data Age, Domain Coverage, Quality, & Toxicity;Longpre;arXiv,2023
2. Holistic evaluation of language models;Liang;Transactions on Machine Learning Research,2023
3. Taxonomy of Risks posed by Language Models;Weidinger,2022
4. RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models;Gehman,2020
5. Extracting Training Data from Large Language Models;Carlini,2021