1. Leak, cheat, repeat: data contamination and evaluation malpractices in closed-source LLMs;Balloccu,2024
2. Pythia: a suite for analyzing large language models across training and scaling;Biderman,2023
3. Language models are few-shot learners;Brown,2020
4. Fast Krippendorff: fast computation of Krippendorff’s alpha agreement measure;Castro,2017
5. LangChain;Chase,2022