1. Probing classifiers: Promises, shortcomings, and advances;Belinkov;Computational Linguistics,2022
2. Sparks of artificial general intelligence: Early experiments with GPT-4;Bubeck,2023
3. Amnesic probing: Behavioral explanation with amnesic counterfactuals;Elazar;Transactions of the Association for Computational Linguistics,2021
4. A mathematical framework for transformer circuits;Elhage,2021
5. Activity–weight duality in feed-forward neural networks reveals two co-determinants for generalization;Feng;Nature Machine Intelligence,2023