1. Luís B Almeida, Thibault Langlois, José D Amaral, and Alexander Plakhov. 1998. Parameter adaptation in stochastic optimization. In On-Line Learning in Neural Networks. Cambridge University Press, 111–134.
2. Atilim Gunes Baydin Robert Cornish David Martínez-Rubio Mark Schmidt and Frank Wood. 2018. Online Learning Rate Adaptation with Hypergradient Descent. In ICLR.
3. Paul N Bennett Ryen W White Wei Chu Susan T Dumais Peter Bailey Fedor Borisyuk and Xiaoyuan Cui. 2012. Modeling the impact of short-and long-term behavior on search personalization. In SIGIR.
4. Sebastian Borgeaud, Arthur Mensch, Jordan Hoffmann, Trevor Cai, Eliza Rutherford, Katie Millican, George Bm Van Den Driessche, Jean-Baptiste Lespiau, Bogdan Damoc, Aidan Clark, et al. 2022. Improving language models by retrieving from trillions of tokens. In ICML.
5. Tom Brown Benjamin Mann Nick Ryder Melanie Subbiah Jared D Kaplan Prafulla Dhariwal Arvind Neelakantan Pranav Shyam Girish Sastry Amanda Askell et al. 2020. Language models are few-shot learners. In NeurIPS.