1. Heated-up softmax embedding; Zhang; CoRR, 2018
2. Attention is all you need; Vaswani; 2017
3. Adam: A method for stochastic optimization; Kingma; ICLR Procs, 2015
4. L2-constrained softmax loss for discriminative face verification; Ranjan; CoRR, 2017
5. Learning transferable visual models from natural language supervision; Radford; 2021