1. Yoshua Bengio. 2013. Estimating or propagating gradients through stochastic neurons. arXiv preprint arXiv:1305.2982 (2013).
2. Kaifeng Bi, Changping Hu, Lingxi Xie, Xin Chen, Longhui Wei, and Qi Tian. 2019. Stabilizing DARTS with amended gradient estimation on architectural parameters. arXiv preprint arXiv:1910.11831 (2019).
3. Kaifeng Bi, Lingxi Xie, Xin Chen, Longhui Wei, and Qi Tian. 2020. GOLD-NAS: Gradual, one-level, differentiable. arXiv preprint arXiv:2007.03331 (2020).
4. Han Cai, Ligeng Zhu, and Song Han. 2018. ProxylessNAS: Direct neural architecture search on target task and hardware. arXiv preprint arXiv:1812.00332 (2018).
5. Multitask Learning: A Knowledge-Based Source of Inductive Bias