1. Multi-task learning using uncertainty to weigh losses for scene geometry and semantics;kendall;Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR),2018
2. Residual mixture of experts;wu;ArXiv Preprint,2022
3. Adaptive mixtures of local experts
4. Branched multi-task networks: deciding what layers to share;vandenhende;ArXiv Preprint,2019
5. GShard: Scaling giant models with conditional computation and automatic sharding;lepikhin;International Conference on Learning Representations,2021