1. Andrea Agostinelli, Timo I Denk, Zalán Borsos, Jesse Engel, Mauro Verzetti, Antoine Caillon, Qingqing Huang, Aren Jansen, Adam Roberts, Marco Tagliasacchi, et al. MusicLM: Generating music from text. arXiv preprint arXiv:2301.11325, 2023.
2. Nanxin Chen, Yu Zhang, Heiga Zen, Ron J Weiss, Mohammad Norouzi, and William Chan. WaveGrad: Estimating gradients for waveform generation. arXiv preprint arXiv:2009.00713, 2020.
3. Hyung Won Chung, Le Hou, Shayne Longpre, Barret Zoph, Yi Tay, William Fedus, Eric Li, Xuezhi Wang, Mostafa Dehghani, Siddhartha Brahma, Albert Webson, Shixiang Shane Gu, Zhuyun Dai, Mirac Suzgun, Xinyun Chen, Aakanksha Chowdhery, Sharan Narang, Gaurav Mishra, Adams Yu, Vincent Zhao, Yanping Huang, Andrew Dai, Hongkun Yu, Slav Petrov, Ed H. Chi, Jeff Dean, Jacob Devlin, Adam Roberts, Denny Zhou, Quoc V. Le, and Jason Wei. Scaling instruction-finetuned language models. 2022. URL https://arxiv.org/abs/2210.11416.
4. Damai Dai, Yutao Sun, Li Dong, Yaru Hao, Zhifang Sui, and Furu Wei. Why can GPT learn in-context? Language models secretly perform gradient descent as meta-optimizers. ArXiv, abs/2212.10559, 2022.
5. Audio Set: An ontology and human-labeled dataset for audio events