1. 2018. Digital archive mobile performances (DAMP). https://ccrma.stanford.edu/damp/publications/
2. Rosana Ardila, Megan Branson, Kelly Davis, Michael Henretty, Michael Kohler, Josh Meyer, Reuben Morais, Lindsay Saunders, Francis M Tyers, and Gregor Weber. 2019. Common voice: A massively-multilingual speech corpus. arXiv preprint arXiv:1912.06670 (2019).
3. Alexei Baevski and Abdelrahman Mohamed. 2020. Effectiveness of self-supervised pre-training for ASR. In ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 7694–7698.
4. Alexei Baevski, Yuhao Zhou, Abdelrahman Mohamed, and Michael Auli. 2020. wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations. In Advances in Neural Information Processing Systems, H. Larochelle, M. Ranzato, R. Hadsell, M.F. Balcan, and H. Lin (Eds.). Vol. 33. Curran Associates, Inc., 12449–12460. https://proceedings.neurips.cc/paper/2020/file/92d1e1eb1cd6f9fba3227870bb6d7f07-Paper.pdf
5. Deep Speaker Embeddings for Short-Duration Speaker Verification