1. SSAST: Self-Supervised Audio Spectrogram Transformer
2. A four-stage data augmentation approach to ResNet-Conformer based acoustic modeling for sound event localization and detection;wang;CoRR,2021
3. FSD50K: An open dataset of human-labeled sound events;fonseca;IEEE/ACM TASLP,2021
4. Sound event localization and detection with pre-trained audio spectrogram transformer and multichannel separation network;scheibler;DCASE Workshop,2022