1. Look, Listen and Learn
2. Objects that Sound
3. Yusuf Aytar, Carl Vondrick, and Antonio Torralba. 2016. Soundnet: Learning sound representations from unlabeled video. Advances in neural information processing systems, Vol. 29 (2016).
4. David A Bulkin and Jennifer M Groh. 2006. Seeing sounds: visual and auditory interactions in the brain. Current opinion in neurobiology, Vol. 16, 4 (2006), 415--419.
5. Look, Listen, and Attend: Co-Attention Network for Self-Supervised Audio-Visual Representation Learning