Affiliation:
1. Department of Information Science and Engineering, Xinjiang University, Urumqi 830000, China
2. Department of Electronic Engineering, Tsinghua University, Beijing 100084, China
Abstract
Sound source localization and detection (SSLD) is the joint task of identifying the presence of individual sound events and locating their sources in space. However, the diversity of sound events and the variability of sound source locations make SSLD a difficult task. In this paper, we propose an SSLD method based on a multi-scale dense connection (MDC) mechanism and a residual attention (RA) mechanism. We design an MDC block that integrates information from a very local receptive field to exponentially enlarged receptive fields within the block. We also explore three kinds of RA blocks that facilitate information flow among different layers by continuously adding feature maps from previous layers to the next layer. To recalibrate the feature maps after the convolutional operations, we design a dual-path attention (DPA) unit that is largely incorporated into the MDC and RA blocks. We first verify the effectiveness of the MDC block, RA block, and DPA unit individually. We then compare the proposed method with four other methods on the development dataset. Finally, comparisons with SELDnet and SELD-TCN on five additional datasets demonstrate the generalization ability of the proposed method.
Funder
National Natural Science Foundation of China
Funds for Creative Research Groups of Higher Education of Xinjiang
Tianshan Innovation Team Plan Project of Xinjiang
Publisher
Acoustical Society of America (ASA)
Subject
Acoustics and Ultrasonics, Arts and Humanities (miscellaneous)
Cited by
3 articles.