Affiliation:
1. Department of Computer Engineering, Chosun University, Gwangju 61452, Republic of Korea
2. Department of Future & Smart Construction Research, Korea Institute of Civil Engineering and Building Technology (KICT), Goyang-si 10223, Republic of Korea
Abstract
In this paper, we propose a method for estimating the classes and directions of static audio objects using stereo microphones mounted on a drone. Drones are increasingly used across various fields, and the integration of sensors such as cameras and microphones has broadened their scope of application. Accordingly, we propose attaching stereo microphones to a drone to detect specific sound events and estimate their directions for emergency monitoring. Specifically, the proposed neural network produces a fixed-size set of audio-object predictions and is trained with a bipartite matching loss that compares these predictions against the ground-truth audio objects. To train the proposed network, we built an outdoor audio dataset containing speech and drone sounds. The proposed sound event identification and localization technique, based on the bipartite matching loss, outperforms the methods developed by the other teams in our group.
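To illustrate the idea of comparing a fixed-size prediction set against a variable number of ground-truth audio objects, the following is a minimal sketch (not the authors' implementation) of a bipartite matching loss computed with the Hungarian algorithm. The cost weights, angle convention, function name, and array shapes are illustrative assumptions.

```python
# Minimal sketch of a bipartite matching loss between a fixed-size set of
# predictions and ground-truth audio objects (DETR-style set prediction).
# All names, shapes, and weights are assumptions for illustration only.

import numpy as np
from scipy.optimize import linear_sum_assignment


def bipartite_matching_loss(pred_class_logits, pred_directions,
                            gt_classes, gt_directions, lambda_dir=1.0):
    """Match N fixed-size predictions to M ground-truth objects (M <= N)
    and return the summed matching cost plus the assignment.

    pred_class_logits : (N, C) unnormalized class scores
    pred_directions   : (N,)   predicted azimuth in degrees
    gt_classes        : (M,)   ground-truth class indices
    gt_directions     : (M,)   ground-truth azimuth in degrees
    """
    # Softmax over classes to obtain probabilities.
    logits = pred_class_logits - pred_class_logits.max(axis=1, keepdims=True)
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

    # Classification cost: negative probability of the true class.
    class_cost = -probs[:, gt_classes]                   # (N, M)

    # Direction cost: circular angular error, normalized to [0, 1].
    diff = np.abs(pred_directions[:, None] - gt_directions[None, :]) % 360.0
    dir_cost = np.minimum(diff, 360.0 - diff) / 180.0    # (N, M)

    cost = class_cost + lambda_dir * dir_cost

    # Hungarian algorithm: one-to-one assignment minimizing the total cost.
    row_idx, col_idx = linear_sum_assignment(cost)
    return cost[row_idx, col_idx].sum(), list(zip(row_idx, col_idx))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    loss, matches = bipartite_matching_loss(
        pred_class_logits=rng.normal(size=(4, 3)),       # 4 fixed slots, 3 classes
        pred_directions=rng.uniform(0, 360, size=4),
        gt_classes=np.array([0, 2]),                     # 2 actual audio objects
        gt_directions=np.array([45.0, 270.0]),
    )
    print("matching cost:", loss, "matches:", matches)
```

In this sketch, each ground-truth audio object is assigned to exactly one prediction slot, and unmatched slots would typically be supervised toward a "no object" class; the relative weighting of the classification and direction terms is a design choice left as a hyperparameter.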
Funder
Ministry of Land, Infrastructure and Transport