In this paper, we introduce the structure image method (StIM), an extension of the image method for generating room impulse responses to structures with more than a single confined space. StIM can efficiently generate a large number of environmental examples of structure impulse responses, as required by current deep-learning methods for many tasks, while maintaining low computational complexity. We address the integration of the environment representation produced by StIM into the training process, and present a framework for training deep models. We demonstrate the use of StIM by training an audio classification model and testing it on real recordings acquired with accessible day-to-day devices. StIM shows promising results for indoor audio classification in which the target sound source is not located in the same room as the microphones. StIM enables large-scale simulations of multi-room acoustics with low computational complexity, which is especially beneficial for training deep learning networks.
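For orientation, the classical single-room image method that StIM builds on can be sketched as follows. This is a minimal illustration, not the StIM algorithm itself, whose multi-room extension is not specified in this abstract. It assumes a rectangular (shoebox) room with walls at 0 and at the room dimensions, first-order reflections only, and a single frequency-independent wall reflection coefficient. The function names `first_order_images` and `impulse_response` and all parameter values are hypothetical.

```python
import math


def first_order_images(source, room):
    """Return the source followed by its six first-order mirror images.

    Assumes a shoebox room whose walls lie at 0 and room[k] on each axis.
    """
    images = [tuple(source)]
    for axis in range(3):
        for wall in (0.0, room[axis]):
            img = list(source)
            # Mirror the source coordinate across the wall plane.
            img[axis] = 2.0 * wall - source[axis]
            images.append(tuple(img))
    return images


def impulse_response(source, mic, room, beta=0.8, fs=16000, c=343.0, length=2048):
    """Sketch of a room impulse response from direct path plus first-order images.

    beta: assumed wall reflection coefficient (same for all walls).
    fs: sampling rate in Hz; c: speed of sound in m/s.
    Assumes the microphone does not coincide with the source or an image.
    """
    h = [0.0] * length
    for i, img in enumerate(first_order_images(source, room)):
        d = math.dist(img, mic)            # path length from image to microphone
        n = round(d / c * fs)              # propagation delay in samples
        # Spherical spreading; one wall reflection for every image but the source.
        gain = (1.0 if i == 0 else beta) / (4.0 * math.pi * d)
        if n < length:
            h[n] += gain
    return h


if __name__ == "__main__":
    h = impulse_response(source=(1.0, 2.0, 1.5), mic=(3.0, 2.0, 1.5), room=(4.0, 5.0, 3.0))
    first_tap = next(k for k, v in enumerate(h) if v != 0.0)
    print("direct-path delay (samples):", first_tap)
```

Each mirrored source contributes a delayed, attenuated copy of the emitted sound, so the cost grows with the number of images rather than with a volumetric grid, which is why image-based simulators are attractive for generating large training sets.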
DOI: http://dx.doi.org/10.1121/10.0006781