Many relevant sound events occur in urban scenarios, and robust classification models are required to identify abnormal and relevant events correctly. These models need to identify such events in a timely manner, both accurately and promptly. It is also essential to determine how long these events last. This article presents an extensive analysis carried out to identify the best-performing model for classifying a broad set of sound events occurring in urban scenarios. Transformer models were analysed and modelled using publicly available datasets with different sets of sound classes. Their performance was compared with that of the baseline model and of end-to-end convolutional models. Furthermore, the benefits of pre-training on the image and sound domains and of data augmentation techniques were identified. Additionally, complementary methods that have been used to improve the models' performance, and good practices for obtaining robust sound classification models, were investigated. After an extensive evaluation, the most promising results were obtained with a Transformer model that used a novel Adam optimizer with weight decay and transfer learning from the audio domain by reusing weights pre-trained on AudioSet. This configuration reached an accuracy of 89.8% on the UrbanSound8K dataset, 95.8% on the ESC-50 dataset, and 99% on the ESC-10 dataset.
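The abstract does not spell out the optimizer's update rule. As a minimal sketch, assuming "Adam optimizer with weight decay" refers to decoupled weight decay in the style commonly called AdamW, a single update step could look like the following. The function name `adamw_step` and all hyperparameter defaults are illustrative assumptions, not values taken from the article.

```python
import math


def adamw_step(params, grads, m, v, t,
               lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8, weight_decay=0.01):
    """One Adam update with decoupled weight decay over flat lists of floats.

    Hypothetical helper for illustration only; hyperparameters are assumed.
    `t` is the 1-based step count. Returns new (params, m, v) lists.
    """
    new_params, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(params, grads, m, v):
        # Exponential moving averages of the gradient and its square.
        m_i = beta1 * m_i + (1 - beta1) * g
        v_i = beta2 * v_i + (1 - beta2) * g * g
        # Bias correction for the zero-initialised moment estimates.
        m_hat = m_i / (1 - beta1 ** t)
        v_hat = v_i / (1 - beta2 ** t)
        # Decoupled weight decay: shrink the weight directly,
        # instead of adding an L2 term to the gradient.
        p = p - lr * weight_decay * p
        # Standard Adam step on the decayed weight.
        p = p - lr * m_hat / (math.sqrt(v_hat) + eps)
        new_params.append(p)
        new_m.append(m_i)
        new_v.append(v_i)
    return new_params, new_m, new_v


if __name__ == "__main__":
    weights, m, v = [1.0, -2.0], [0.0, 0.0], [0.0, 0.0]
    weights, m, v = adamw_step(weights, [0.5, -0.5], m, v, t=1)
    print(weights)
```

In this sketch the decay term acts on the weight itself, so a weight with a zero gradient still shrinks by a factor of `1 - lr * weight_decay` per step, which is the property that distinguishes decoupled decay from an L2 penalty folded into the gradient.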
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9699161 | PMC |
| http://dx.doi.org/10.3390/s22228874 | DOI Listing |
Distributed acoustic sensing (DAS) is a technology that uses optical fiber as a sensing unit to detect external vibration signals. Due to the high resolution and high sensitivity of DAS, it has great application potential in the detection of vibration events. However, high detection performance will bring limitations to DAS in multi-source detection.
Physiol Meas
January 2025
Nanchang University, 1st Affiliated Hospital of Nanchang University, Nanchang, Jiangxi, 330031, China.
Background And Objective: In contrast to respiratory sound classification, respiratory phase and adventitious sound event detection provides more detailed and accurate respiratory information, which is clinically important for respiratory disorders. However, current respiratory sound event detection models mainly use convolutional neural networks to generate frame-level predictions. A significant drawback of the frame-based model lies in its pursuit of optimal frame-level predictions rather than the best event-level ones.
Anal Chem
January 2025
ICGM, Univ. Montpellier, CNRS, ENSCM, 34000 Montpellier, France.
In this contribution, we apply our newly developed ball-milling platform, which combines Raman spectroscopy and thermal (IR) imaging, as well as acoustic and high-speed optical video recordings, to the synthesis and transformation of citric acid-isonicotinamide (1:2) cocrystal polymorphs in transparent PMMA jars. Particularly, we demonstrate how Raman, temperature, acoustic, and video data are complementary and enable detection and connection of chemical and physical events happening during ball-milling in a time-resolved manner. Importantly, we show that the formation of the three cocrystal polymorphs can be detected through acoustic analyses solely.
J Fish Dis
January 2025
Cawthron Institute, Nelson, New Zealand.
Intracellular, free-floating and biofilm-forming bacterial pathogens have been implicated in summer mortality of farmed Chinook salmon, Oncorhynchus tshawytscha, in New Zealand. A mortality event in 2022 in the Pelorus Sound, Marlborough, was linked to high water temperatures (> 18°C), and bacterial skin disease associated with Piscirickettsia spp. (=Rickettsia-like organisms) and Tenacibaculum species.
Neuroimage
January 2025
Department of Computer Science, University of Innsbruck, Technikerstrasse 21a, Innsbruck, 6020, Austria.
The objective of this study is to assess the potential of a transformer-based deep learning approach applied to event-related brain potentials (ERPs) derived from electroencephalographic (EEG) data. Traditional methods involve averaging the EEG signal of multiple trials to extract valuable neural signals from the high noise content of EEG data. However, this averaging technique may conceal relevant information.