The manual detection, analysis and classification of animal vocalizations in acoustic recordings is laborious and requires expert knowledge. Hence, there is a need for objective, generalizable methods that detect underlying patterns in these data, categorize sounds into distinct groups and quantify similarities between them. Among the computational methods that have been proposed to accomplish this, neighbourhood-based dimensionality reduction of spectrograms, which produces a latent space representation of calls, stands out for its conceptual simplicity and effectiveness. Using a dataset of manually annotated meerkat (Suricata suricatta) vocalizations, we demonstrate how this method can be used to obtain meaningful latent space representations that reflect the established taxonomy of call types. We analyse the strengths and weaknesses of the proposed approach, give recommendations for its usage and show application examples, such as the classification of ambiguous calls and the detection of mislabelled calls. All analyses are accompanied by example code to help researchers realize the potential of this method for the study of animal vocalizations.
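
As a rough illustration of the general idea, the sketch below turns synthetic "calls" into spectrograms, embeds them with a neighbourhood-based method and flags calls whose label disagrees with their neighbours in the latent space. It is not the authors' example code. The abstract does not name a specific algorithm, so Laplacian eigenmaps on a k-nearest-neighbour graph is used here as a simple stand-in that needs only NumPy. The synthetic tones, the two call-type names and all function names (`spectrogram`, `knn_indices`, `laplacian_eigenmap`, `flag_label_disagreements`) are assumptions made for illustration.

```python
import numpy as np

def spectrogram(signal, n_fft=128, hop=64):
    """Log-magnitude spectrogram from a Hann-windowed short-time Fourier transform."""
    window = np.hanning(n_fft)
    starts = range(0, len(signal) - n_fft + 1, hop)
    frames = np.stack([signal[s:s + n_fft] * window for s in starts])
    return np.log1p(np.abs(np.fft.rfft(frames, axis=1))).T  # (freq bins, frames)

def knn_indices(X, k):
    """Indices of the k nearest neighbours (Euclidean) of each row, excluding itself."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    np.fill_diagonal(d2, np.inf)  # a point is never its own neighbour
    return np.argsort(d2, axis=1)[:, :k]

def laplacian_eigenmap(X, k=10, dim=2):
    """Embed rows of X using the normalized Laplacian of a symmetrized kNN graph."""
    n = len(X)
    nbrs = knn_indices(X, k)
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    W = np.maximum(W, W.T)  # connect i and j if either is a neighbour of the other
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L = np.eye(n) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    # Skip the first (trivial) eigenvector and rescale to the generalized problem.
    return vecs[:, 1:dim + 1] * d_inv_sqrt[:, None]

def flag_label_disagreements(embedding, labels, k=5):
    """Return indices whose label differs from the majority label of their k neighbours."""
    labels = np.asarray(labels)
    flagged = []
    for i, idx in enumerate(knn_indices(embedding, k)):
        values, counts = np.unique(labels[idx], return_counts=True)
        if values[np.argmax(counts)] != labels[i]:
            flagged.append(i)
    return flagged

def synth_call(freq_hz, rng, sr=8000, n_samples=800):
    """Hypothetical stand-in for a recorded call: a noisy pure tone."""
    t = np.arange(n_samples) / sr
    return np.sin(2 * np.pi * freq_hz * t) + 0.1 * rng.standard_normal(n_samples)

rng = np.random.default_rng(0)
labels = ["type_a"] * 30 + ["type_b"] * 30  # invented call types
freqs = [500.0] * 30 + [1500.0] * 30        # assumed centre frequencies in Hz
specs = [spectrogram(synth_call(f, rng)) for f in freqs]
features = np.stack([s.ravel() for s in specs])  # one flattened spectrogram per call
embedding = laplacian_eigenmap(features, k=10, dim=2)
suspects = flag_label_disagreements(embedding, labels, k=5)
```

With real recordings, calls would first need to be padded or cropped to a common length so that the flattened spectrograms have equal size. A dedicated library implementation of a neighbourhood-based method could replace `laplacian_eigenmap` without changing the rest of the pipeline.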

Source: http://dx.doi.org/10.1111/1365-2656.13754
