A practical guide for generating unsupervised, spectrogram-based latent space representations of animal vocalizations.

Mara Thomas, Frants H Jensen, Baptiste Averly, Vlad Demartsev, Marta B Manser, Tim Sainburg, Marie A Roch, Ariana Strandburg-Peshkin

J Anim Ecol

Department for the Ecology of Animal Societies, Max Planck Institute of Animal Behavior, Constance, Germany.

Published: August 2022

The manual detection, analysis and classification of animal vocalizations in acoustic recordings is laborious and requires expert knowledge. Hence, there is a need for objective, generalizable methods that detect underlying patterns in these data, categorize sounds into distinct groups and quantify similarities between them. Among the computational methods proposed for this purpose, neighbourhood-based dimensionality reduction of spectrograms to produce a latent space representation of calls stands out for its conceptual simplicity and effectiveness. Using a dataset of manually annotated meerkat (Suricata suricatta) vocalizations, we demonstrate how this method can be used to obtain meaningful latent space representations that reflect the established taxonomy of call types. We analyse the strengths and weaknesses of the proposed approach, give recommendations for its usage and show application examples, such as the classification of ambiguous calls and the detection of mislabelled calls. All analyses are accompanied by example code to help researchers realize the potential of this method for the study of animal vocalizations.
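To make the workflow in the abstract concrete, the following is a minimal sketch, not the authors' code. It builds log-magnitude spectrograms from synthetic "calls", projects them into a two-dimensional space and flags calls whose nearest neighbours mostly carry a different label. Everything here is an assumption for illustration: the toy tones, the function names, the threshold of 0.5, and above all the projection step, where a plain PCA stands in for a neighbourhood-based embedding purely so that the example needs nothing beyond NumPy. PCA is linear and is not the kind of method the abstract describes.

```python
import numpy as np

def log_spectrogram(signal, n_fft=256, hop=128):
    """Log-magnitude spectrogram (frequency x time) from Hann-windowed frames."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return np.log1p(magnitude).T

def pca_embed(features, n_components=2):
    """Linear stand-in (assumption) for a neighbourhood-based embedding."""
    centred = features - features.mean(axis=0)
    u, s, _ = np.linalg.svd(centred, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

def neighbour_label_agreement(embedding, labels, k=5):
    """Fraction of each point's k nearest neighbours that share its label."""
    diffs = embedding[:, None, :] - embedding[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbour
    nearest = np.argsort(dists, axis=1)[:, :k]
    return (labels[nearest] == labels[:, None]).mean(axis=1)

# Hypothetical toy data: two "call types" modelled as noisy tones.
rng = np.random.default_rng(0)
sample_rate, duration = 8000, 0.25
t = np.arange(int(sample_rate * duration)) / sample_rate

def make_call(freq):
    return np.sin(2 * np.pi * freq * t) + 0.3 * rng.standard_normal(t.size)

calls = [make_call(500 + rng.normal(0, 20)) for _ in range(20)]
calls += [make_call(2000 + rng.normal(0, 20)) for _ in range(20)]
labels = np.array([0] * 20 + [1] * 20)
labels[3] = 1  # deliberately mislabel one low-frequency call

# Flatten each spectrogram into one feature vector per call, then embed.
features = np.stack([log_spectrogram(c).ravel() for c in calls])
embedding = pca_embed(features)

# Calls whose neighbours mostly disagree with their label are candidates
# for manual review; the 0.5 cut-off is an arbitrary illustrative choice.
agreement = neighbour_label_agreement(embedding, labels, k=5)
suspects = np.flatnonzero(agreement < 0.5)
print("Calls flagged for review:", suspects)
```

In a real analysis the calls would come from annotated recordings, and `pca_embed` would be replaced by a neighbourhood-based method; the neighbour-agreement check itself only needs coordinates and labels, so it carries over unchanged.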

DOI: http://dx.doi.org/10.1111/1365-2656.13754

Similar Publications

Diagnosing Helicobacter pylori using autoencoders and limited annotations through anomalous staining patterns in IHC whole slide images.

Int J Comput Assist Radiol Surg

January 2025

Comp. Sci. Dep, Universitat Autònoma de Barcelona, Campus UAB, Cerdanyola del Vallès, 08193, Catalunya, Spain.

Pau Cano, Eva Musulen, Debora Gil

Purpose: This work addresses the detection of Helicobacter pylori (H. pylori) in histological images with immunohistochemical staining. This analysis is a time-demanding task, currently done by an expert pathologist that visually inspects the samples.

Addressing scalability and managing sparsity and dropout events in single-cell representation identification with ZIGACL.

Brief Bioinform

November 2024

School of Electrical Engineering and Automation, Hefei University of Technology, Hefei, Anhui, China.

Mingguang Shi, Xuefeng Li

Despite significant advancements in single-cell representation learning, scalability and managing sparsity and dropout events continue to challenge the field as scRNA-seq datasets expand. While current computational tools struggle to maintain both efficiency and accuracy, the accurate connection of these dropout events to specific biological functions usually requires additional, complex experiments, often hampered by potential inaccuracies in cell-type annotation. To tackle these challenges, the Zero-Inflated Graph Attention Collaborative Learning (ZIGACL) method has been developed.

Semi-supervised emotion-driven music generation model based on category-dispersed Gaussian Mixture Variational Autoencoders.

PLoS One

January 2025

Communication University of China, Nanjing, China.

Zihao Ning, Xiao Han, Jie Pan

Existing emotion-driven music generation models heavily rely on labeled data and lack interpretability and controllability of emotions. To address these limitations, a semi-supervised emotion-driven music generation model based on category-dispersed Gaussian mixture variational autoencoders is proposed. Initially, a controllable music generation model is introduced, which disentangles and manipulates rhythm and tonal features, enabling controlled music generation.

Anomaly detection in virtual machine logs against irrelevant attribute interference.

PLoS One

January 2025

Shanghai Jiao Tong University, Shanghai, China.

Hao Zhang, Yun Zhou, Huahu Xu, Jiangang Shi, Xinhua Lin

Virtual machine logs are generated in large quantities. Virtual machine logs may contain some abnormal logs that indicate security risks or system failures of the virtual machine platform. Therefore, using unsupervised anomaly detection methods to identify abnormal logs is a meaningful task.

Marked point process variational autoencoder with applications to unsorted spiking activities.

PLoS Comput Biol

December 2024

Communication Science Laboratories, NTT Corporation, Kyoto, Japan.

Ryohei Shibue, Tomoharu Iwata

Spike train modeling across large neural populations is a powerful tool for understanding how neurons code information in a coordinated manner. Recent studies have employed marked point processes in neural population modeling. The marked point process is a stochastic process that generates a sequence of events with marks.
