Self-Supervised Open-Set Speaker Recognition with Laguerre-Voronoi Descriptors.

Sensors (Basel)

Department of Computer Science, University of Calgary, Calgary, AB T2N1N4, Canada.

Published: March 2024

Speaker recognition is a challenging problem in behavioral biometrics that has been rigorously investigated over the last decade. Although numerous supervised closed-set systems draw on the power of deep neural networks, few studies have addressed open-set speaker recognition. This paper proposes a self-supervised open-set speaker recognition framework that leverages the geometric properties of the speaker distribution for accurate and robust speaker verification. The proposed framework consists of a deep neural network that incorporates a wider viewpoint of temporal speech features, together with Laguerre-Voronoi diagram-based speech feature extraction. The deep neural network is trained with a specialized clustering criterion that requires only positive pairs during training. The experiments showed that the proposed system outperformed current state-of-the-art methods in open-set speaker recognition and cluster representation.
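The abstract does not spell out how the Laguerre-Voronoi diagram is used, so the following is only a minimal sketch of the underlying geometry, not the paper's implementation. A Laguerre-Voronoi (power) diagram assigns a point to the weighted site with the smallest power distance, that is, squared Euclidean distance minus squared radius. The function names, the toy 2-D "embeddings", and the rejection rule (treat a point as an unknown speaker when its best power distance exceeds a threshold) are all assumptions made for illustration.

```python
from typing import Optional, Sequence

Point = Sequence[float]


def power_distance(x: Point, center: Point, radius: float) -> float:
    """Laguerre (power) distance: squared Euclidean distance minus squared radius."""
    squared = sum((a - b) ** 2 for a, b in zip(x, center))
    return squared - radius ** 2


def assign_speaker(
    x: Point,
    centers: Sequence[Point],
    radii: Sequence[float],
    reject_above: float = 0.0,  # hypothetical open-set threshold, not from the paper
) -> Optional[int]:
    """Return the index of the Laguerre-Voronoi cell containing x.

    Returns None when even the best power distance exceeds `reject_above`,
    which this sketch treats as an unknown (out-of-set) speaker.
    """
    dists = [power_distance(x, c, r) for c, r in zip(centers, radii)]
    best = min(range(len(dists)), key=dists.__getitem__)
    return best if dists[best] <= reject_above else None


# Two hypothetical speaker clusters: a tight one and a broad one.
centers = [(0.0, 0.0), (4.0, 0.0)]
radii = [1.0, 3.0]

print(assign_speaker((0.5, 0.0), centers, radii))    # inside the tight cluster
print(assign_speaker((1.5, 0.0), centers, radii))    # nearer to site 0, yet in cell 1
print(assign_speaker((10.0, 10.0), centers, radii))  # far from both: rejected
```

The second query shows why the weighting matters: the point is closer to the first center in plain Euclidean terms, but the larger radius of the second cluster pulls it into that cell.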

PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10975617
DOI: http://dx.doi.org/10.3390/s24061996

Similar Publications

Cancer metastasis to the upper gastrointestinal tract: a case series.

J Gastrointest Oncol

December 2024

Medical Oncology Department, The Canberra Hospital, Canberra, ACT, Australia.

Mohamed Bakry Hasib Ahmadzai Mitali Fadia Geoffrey Peters Yada Kanjanapan

Background: Metastasis of non-gastrointestinal (non-GI) cancers to the upper GI tract is a rare occurrence, with limited cases reported in the literature. Recognising this type of metastasis is crucial, as it presents unique diagnostic and therapeutic challenges. This case series adds to the literature by discussing seven rare cases of non-GI cancer metastasising to the upper GI tract, emphasising the complications and clinical manifestations.

Joint speech and text machine translation for up to 100 languages.

Nature

January 2025

Creating the Babel Fish, a tool that helps individuals translate speech between any two languages, requires advanced technological innovation and linguistic expertise. Although conventional speech-to-speech translation systems composed of multiple subsystems performing translation in a cascaded fashion exist, scalable and high-performing unified systems remain underexplored. To address this gap, here we introduce SEAMLESSM4T (Massively Multilingual and Multimodal Machine Translation), a single model that supports speech-to-speech translation (from 101 to 36 languages), speech-to-text translation (from 101 to 96 languages), text-to-speech translation (from 96 to 36 languages), text-to-text translation (96 languages) and automatic speech recognition (96 languages).

Longitudinal visual outcomes in idiopathic intracranial hypertension: the role of early prognostic indicators and risk stratification in disease management.

J Neurol

January 2025

Department of Neuroscience, Central Clinical School, Faculty of Medicine, Nursing and Health Science, Monash University, Melbourne, VIC, Australia.

Blake D Colman Paul G Sanfilippo Anthony Fok Minh Ngoc Le Nguyen Raghuvir Kini

Background: Idiopathic intracranial hypertension (IIH) is increasingly prevalent, yet longitudinal outcome data are scarce. This study aimed to characterise demographic and longitudinal clinical changes in a cohort of patients with IIH.

Methods: Retrospective cohort analysis on adult patients diagnosed with IIH (Friedman criteria) enrolled in the neuro-ophthalmology database (NODE) across two tertiary centres.

A comprehensive voice dataset for Hindko digit recognition.

Data Brief

February 2025

Department of Information & Communication Technology, University of Agder (UiA), Norway.

Tanveer Ahmed Maqbool Khan Khalil Khan Ikram Syed Syed Sajid Ullah

Hindko is a language primarily spoken in the northwestern areas of Pakistan. Approximately eight million people speak the Hindko language. According to its native speakers, it is the 7th largest language of Pakistan and the 2nd largest language of Khyber Pakhtunkhwa.

Development and Beta Validation of an mHealth-Based Hearing Screener (SRESHT) for Young Children in Resource-Limited Countries: Pilot Validation Study.

JMIR Form Res

January 2025

Faculty of Audiology and Speech Language Pathology, Sri Ramachandra Institute of Higher Education and Research, Chennai, India.

Vidya Ramkumar Deepashree Joshi B Anil Prabhakar James W Hall Ramya Vaidyanath

Background: The prevalence of hearing loss in infants in India varies between 4 and 5 per 1000. Objective-based otoacoustic emissions and auditory brainstem response have been used in high-income countries for establishing early hearing screening and intervention programs. Nevertheless, the use of objective screening tests in low- and middle-income countries (LMICs) such as India is not feasible.
