Two-microphone separation of speech mixtures.

IEEE Trans Neural Netw

Oticon A/S, Smørum DK-2765, Denmark.

Published: March 2008

Separation of speech mixtures, often referred to as the cocktail party problem, has been studied for decades. In many source separation tasks, the separation method is limited by the assumption of at least as many sensors as sources. Further, many methods require that the number of signals within the recorded mixtures be known in advance. In many real-world applications, these limitations are too restrictive. We propose a novel method for underdetermined blind source separation using an instantaneous mixing model which assumes closely spaced microphones. Two source separation techniques have been combined: independent component analysis (ICA) and binary time-frequency (T-F) masking. By estimating binary masks from the outputs of an ICA algorithm, it is possible in an iterative way to extract basis speech signals from a convolutive mixture. The basis signals are afterwards improved by grouping similar signals. Using two microphones, we can separate, in principle, an arbitrary number of mixed speech signals. We show separation results for mixtures with as many as seven speech signals under instantaneous conditions. We also show that the proposed method is applicable to segregate speech signals under reverberant conditions, and we compare our proposed method to another state-of-the-art algorithm. The number of source signals is not assumed to be known in advance and it is possible to maintain the extracted signals as stereo signals.
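
The step of estimating binary masks from two ICA outputs can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the two ICA output signals are already available as arrays `y1` and `y2` and compares their short-time spectra bin by bin. The function names, the threshold `tau`, and the frame and hop sizes are hypothetical choices made for illustration only.

```python
import numpy as np

def stft(x, frame_len=256, hop=128):
    """Short-time Fourier transform with a Hann window.

    Returns a complex array of shape (frame_len // 2 + 1, n_frames).
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)],
        axis=1,
    )
    return np.fft.rfft(frames, axis=0)

def binary_masks(y1, y2, tau=1.0, frame_len=256, hop=128):
    """Estimate one binary T-F mask per ICA output (illustrative rule).

    A T-F unit is assigned to an output when that output's magnitude
    exceeds tau times the other output's magnitude. The rule and the
    value of tau are assumptions, not taken from the abstract.
    """
    Y1 = stft(y1, frame_len, hop)
    Y2 = stft(y2, frame_len, hop)
    mask1 = np.abs(Y1) > tau * np.abs(Y2)
    mask2 = np.abs(Y2) > tau * np.abs(Y1)
    return mask1, mask2
```

Each mask would then be applied to the T-F representation of the recorded microphone signals. In the iterative scheme the abstract describes, the masked signals are fed back to ICA until basis signals are extracted, and that loop is left out here. With `tau >= 1` a T-F unit can never be assigned to both outputs at once.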

Source
http://dx.doi.org/10.1109/TNN.2007.911740

