Intelligibility of ideal binary masked noisy speech was measured on a group of normal hearing individuals across mixture signal to noise ratio (SNR) levels, masker types, and local criteria for forming the binary mask. The binary mask is computed from time-frequency decompositions of target and masker signals using two different schemes: an ideal binary mask computed by thresholding the local SNR within time-frequency units and a target binary mask computed by comparing the local target energy against the long-term average speech spectrum. By depicting intelligibility scores as a function of the difference between mixture SNR and local SNR threshold, alignment of the performance curves is obtained for a large range of mixture SNR levels.
View Article and Find Full Text PDFIdeal binary time-frequency masking is a signal separation technique that retains mixture energy in time-frequency units where local signal-to-noise ratio exceeds a certain threshold and rejects mixture energy in other time-frequency units. Two experiments were designed to assess the effects of ideal binary masking on speech intelligibility of both normal-hearing (NH) and hearing-impaired (HI) listeners in different kinds of background interference. The results from Experiment 1 demonstrate that ideal binary masking leads to substantial reductions in speech-reception threshold for both NH and HI listeners, and the reduction is greater in a cafeteria background than in a speech-shaped noise.
View Article and Find Full Text PDFFor a given mixture of speech and noise, an ideal binary time-frequency mask is constructed by comparing speech energy and noise energy within local time-frequency units. It is observed that listeners achieve nearly perfect speech recognition from gated noise with binary gains prescribed by the ideal binary mask. Only 16 filter channels and a frame rate of 100 Hz are sufficient for high intelligibility.
View Article and Find Full Text PDFSeparation of speech mixtures, often referred to as the cocktail party problem, has been studied for decades. In many source separation tasks, the separation method is limited by the assumption of at least as many sensors as sources. Further, many methods require that the number of signals within the recorded mixtures be known in advance.
View Article and Find Full Text PDFWe introduce a data-analysis framework and performance metrics for evaluating and optimizing the interaction between activation tasks, experimental designs, and the methodological choices and tools for data acquisition, preprocessing, data analysis, and extraction of statistical parametric maps (SPMs). Our NPAIRS (nonparametric prediction, activation, influence, and reproducibility resampling) framework provides an alternative to simulations and ROC curves by using real PET and fMRI data sets to examine the relationship between prediction accuracy and the signal-to-noise ratios (SNRs) associated with reproducible SPMs. Using cross-validation resampling we plot training-test set predictions of the experimental design variables (e.
View Article and Find Full Text PDF