Publications by authors named "Elmar Noth"

Magnetic Resonance Imaging (MRI) allows analyzing speech production by capturing high-resolution images of the dynamic processes in the vocal tract. In clinical applications, combining MRI with synchronized speech recordings leads to improved patient outcomes, especially if a phonological-based approach is used for assessment. However, when audio signals are unavailable, the recognition accuracy of sounds is decreased when using only MRI data.

Acoustic identification of vocalizing individuals opens up new and deeper insights into animal communication, such as individual- or group-specific dialects, turn-taking events, and dialogs. However, establishing an association between an individual animal and its emitted signal is usually non-trivial, especially for animals underwater. Consequently, collecting marine species-, array-, and position-specific ground truth localization data is extremely challenging, which strongly limits the possibilities to evaluate localization methods beforehand or at all.

Bioacoustic research spans a wide range of biological questions and applications, relying on identification of target species or smaller acoustic units, such as distinct call types. However, manually identifying the signal of interest is time-intensive, error-prone, and becomes unfeasible with large data volumes. Therefore, machine-driven algorithms are increasingly applied to various bioacoustic signal identification challenges.

Purpose: The aim of this study was to investigate the speech prosody of postlingually deaf cochlear implant (CI) users compared with control speakers without hearing or speech impairment.

Method: Speech recordings of 74 CI users (37 males and 37 females) and 72 age-balanced control speakers (36 males and 36 females) are considered. All participants are German native speakers and read the fable "The North Wind and the Sun", a standard text in pathological speech analysis and phonetic transcriptions.

Biometric identification techniques such as photo-identification require an array of unique natural markings to identify individuals. From 1975 to the present, Bigg's killer whales have been photo-identified along the west coast of North America, resulting in one of the largest and longest-running cetacean photo-identification datasets. However, data maintenance and analysis are extremely time- and resource-consuming.

Background: Dysarthric symptoms in Parkinson's disease (PD) vary greatly across cohorts. Abundant research suggests that such heterogeneity could reflect subject-level and task-related cognitive factors. However, the interplay of these variables during motor speech remains underexplored, let alone by administering validated materials to carefully matched samples with varying cognitive profiles and combining automated tools with machine learning methods.

Background: Imprecise articulation has a negative impact on speech intelligibility. Therefore, treatment of articulation is clinically relevant in patients with dysarthria. To be effective, and in line with the principles of motor learning, articulation therapy needs to be intensive and well organized, provide adequate feedback, and involve frequent practice.

Embodied cognition research on Parkinson's disease (PD) points to disruptions of frontostriatal language functions as sensitive targets for clinical assessment. However, no existing approach has been tested for crosslinguistic validity, let alone by combining naturalistic tasks with machine-learning tools. To address these issues, we conducted the first classifier-based examination of morphological processing (a core frontostriatal function) in spontaneous monologues from PD patients across three typologically different languages.

This paper introduces , a mobile application for motor evaluation and monitoring of Parkinson's disease patients. The App is based on previously reported methods, for instance, the evaluation of articulation and pronunciation in speech, regularity and freezing of gait in walking, and tapping accuracy in hand movement. Preliminary experiments indicate that most of the measurements are suitable to discriminate patients and controls.

This study presents an approach to Parkinson's disease detection using sustained vowel phonation and a ResNet architecture originally designed for image classification. We computed spectra of the audio recordings and used them as image input to the ResNet architecture, which was pre-trained on the ImageNet and SVD databases. To prevent overfitting, the dataset was strongly augmented in the time domain.
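
The idea of feeding audio to an image classifier can be illustrated with a minimal sketch. It is not the study's pipeline: the frame length, hop size, Hann window, 80 dB dynamic range, and the function names `magnitude_spectrogram` and `to_rgb_image` are all assumptions made for illustration. The sketch uses only the Python standard library and stops at the image; the network itself is out of scope.

```python
import cmath
import math


def magnitude_spectrogram(signal, frame_len=64, hop=32):
    """Short-time magnitude spectra: one list of frame_len // 2 + 1 bins per frame.

    frame_len and hop are illustrative values, not taken from the study.
    """
    # Hann window to reduce spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(frame_len)]
        mags = []
        for k in range(frame_len // 2 + 1):
            # Naive DFT of the windowed frame (fine for a small demo).
            acc = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            mags.append(abs(acc))
        frames.append(mags)
    return frames


def to_rgb_image(spec, floor_db=-80.0):
    """Log-scale the spectrogram and map it to 8-bit gray replicated over 3 channels.

    Rows are frequency bins (highest frequency first), columns are frames.
    """
    peak = max(max(frame) for frame in spec) or 1.0
    n_bins = len(spec[0])
    image = []
    for k in range(n_bins - 1, -1, -1):
        row = []
        for frame in spec:
            db = 20 * math.log10(max(frame[k] / peak, 1e-10))
            db = max(db, floor_db)  # clip everything below the assumed floor
            level = round(255 * (db - floor_db) / -floor_db)
            row.append((level, level, level))  # same value in R, G, and B
        image.append(row)
    return image


# Synthetic stand-in for a sustained vowel: a 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * n / sample_rate) for n in range(256)]
image = to_rgb_image(magnitude_spectrogram(tone))
```

Replicating the gray value across three channels is one common way to match the input shape of networks pre-trained on color images; whether the study did this is not stated in the abstract.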

Large bioacoustic archives of wild animals are an important source for identifying reappearing communication patterns, which can then be related to recurring behavioral patterns to advance the current understanding of intra-specific communication of non-human animals. A main challenge remains that most large-scale bioacoustic archives contain only a small percentage of animal vocalizations and a large amount of environmental noise, which makes it extremely difficult to manually retrieve sufficient vocalizations for further analysis. This is particularly important for species with advanced social systems and complex vocalizations. In this study, deep neural networks were trained on 11,509 killer whale (Orcinus orca) signals and 34,848 noise segments.

Parkinson's disease is a neurodegenerative disorder characterized by a variety of motor symptoms. In particular, difficulties in starting and stopping movements have been observed in patients. From a technical and diagnostic point of view, these movement changes can be assessed by modeling the transitions between voiced and unvoiced segments in speech, the movement when the patient starts or stops a new stroke in handwriting, or the movement when the patient starts or stops the walking process.
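
For the speech case, locating such transitions can be sketched as follows. The sketch assumes that a frame-wise voicing decision (for example, from a pitch tracker) is already available as a list of booleans; the function names, the window half-width, and the example sequence are hypothetical and not taken from the paper.

```python
def find_transitions(voiced):
    """Return (onsets, offsets) as frame indices.

    An onset is the first voiced frame after an unvoiced one;
    an offset is the first unvoiced frame after a voiced one.
    """
    onsets, offsets = [], []
    for i in range(1, len(voiced)):
        if voiced[i] and not voiced[i - 1]:
            onsets.append(i)
        elif voiced[i - 1] and not voiced[i]:
            offsets.append(i)
    return onsets, offsets


def transition_windows(n_frames, borders, half_width):
    """Half-open frame ranges centered on each border, clipped to the signal."""
    return [(max(0, b - half_width), min(n_frames, b + half_width))
            for b in borders]


# Made-up voicing decisions for nine frames.
voiced = [False, False, True, True, True, False, False, True, False]
onsets, offsets = find_transitions(voiced)
onset_windows = transition_windows(len(voiced), onsets, half_width=2)
```

Features would then be computed on the frames inside each window; which features are used is not specified in the abstract.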

Purpose: The aim of the study was to address the reported inconsistencies in the relationship between objective acoustic measures and perceptual ratings of vocal quality.

Method: This tutorial moves away from the more widely examined problems related to obtaining the perceptual ratings and the acoustic measures and centers on less scrutinized issues regarding the procedure to establish the correspondence. Expressions for the most common measure of association between perceptual and acoustic measures (Pearson's r) are derived using a multiple linear regression model.
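
The link between Pearson's r and linear regression can be illustrated in the simplest case of a single predictor, where r equals the least-squares slope rescaled by the ratio of the standard deviations. This is a textbook identity rather than the tutorial's own derivation, which uses a multiple regression model. The variable names and the data below are made up for illustration.

```python
import math


def _centered_sums(x, y):
    """Return (Sxy, Sxx, Syy): sums of centered cross products and squares."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy, sxx, syy


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    sxy, sxx, syy = _centered_sums(x, y)
    return sxy / math.sqrt(sxx * syy)


def ols_slope(x, y):
    """Least-squares slope of the regression of y on x."""
    sxy, sxx, _ = _centered_sums(x, y)
    return sxy / sxx


# Hypothetical data: an acoustic measure and a perceptual rating per speaker.
acoustic = [1.0, 2.0, 3.0, 4.0, 5.0]
perceptual = [2.0, 4.0, 5.0, 4.0, 5.0]

_, sxx, syy = _centered_sums(acoustic, perceptual)
r = pearson_r(acoustic, perceptual)
# Slope times (std of x / std of y); the 1/(n-1) factors cancel in the ratio.
r_from_slope = ols_slope(acoustic, perceptual) * math.sqrt(sxx / syy)
```

Both routes give the same value, which shows that r carries the same information as the regression slope once the scales of the two measures are factored out.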

Purpose: To assess whether postlingual onset and shorter duration of deafness before cochlear implant (CI) provision predict higher speech intelligibility results of CI users.

Methods: For an objective judgement of speech intelligibility, we used an automatic speech recognition system computing the word recognition rate (WR) of 50 adult CI users and 50 age-matched control individuals. All subjects were recorded reading a standardized text.

To assess the impact of Parkinson's disease (PD) on spontaneous discourse, we conducted computerized analyses of brief monologues produced by 51 patients and 50 controls. We explored differences in semantic fields (via latent semantic analysis), grammatical choices (using part-of-speech tagging), and word-level repetitions (with graph embedding tools). Although overall output was quantitatively similar between groups, patients relied less heavily on action-related concepts and used more subordinate structures.

This paper evaluates the accuracy of different characterization methods for the automatic detection of multiple speech disorders. The speech impairments considered include dysphonia in people with Parkinson's disease (PD), dysphonia diagnosed in patients with different laryngeal pathologies (LP), and hypernasality in children with cleft lip and palate (CLP). Four different methods are applied to analyze the voice signals including noise content measures, spectral-cepstral modeling, nonlinear features, and measurements to quantify the stability of the fundamental frequency.

Due to low intra- and interrater reliability, perceptual voice evaluation should be supported by objective, automatic methods. In this study, text-based, computer-aided prosodic analysis and measurements of connected speech were combined in order to model perceptual evaluation of the German Roughness-Breathiness-Hoarseness (RBH) scheme. 58 connected speech samples (43 women and 15 men; 48.

Automatic voice assessment is often performed using sustained vowels. In contrast, speech analysis of read-out texts can be applied to voice and speech assessment. Automatic speech recognition and prosodic analysis were used to find regression formulae between automatic and perceptual assessment of four voice and four speech criteria.

Objective: Automatic intelligibility assessment using automatic speech recognition is usually language specific. In this study, a language-independent approach is proposed. It uses models that are trained with Flemish speech, and it is applied to assess chronically hoarse German speakers.

Visual attention allocation of adolescent girls with and without an eating disorder while viewing body images of underweight, normal-weight and overweight women was studied using eye tracking. While all girls attended more to specific body parts (e.g.

Objectives/hypothesis: Automatic voice evaluation is usually performed on stable sections of sustained vowels, which often cannot capture hoarseness properly. The measures cepstral peak prominence (CPP) and smoothed CPP (CPPS) do not require exact determination of the cycles of fundamental frequency like established perturbation-based measures. They can also be applied to text recordings.

Objective assessment of intelligibility on the telephone is desirable for voice and speech assessment and rehabilitation. A total of 82 patients after partial laryngectomy read a standardized text which was synchronously recorded by a headset and via telephone. Five experienced raters assessed intelligibility perceptually on a five-point scale.

Objective: One aspect of voice and speech evaluation after laryngeal cancer is acoustic analysis. Perceptual evaluation by expert raters is a standard in the clinical environment for global criteria such as overall quality or intelligibility. So far, automatic approaches evaluate acoustic properties of pathologic voices based on voiced/unvoiced distinction and fundamental frequency analysis of sustained vowels.

Purpose: Treatment of oral carcinomas often causes reduced speech intelligibility. It was the aim of this study to objectively evaluate the speech intelligibility of patients after multimodal therapy for oral squamous cell carcinoma (OSCC) with a computer-based, automatic speech recognition system.

Materials and Methods: The speech intelligibility of 59 patients after multimodal tumor treatment for OSCC, located at the lateral tongue, floor of the mouth, or the alveolar crest of the lower jaw, was objectively analyzed by a computer-based speech recognition system that calculates the percentage of correct word recognition (WR).
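
A percentage of correctly recognized words can be computed as sketched below. The sketch assumes a common definition: the recognizer output is aligned to the reference text by word-level edit distance, and WR is the share of reference words that match in that alignment. The abstract does not say how the study's system scores recognition, so the function name, the lowercasing, and the whitespace tokenization are illustrative assumptions.

```python
def word_recognition_rate(reference, hypothesis):
    """Percentage of reference words matched in a minimum-edit alignment."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference text must contain at least one word")

    # Each cell holds (edit_cost, -hits); min() therefore prefers fewer
    # edits first and, among equally cheap alignments, more matched words.
    prev = [(j, 0) for j in range(len(hyp) + 1)]
    for i in range(1, len(ref) + 1):
        curr = [(i, 0)]
        for j in range(1, len(hyp) + 1):
            if ref[i - 1] == hyp[j - 1]:
                diagonal = (prev[j - 1][0], prev[j - 1][1] - 1)  # match
            else:
                diagonal = (prev[j - 1][0] + 1, prev[j - 1][1])  # substitution
            deletion = (prev[j][0] + 1, prev[j][1])       # reference word missed
            insertion = (curr[j - 1][0] + 1, curr[j - 1][1])  # extra word
            curr.append(min(diagonal, deletion, insertion))
        prev = curr

    hits = -prev[-1][1]
    return 100.0 * hits / len(ref)


# Made-up recognizer output with one substituted word out of six.
wr = word_recognition_rate("the north wind and the sun",
                           "the north wind in the sun")
```

Under this definition, inserted words do not lower WR directly; measures that also penalize insertions, such as word accuracy, are a separate design choice.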
