In language production research, the latency with which speakers produce a spoken response to a stimulus and the onset and offset times of words in longer utterances are key dependent variables. Measuring these variables automatically often yields partially incorrect results, whereas exact measurement through visual inspection of the recordings is extremely time-consuming. We present AlignTool, an open-source alignment tool that first uses Praat to establish preliminary onset and offset times of words and phonemes in spoken utterances, and then performs a forced alignment of the spoken utterances and their orthographic transcriptions in the automatic speech recognition system MAUS. AlignTool creates a Praat TextGrid file that the user can inspect and, if necessary, correct manually. We evaluated AlignTool's performance with recordings of single-word and four-word utterances as well as semi-spontaneous speech. AlignTool performs well on audio signals with an excellent signal-to-noise ratio, requiring virtually no corrections. For audio signals of lesser quality, AlignTool is still highly functional, but its results may require more frequent manual corrections. We also found that audio recordings including long silent intervals tended to pose greater difficulties for AlignTool than recordings filled with speech, which AlignTool analyzed well overall. We expect that by semi-automating the temporal analysis of complex utterances, AlignTool will open new avenues in language production research.
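
To make the dependent variables named in the abstract concrete, the following minimal Python sketch derives a response latency and per-word durations from a list of labelled intervals, such as one might read off a word tier after alignment. It is an illustration only and not AlignTool's code: the `Interval` type, the function names, the convention that silence carries an empty label, and the example times are all assumptions made here.

```python
from typing import List, NamedTuple, Optional, Tuple


class Interval(NamedTuple):
    """One labelled stretch of a word tier (hypothetical structure)."""
    label: str     # word label; an empty label is assumed to mean silence
    onset: float   # start time in seconds
    offset: float  # end time in seconds


def response_latency(tier: List[Interval],
                     stimulus_onset: float = 0.0) -> Optional[float]:
    """Time from stimulus onset to the onset of the first spoken word."""
    spoken = [iv for iv in tier if iv.label.strip()]
    if not spoken:
        return None  # the recording contains no labelled speech
    return min(iv.onset for iv in spoken) - stimulus_onset


def word_durations(tier: List[Interval]) -> List[Tuple[str, float]]:
    """Duration (offset minus onset) of every non-silent interval."""
    return [(iv.label, iv.offset - iv.onset)
            for iv in tier if iv.label.strip()]


# Invented example: half a second of silence followed by two words.
tier = [
    Interval("", 0.0, 0.5),
    Interval("the", 0.5, 0.75),
    Interval("dog", 0.75, 1.25),
]
latency = response_latency(tier)
durations = word_durations(tier)
```

In this invented example the speaker starts talking 0.5 s after the stimulus. Interval boundaries that have been corrected by hand would feed into the same computation unchanged.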

Source: http://dx.doi.org/10.3758/s13428-017-1002-7

Similar Publications

Biological, linguistic, and individual factors govern voice quality.

J Acoust Soc Am

January 2025

USC Viterbi School of Engineering, University of Southern California, Los Angeles, California 90089-1455, USA.

Voice quality serves as a rich source of information about speakers, providing listeners with impressions of identity, emotional state, age, sex, reproductive fitness, and other biologically and socially salient characteristics. Understanding how this information is transmitted, accessed, and exploited requires knowledge of the psychoacoustic dimensions along which voices vary, an area that remains largely unexplored. Recent studies of English speakers have shown that two factors related to speaker size and arousal consistently emerge as the most important determinants of quality, regardless of who is speaking.

Voice of a woman: influence of interaction partner characteristics on cycle dependent vocal changes in women.

Front Psychol

December 2024

Department for General Psychology and Cognitive Neuroscience, Institute of Psychology, Friedrich Schiller University, Jena, Germany.

Introduction: Research has shown that women's vocal characteristics change during the menstrual cycle. Further, evidence suggests that individuals alter their voices depending on the context, such as when speaking to a highly attractive person, or a person with a different social status. The present study aimed at investigating the degree to which women's voices change depending on the vocal characteristics of the interaction partner, and how any such changes are modulated by the woman's current menstrual cycle phase.

A comparative study of how teachers communicate in deaf education classrooms.

J Deaf Stud Deaf Educ

December 2024

Theory and Practice in Teacher Education, University of Tennessee, Knoxville, TN 37996, United States of America.

This study investigates the communication practices of four teachers in 3rd to 6th grade classrooms with 9 deaf students with limited language proficiency and in stages of emergent writing development. Analyzing language modalities, utterance types, and class interactivity, we found that teachers using American sign language used student-centered approaches, generating a greater number of directives and responsive utterances. They persevered in increasing students' engagement and were successful in clarifying misunderstandings.

The speech-to-song illusion is a phenomenon in which the continuous repetition of a spoken utterance induces the listeners to perceive it as more song-like. Thus far, this perceptual transformation has been observed in mostly European languages, such as English; however, it is unclear whether the illusion is experienced by speakers of Bangla (Bengali), an Indo-Aryan language. The current study, therefore, investigates the illusion in 28 Bangla and 31 English-speaking participants.

Executive functioning and nonverbal cognitive factors associated with response to technology-assisted language intervention.

Int J Pediatr Otorhinolaryngol

November 2024

Division of Developmental and Behavioral Pediatrics, Department of Pediatrics, Cincinnati Children's Hospital Medical Center, University of Cincinnati College of Medicine, Cincinnati, OH, USA.

Objective: To explore potential differences in the relationship between executive function (EF) skills and language development when integrating augmentative and alternative communication technology into speech-language therapy for deaf/hard of hearing (DHH) children.

Method: Randomized trial data were analysed to investigate this relationship among children who participated in a Technology-Assisted Language Intervention (TALI) compared to treatment as usual (TAU). Language samples were assessed for pre-post-intervention changes, including mean length of utterance in morphemes (MLU), mean turn length (MTL), and number of different words spoken (NDW).
