In language production research, two key dependent variables are the latency with which speakers produce a spoken response to a stimulus and the onset and offset times of words in longer utterances. Automatic measurement of these variables often yields partially incorrect results, whereas exact measurement through visual inspection of the recordings is extremely time-consuming. We present AlignTool, an open-source alignment tool that first establishes preliminary onset and offset times of words and phonemes in spoken utterances using Praat, and then performs a forced alignment of the spoken utterances and their orthographic transcriptions in the automatic speech recognition system MAUS. AlignTool creates a Praat TextGrid file that the user can inspect and, if necessary, correct manually. We evaluated AlignTool's performance with recordings of single-word and four-word utterances as well as semi-spontaneous speech. AlignTool performs well on audio signals with an excellent signal-to-noise ratio, requiring virtually no corrections. For audio signals of lesser quality, AlignTool remains highly functional, but its results may require more frequent manual corrections. We also found that audio recordings that include long silent intervals tended to pose greater difficulties for AlignTool than recordings filled with speech, which AlignTool analyzed well overall. We expect that by semi-automating the temporal analysis of complex utterances, AlignTool will open new avenues in language production research.
DOI: http://dx.doi.org/10.3758/s13428-017-1002-7
J Acoust Soc Am
January 2025
USC Viterbi School of Engineering, University of Southern California, Los Angeles, California 90089-1455, USA.
Voice quality serves as a rich source of information about speakers, providing listeners with impressions of identity, emotional state, age, sex, reproductive fitness, and other biologically and socially salient characteristics. Understanding how this information is transmitted, accessed, and exploited requires knowledge of the psychoacoustic dimensions along which voices vary, an area that remains largely unexplored. Recent studies of English speakers have shown that two factors related to speaker size and arousal consistently emerge as the most important determinants of quality, regardless of who is speaking.
Front Psychol
December 2024
Department for General Psychology and Cognitive Neuroscience, Institute of Psychology, Friedrich Schiller University, Jena, Germany.
Introduction: Research has shown that women's vocal characteristics change during the menstrual cycle. Further, evidence suggests that individuals alter their voices depending on the context, such as when speaking to a highly attractive person, or a person with a different social status. The present study aimed at investigating the degree to which women's voices change depending on the vocal characteristics of the interaction partner, and how any such changes are modulated by the woman's current menstrual cycle phase.
J Deaf Stud Deaf Educ
December 2024
Theory and Practice in Teacher Education, University of Tennessee, Knoxville, TN 37996, United States of America.
This study investigates the communication practices of four teachers in 3rd- to 6th-grade classrooms with nine deaf students who have limited language proficiency and are in stages of emergent writing development. Analyzing language modalities, utterance types, and class interactivity, we found that teachers using American Sign Language used student-centered approaches, generating a greater number of directives and responsive utterances. They persevered in increasing students' engagement and were successful in clarifying misunderstandings.
Q J Exp Psychol (Hove)
December 2024
Institute of Cognitive Neuroscience, University College London, London, UK.
The speech-to-song illusion is a phenomenon in which the continuous repetition of a spoken utterance induces listeners to perceive it as more song-like. Thus far, this perceptual transformation has been observed mostly in European languages, such as English; it is unclear whether the illusion is also experienced by speakers of Bangla (Bengali), an Indo-Aryan language. The current study therefore investigates the illusion in 28 Bangla-speaking and 31 English-speaking participants.
Int J Pediatr Otorhinolaryngol
November 2024
Division of Developmental and Behavioral Pediatrics, Department of Pediatrics, Cincinnati Children's Hospital Medical Center, University of Cincinnati College of Medicine, Cincinnati, OH, USA.
Objective: To explore potential differences in the relationship between executive function (EF) skills and language development when integrating augmentative and alternative communication technology into speech-language therapy for deaf/hard of hearing (DHH) children.
Method: Randomized trial data were analysed to investigate this relationship among children who participated in a Technology-Assisted Language Intervention (TALI) compared to treatment as usual (TAU). Language samples were assessed for pre-post-intervention changes, including mean length of utterance in morphemes (MLU), mean turn length (MTL), and number of different words spoken (NDW).