While efforts to document endangered languages have steadily increased, the phonetic analysis of endangered language data remains a challenge. The transcription of large documentation corpora is, by itself, a tremendous feat. Yet, the process of segmentation remains a bottleneck for research with data of this kind. This paper examines whether a speech processing tool, forced alignment, can facilitate the segmentation task for small data sets, even when the target language differs from the training language. The authors also examined whether a phone set with contextualization outperforms a more general one. The accuracy of two forced aligners trained on English (hmalign and p2fa) was assessed using corpus data from Yoloxóchitl Mixtec. Overall, agreement performance was relatively good, with accuracy at 70.9% within 30 ms for hmalign and 65.7% within 30 ms for p2fa. Segmental and tonal categories influenced accuracy as well. For instance, additional stop allophones in hmalign's phone set aided alignment accuracy. Agreement differences between aligners also corresponded closely with the types of data on which the aligners were trained. Overall, using existing alignment systems was found to have potential for making phonetic analysis of small corpora more efficient, with more allophonic phone sets providing better agreement than general ones.
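
The "within 30 ms" figures above are boundary-agreement rates: the share of automatically placed segment boundaries that fall within a tolerance of the corresponding hand-placed boundary. The abstract does not give the authors' evaluation code, so the following is only a minimal sketch of how such a rate could be computed. The function name `agreement_within`, the paired-list input format, and the boundary times are all illustrative assumptions, not data or tooling from the study.

```python
from typing import Sequence


def agreement_within(
    manual: Sequence[float],
    automatic: Sequence[float],
    tolerance: float = 0.030,
) -> float:
    """Return the fraction of boundaries whose automatic placement lies
    within `tolerance` seconds of the manual placement.

    Assumes the two sequences are already paired one-to-one, i.e.
    manual[i] and automatic[i] mark the same segment boundary.
    """
    if len(manual) != len(automatic):
        raise ValueError("boundary lists must have the same length")
    if not manual:
        raise ValueError("at least one boundary is required")
    hits = sum(1 for m, a in zip(manual, automatic) if abs(m - a) <= tolerance)
    return hits / len(manual)


# Hypothetical boundary times in seconds (invented for illustration).
manual_boundaries = [0.100, 0.250, 0.410, 0.600]
aligner_boundaries = [0.110, 0.290, 0.415, 0.625]

# Absolute differences are roughly 10, 40, 5, and 25 ms,
# so three of the four boundaries fall within the 30 ms tolerance.
rate = agreement_within(manual_boundaries, aligner_boundaries)
print(f"{rate:.1%} of boundaries within 30 ms")
```

Under this reading, a figure such as 70.9% within 30 ms would mean that about seven in ten aligner boundaries landed within 30 ms of the manual segmentation. Varying `tolerance` shows how sensitive the agreement rate is to the chosen threshold.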

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5392066
DOI: http://dx.doi.org/10.1121/1.4816491

Similar Publications

This study investigates Navajo verbs produced by four children, ages 4;07 to 11;02, during conversations with their caretakers. Analyses of 1600 verbs demonstrate that the bisyllabic verb form, consisting of a verb stem and a portion of the prefix string, is the most common pattern produced by the children. This indicates that Navajo-speaking children use meaningful units of verbal morphology that do not necessarily adhere to the linguistic boundaries normally ascribed to the Navajo verb complex.

Assessing scientific knowledge on Ecuadorian bony fishes from a scientometric perspective.

J Fish Biol

December 2024

Instituto Politécnico Nacional CICIMAR. Av. I.P.N. s/n. Col. Playa Palo de Santa Rita, La Paz, Mexico.

Bony fishes play a pivotal role in Ecuador's social, economic, and ecological aspects. However, the current state of scientific knowledge on this group remains poorly understood. This study aims to assess the scientific output related to Ecuadorian bony fishes, identifying both well-researched and understudied areas.

Article Synopsis
  • A new online portal has been developed to provide up-to-date global distribution data for crayfish and their pathogens, improving accessibility and management decisions.
  • This database is publicly available, allowing users to easily view, embed, and download data, aiming to enhance conservation planning and biodiversity management in the future.

Species recognition is a crucial part of understanding the abundance and distribution of various organisms and is important for biodiversity conservation and management. Traditional vision-based deep learning-driven species recognition requires large amounts of well-labeled, high-quality image data, the collection of which is challenging for rare and endangered species. In addition, recognition methods designed based on specific species have poor generalization ability and are difficult to adapt to new species recognition scenarios.

Background: Worldwide, at least 230 million girls and women are affected by female genital mutilation/cutting (FGM/C). FGM/C violates human rights and can cause irreparable harm and even lead to death. In 2022, more than 100,000 survivors of FGM/C lived in Germany, and more than 17,000 girls were considered at risk.
