Corpora of speech of individuals with communication disorders (CSD) are invaluable resources for education and research, but they are costly and hard to build and difficult to share for various reasons. DELAD, which means 'shared' in Swedish, is a project initiated by Professors Nicole Müller and Martin Ball in 2015 that aims to address this issue by establishing a platform for researchers to share datasets of speech disorders with interested audiences. To date four workshops have been held, where selected participants, covering various expertise including researchers in clinical phonetics and linguistics, speech and language therapy, infrastructure specialists, and ethics and legal specialists, participated to discuss relevant issues in setting up such an archive. Positive and steady progress has been made since 2015, including refurbishing the DELAD website (http://delad.net/) with information and application forms for researchers to join and share their datasets and linking with the CLARIN K-Centre for Atypical Communication Expertise (https://ace.ruhosting.nl/) where CSD can be hosted and accessed through the CLARIN B-Centres, The Language Archive (https://tla.mpi.nl/tools/tla-tools/) and TalkBank (https://talkbank.org/). The latest workshop, which was funded by CLARIN (Common Language Resources and Technology Infrastructure) was held as an online event in January 2021 on topics including Data Protection Impact Assessments, reviewing changes in ethics perspectives in academia on sharing CSD, and voice conversion as a mean to pseudonomise speech. This paper reports the latest progress of DELAD and discusses the directions for further advance of the initiative, with information on how researchers can contribute to the repository.

Download full-text PDF

Source
http://dx.doi.org/10.1080/02699206.2021.1913514DOI Listing

Publication Analysis

Top Keywords

corpora speech
8
speech disorders
8
share datasets
8
speech
5
latest development
4
delad
4
development delad
4
delad project
4
project sharing
4
sharing corpora
4

Similar Publications

Semitic languages such as Hebrew and Arabic are known for having a non-concatenative morphology: words are typically built of a combination of a consonantal root, typically tri-consonantal (e.g., k-t-b "related to writing" in Modern Standard Arabic (MSA)), with a prosodic template.

View Article and Find Full Text PDF

The affective iconicity of lexical tone: Evidence from standard Chinesea).

J Acoust Soc Am

January 2025

Leiden University Centre for Linguistics, Leiden University, Leiden, The Netherlands.

Previous studies suggested that pitch characteristics of lexical tones in Standard Chinese influence various sensory perceptions, but whether they iconically bias emotional experience remained unclear. We analyzed the arousal and valence ratings of bi-syllabic words in two corpora (Study 1) and conducted an affect rating experiment using a carefully designed corpus of bi-syllabic words (Study 2). Two-alternative forced-choice tasks further tested the robustness of lexical tones' affective iconicity in an auditory nonce word context (Study 3).

View Article and Find Full Text PDF

Tibetan-Chinese speech-to-speech translation based on discrete units.

Sci Rep

January 2025

Key Laboratory of Ethnic Language Intelligent Analysis and Security Governance of MOE, Minzu University of China, Beijing, 100081, China.

Speech-to-speech translation (S2ST) has evolved from cascade systems which integrate Automatic Speech Recognition (ASR), Machine Translation (MT), and Text-to-Speech (TTS), to end-to-end models. This evolution has been driven by advancements in model performance and the expansion of cross-lingual speech datasets. Despite the paucity of research on Tibetan speech translation, this paper endeavors to tackle the challenge of Tibetan-to-Chinese direct speech-to-speech translation within the multi-task learning framework, employing self-supervised learning (SSL) and sequence-to-sequence model training.

View Article and Find Full Text PDF

Artificial intelligence (AI) scribe applications in the healthcare community are in the early adoption phase and offer unprecedented efficiency for medical documentation. They typically use an application programming interface with a large language model (LLM), for example, generative pretrained transformer 4. They use automatic speech recognition on the physician-patient interaction, generating a full medical note for the encounter, together with a draft follow-up e-mail for the patient and, often, recommendations, all within seconds or minutes.

View Article and Find Full Text PDF

Speech Technology for Automatic Recognition and Assessment of Dysarthric Speech: An Overview.

J Speech Lang Hear Res

January 2025

Centre for Language Studies, Radboud University, Nijmegen, the Netherlands.

Purpose: In this review article, we present an extensive overview of recent developments in the area of dysarthric speech research. One of the key objectives of speech technology research is to improve the quality of life of its users, as evidenced by the focus of current research trends on creating inclusive conversational interfaces that cater to pathological speech, out of which dysarthric speech is an important example. Applications of speech technology research for dysarthric speech demand a clear understanding of the acoustics of dysarthric speech as well as of speech technologies, including machine learning and deep neural networks for speech processing.

View Article and Find Full Text PDF

Want AI Summaries of new PubMed Abstracts delivered to your In-box?

Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!