Corpora of speech of individuals with communication disorders (CSD) are invaluable resources for education and research, but they are costly and hard to build, and difficult to share for various reasons. DELAD, which means 'shared' in Swedish, is a project initiated by Professors Nicole Müller and Martin Ball in 2015 that aims to address this issue by establishing a platform on which researchers can share datasets of disordered speech with interested audiences. To date, four workshops have been held, at which selected participants with expertise spanning clinical phonetics and linguistics, speech and language therapy, infrastructure, and ethics and law discussed the issues involved in setting up such an archive. Steady progress has been made since 2015, including refurbishing the DELAD website (http://delad.net/) with information and application forms for researchers to join and share their datasets, and linking with the CLARIN K-Centre for Atypical Communication Expertise (https://ace.ruhosting.nl/), where CSD can be hosted and accessed through the CLARIN B-Centres, The Language Archive (https://tla.mpi.nl/tools/tla-tools/) and TalkBank (https://talkbank.org/). The latest workshop, funded by CLARIN (Common Language Resources and Technology Infrastructure), was held online in January 2021 and covered topics including Data Protection Impact Assessments, changes in academic perspectives on the ethics of sharing CSD, and voice conversion as a means to pseudonymise speech. This paper reports the latest progress of DELAD and discusses directions for further advancing the initiative, with information on how researchers can contribute to the repository.
DOI: http://dx.doi.org/10.1080/02699206.2021.1913514
Semitic languages such as Hebrew and Arabic are known for their non-concatenative morphology: words are typically built by combining a consonantal root, usually tri-consonantal (e.g., k-t-b "related to writing" in Modern Standard Arabic (MSA)), with a prosodic template.
J Acoust Soc Am
January 2025
Leiden University Centre for Linguistics, Leiden University, Leiden, The Netherlands.
Previous studies suggested that pitch characteristics of lexical tones in Standard Chinese influence various sensory perceptions, but whether they iconically bias emotional experience remained unclear. We analyzed the arousal and valence ratings of bi-syllabic words in two corpora (Study 1) and conducted an affect rating experiment using a carefully designed corpus of bi-syllabic words (Study 2). Two-alternative forced-choice tasks further tested the robustness of lexical tones' affective iconicity in an auditory nonce word context (Study 3).
Sci Rep
January 2025
Key Laboratory of Ethnic Language Intelligent Analysis and Security Governance of MOE, Minzu University of China, Beijing, 100081, China.
Speech-to-speech translation (S2ST) has evolved from cascade systems, which integrate Automatic Speech Recognition (ASR), Machine Translation (MT), and Text-to-Speech (TTS), to end-to-end models. This evolution has been driven by advances in model performance and the expansion of cross-lingual speech datasets. Although research on Tibetan speech translation is scarce, this paper tackles Tibetan-to-Chinese direct speech-to-speech translation within a multi-task learning framework, employing self-supervised learning (SSL) and sequence-to-sequence model training.
Plast Reconstr Surg Glob Open
January 2025
Department of Computer Science, Johns Hopkins University, Baltimore, MD.
Artificial intelligence (AI) scribe applications in the healthcare community are in the early adoption phase and offer unprecedented efficiency for medical documentation. They typically use an application programming interface with a large language model (LLM), for example, generative pretrained transformer 4. They use automatic speech recognition on the physician-patient interaction, generating a full medical note for the encounter, together with a draft follow-up e-mail for the patient and, often, recommendations, all within seconds or minutes.
J Speech Lang Hear Res
January 2025
Centre for Language Studies, Radboud University, Nijmegen, the Netherlands.
Purpose: In this review article, we present an extensive overview of recent developments in dysarthric speech research. One of the key objectives of speech technology research is to improve the quality of life of its users, as evidenced by the current focus on creating inclusive conversational interfaces that cater to pathological speech, of which dysarthric speech is an important example. Applying speech technology to dysarthric speech demands a clear understanding of the acoustics of dysarthric speech as well as of speech technologies, including machine learning and deep neural networks for speech processing.