Barista: A Framework for Concurrent Speech Processing by USC-SAIL.

Doğan Can, James Gibson, Colin Vaz, Panayiotis G Georgiou, Shrikanth S Narayanan

Proc IEEE Int Conf Acoust Speech Signal Process

Signal Analysis and Interpretation Lab, University of Southern California, CA 90089.

Published: May 2014

We present Barista, an open-source framework for concurrent speech processing based on the Kaldi speech recognition toolkit and the libcppa actor library. With Barista, we aim to provide an easy-to-use, extensible framework for constructing highly customizable concurrent (and/or distributed) networks for a variety of speech processing tasks. Each Barista network specifies a flow of data between simple actors (concurrent entities communicating by message passing) modeled after Kaldi tools. Leveraging the fast and reliable concurrency and distribution mechanisms provided by libcppa, Barista allows demanding speech processing tasks, such as real-time speech recognizers and complex training workflows, to be scheduled and executed on parallel (and/or distributed) hardware. Barista is released under the Apache License v2.0.
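The idea of a data flow between actors that communicate only by message passing can be illustrated with a minimal sketch. It is written in Python with the standard library and is not Barista's or libcppa's API: the Actor class, the STOP sentinel, run_pipeline, and the two toy stages (a chunk "energy" computation and a threshold-based speech/silence label) are all hypothetical names chosen for illustration, standing in for the Kaldi-style processing stages the abstract mentions.

```python
import queue
import threading

STOP = object()  # hypothetical sentinel message telling an actor to shut down


class Actor:
    """Illustrative actor: a thread that owns an inbox and reacts to messages."""

    def __init__(self, handler, downstream=None):
        self.inbox = queue.Queue()    # messages are the only way to reach the actor
        self.handler = handler        # function applied to each incoming message
        self.downstream = downstream  # next actor in the data flow, if any
        self.thread = threading.Thread(target=self._run)

    def start(self):
        self.thread.start()
        return self

    def send(self, message):
        self.inbox.put(message)

    def _run(self):
        while True:
            message = self.inbox.get()
            if message is STOP:
                # Propagate shutdown along the network, then exit.
                if self.downstream is not None:
                    self.downstream.send(STOP)
                return
            result = self.handler(message)
            if self.downstream is not None:
                self.downstream.send(result)


def run_pipeline(chunks, threshold=1.0):
    """Push audio chunks through a toy three-actor network and collect labels."""
    labels = []
    # Toy stages, assumed for illustration only (not Kaldi algorithms).
    collector = Actor(labels.append).start()
    labeler = Actor(
        lambda energy: "speech" if energy > threshold else "silence", collector
    ).start()
    energy = Actor(lambda chunk: sum(x * x for x in chunk), labeler).start()

    for chunk in chunks:
        energy.send(chunk)
    energy.send(STOP)

    for actor in (energy, labeler, collector):
        actor.thread.join()
    return labels


if __name__ == "__main__":
    print(run_pipeline([[0.0, 0.0], [1.0, 2.0]]))
```

Each stage runs concurrently in its own thread and shares no state with the others except through its inbox, so stages can be rearranged or replaced without changing their neighbours. A real actor library would additionally handle scheduling and distribution across machines.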

Full text (PMC): http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5012110
DOI: http://dx.doi.org/10.1109/ICASSP.2014.6854212

Similar Publications

ClinClip: a Multimodal Language Pre-training model integrating EEG data for enhanced English medical listening assessment.

Front Neurosci

January 2025

The Basic Department, The Tourism College of Changchun University, Changchun, China.

Guangyu Sun

Introduction: In the field of medical listening assessments, accurate transcription and effective cognitive load management are critical for enhancing healthcare delivery. Traditional speech recognition systems, while successful in general applications, often struggle in medical contexts where the cognitive state of the listener plays a significant role. These conventional methods typically rely on audio-only inputs and cannot account for the listener's cognitive load, leading to reduced accuracy and effectiveness in complex medical environments.

Ghadeer-speech-crowd-corpus: Speech dataset.

Data Brief

February 2025

Computer Science Department, College of Science, University of Baghdad, Iraq.

Ghadeer Qasim Ali, Husam Ali Abdulmohsin

The availability of raw data is a considerable challenge across most branches of science. Without data, experiments cannot be conducted and development cannot be undertaken. Despite their importance, raw data are still lacking across many scientific fields.

Flexibility between immersion and distancing: Relationship with depressive symptoms and therapeutic alliance.

Psychother Res

January 2025

Department of Social and Behavioural Sciences, University of Maia, Maia, Portugal.

Ricardo Lisboa, Eunice Barbosa, Inês Moura, João Salgado, Marlene Sousa

Objectives: High levels of change are linked to the flexibility between immersion and distancing, as well as to higher levels of therapeutic alliance. This study aims to explore the evolution of flexibility between immersion and distancing throughout the entire therapeutic process and its relationship with therapeutic alliance and depressive symptoms in a clinical case.

Method: We analyzed five sessions of a good outcome case of depression undergoing cognitive-behavioral therapy.

Linear incrementality in focus and accentuation processing during sentence production: evidence from eye movements.

Front Hum Neurosci

January 2025

Department of Psychology, Renmin University of China, Beijing, China.

Zhenghua Zhang, Qingfang Zhang

Introduction: While considerable research in language production has focused on incremental processing during conceptual and grammatical encoding, prosodic encoding remains less investigated. This study examines whether focus and accentuation processing in speech production follows linear or hierarchical incrementality.

Methods: We employed visual world eye-tracking to investigate how focus and accentuation are processed during sentence production.

Mapping subcortical brain lesions, behavioral and acoustic analysis for early assessment of subacute stroke patients with dysarthria.

Front Neurosci

January 2025

Guangdong-Hong Kong-Macao Joint Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China.

Juan Liu, Rukiye Ruzi, Chuyao Jian, Qiuyu Wang, Shuzhi Zhao

Introduction: Dysarthria is a motor speech disorder frequently associated with subcortical damage. However, the precise roles of the subcortical nuclei, particularly the basal ganglia and thalamus, in the speech production process remain poorly understood.

Methods: The present study aimed to better understand their roles by mapping neuroimaging, behavioral, and speech data obtained from subacute stroke patients with subcortical lesions.
