In the past half decade, automatic speech recognition techniques and the supporting software and hardware have matured enough to support sophisticated medical applications. The project described here aimed to introduce a computer-based, voice-controlled prototype system in a simulated vitreo-retinal surgery scenario. The goal was to provide the surgeon with a tool that could significantly improve the quality and ease of work and shorten the duration of the intervention. The speech recognition system allows voice entry of simple commands to simulate surgical instrument control, including the infusion pump, vitreous cutter and diathermy. The project relies on a Markov-based, speaker-dependent, commercial isolated-word recognizer and consists of a specific recognition vocabulary and application software created and developed by the authors. Results have been encouraging: the system performed well under the test conditions, proving robust, simple to use and accurate (over 97% average word recognition rate). On the basis of their experience, the authors believe that automatic speech recognition technology, though limited by the need for training, speaker dependence and a relatively small vocabulary, and requiring extensive testing under operating conditions, merits further development and opens new perspectives for a possible new generation of surgical instruments.
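To make the command-and-control idea concrete, the following is a minimal sketch of the application layer only: the recognizer itself was a commercial Markov-based isolated-word engine, so it is stubbed out here, and every command phrase, threshold and handler name is a hypothetical illustration rather than the authors' actual vocabulary or software.

```python
# Sketch of a voice-command dispatch layer for simulated surgical instruments.
# The commercial isolated-word recognizer is represented only by its output
# (word + confidence score); all names and values below are hypothetical.

from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class RecognitionResult:
    word: str          # best-matching vocabulary entry returned by the engine
    confidence: float  # engine score in [0, 1]


class InstrumentConsole:
    """Simulated surgical instruments driven by voice commands."""

    def infusion_pump(self, on: bool) -> None:
        print(f"infusion pump {'ON' if on else 'OFF'}")

    def vitreous_cutter(self, on: bool) -> None:
        print(f"vitreous cutter {'ON' if on else 'OFF'}")

    def diathermy(self, on: bool) -> None:
        print(f"diathermy {'ON' if on else 'OFF'}")


def build_vocabulary(console: InstrumentConsole) -> Dict[str, Callable[[], None]]:
    # Isolated-word vocabulary: one short phrase per instrument action.
    return {
        "pump on": lambda: console.infusion_pump(True),
        "pump off": lambda: console.infusion_pump(False),
        "cutter on": lambda: console.vitreous_cutter(True),
        "cutter off": lambda: console.vitreous_cutter(False),
        "diathermy on": lambda: console.diathermy(True),
        "diathermy off": lambda: console.diathermy(False),
    }


def dispatch(result: RecognitionResult,
             vocabulary: Dict[str, Callable[[], None]],
             reject_below: float = 0.9) -> bool:
    """Execute the action for a recognized command, rejecting low-confidence hits."""
    if result.confidence < reject_below or result.word not in vocabulary:
        print(f"rejected: '{result.word}' (confidence {result.confidence:.2f})")
        return False
    vocabulary[result.word]()
    return True


if __name__ == "__main__":
    console = InstrumentConsole()
    vocab = build_vocabulary(console)
    # Stub recognizer output, standing in for the commercial engine.
    dispatch(RecognitionResult("cutter on", 0.97), vocab)
    dispatch(RecognitionResult("diathermy on", 0.55), vocab)  # rejected
```

A rejection threshold of this kind is one plausible way to trade a few missed commands for fewer false activations, which matters when the commands drive instruments rather than text entry.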


Source: http://dx.doi.org/10.1177/112067219600600420

Publication Analysis

Top Keywords

speech recognition (16); automatic speech (12); vitreo-retinal surgery (8); computer-based voice-controlled (8); recognition (6); recognition vitreo-retinal (4); surgery project (4); project prototypal (4); prototypal computer-based (4); voice-controlled vitrectomy (4)

Similar Publications

The dataset represents a significant advancement in Bengali lip-reading and visual speech recognition research, poised to drive future applications and technological progress. Despite Bengali's status as the seventh most spoken language globally, with approximately 265 million speakers, it has been largely overlooked by the research community. The dataset fills this gap by offering a pioneering resource tailored for Bengali lip-reading, comprising visual data from 150 speakers across 54 classes encompassing Bengali phonemes, alphabets, and symbols.


Objective: Measuring listening effort using pupillometry is challenging in cochlear implant (CI) users. We assess three validated speech tests (Matrix, LIST, and DIN) to identify the optimal speech material for measuring peak-pupil-dilation (PPD) in CI users as a function of signal-to-noise ratio (SNR).

Design: Speech tests were administered in quiet and two noisy conditions, namely at the speech recognition threshold (0 dB re SRT), i.
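As background for the outcome measure, peak pupil dilation is commonly computed as the maximum baseline-corrected pupil size within a trial window. The sketch below illustrates only that generic computation, with a hypothetical sampling rate and baseline duration; it is not the authors' analysis pipeline.

```python
# Generic peak-pupil-dilation (PPD) computation: maximum pupil size relative
# to the mean of a pre-stimulus baseline. Sampling rate and window lengths
# are hypothetical and depend on the eye tracker and protocol.

import numpy as np


def peak_pupil_dilation(trace: np.ndarray,
                        fs: float = 60.0,
                        baseline_s: float = 1.0) -> float:
    """Return the maximum dilation relative to the pre-stimulus baseline.

    trace      : pupil diameter samples for one trial (baseline + response)
    fs         : sampling rate in Hz (assumed value)
    baseline_s : duration of the baseline segment at the start of the trace
    """
    n_baseline = int(baseline_s * fs)
    baseline = np.nanmean(trace[:n_baseline])
    response = trace[n_baseline:]
    return float(np.nanmax(response - baseline))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic trial: flat baseline, then a slow dilation peaking mid-trial.
    t = np.linspace(0, 4, 240)
    trial = 3.0 + 0.4 * np.exp(-((t - 2.5) ** 2) / 0.3) \
        + 0.01 * rng.standard_normal(t.size)
    print(f"PPD = {peak_pupil_dilation(trial):.3f} mm")
```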


Tibetan-Chinese speech-to-speech translation based on discrete units.

Sci Rep

January 2025

Key Laboratory of Ethnic Language Intelligent Analysis and Security Governance of MOE, Minzu University of China, Beijing, 100081, China.

Speech-to-speech translation (S2ST) has evolved from cascade systems, which integrate Automatic Speech Recognition (ASR), Machine Translation (MT), and Text-to-Speech (TTS), to end-to-end models. This evolution has been driven by advancements in model performance and the expansion of cross-lingual speech datasets. Despite the paucity of research on Tibetan speech translation, this paper tackles Tibetan-to-Chinese direct speech-to-speech translation within a multi-task learning framework, employing self-supervised learning (SSL) and sequence-to-sequence model training.
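The contrast between the two families of systems mentioned in the abstract can be sketched schematically. Every component below is a stub with hypothetical names showing the data flow only, not the paper's actual models.

```python
# Schematic contrast: cascade S2ST (ASR -> MT -> TTS) versus a direct model
# that predicts target-language discrete units and synthesizes speech from
# them. All functions are stubs; names and return values are illustrative.

from typing import List

Waveform = List[float]  # placeholder for an audio signal


# --- cascade pipeline -------------------------------------------------------
def asr(tibetan_speech: Waveform) -> str:
    """Stub ASR: Tibetan speech -> Tibetan text."""
    return "<tibetan transcript>"


def mt(tibetan_text: str) -> str:
    """Stub MT: Tibetan text -> Chinese text."""
    return "<chinese translation>"


def tts(chinese_text: str) -> Waveform:
    """Stub TTS: Chinese text -> Chinese speech."""
    return [0.0]


def cascade_s2st(tibetan_speech: Waveform) -> Waveform:
    return tts(mt(asr(tibetan_speech)))


# --- direct, discrete-unit pipeline ----------------------------------------
def speech_to_units(tibetan_speech: Waveform) -> List[int]:
    """Stub sequence-to-sequence model predicting target-language discrete
    units (e.g. clustered SSL features) from source speech."""
    return [12, 7, 98]


def unit_vocoder(units: List[int]) -> Waveform:
    """Stub vocoder synthesizing Chinese speech from the discrete units."""
    return [0.0]


def direct_s2st(tibetan_speech: Waveform) -> Waveform:
    return unit_vocoder(speech_to_units(tibetan_speech))
```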


Some Challenging Questions About Outcomes in Children With Cochlear Implants.

Perspect ASHA Spec Interest Groups

December 2024

DeVault Otologic Research Laboratory, Department of Otolaryngology-Head and Neck Surgery, Indiana University School of Medicine, Indianapolis.

Purpose: Cochlear implants (CIs) have improved the quality of life for many children with severe-to-profound sensorineural hearing loss. Despite the reported CI benefits of improved speech recognition, speech intelligibility, and spoken language processing, large individual differences in speech and language outcomes are still consistently reported in the literature. The enormous variability in CI outcomes has made it challenging to predict which children may be at high risk for limited benefits and how potential risk factors can be addressed with interventions.


Introduction: It remains under debate whether and how semantic content modulates emotional prosody perception in children with autism spectrum disorder (ASD). The current study investigated this issue in two experiments by systematically manipulating semantic information in Chinese disyllabic words.

Method: The present study explored the potential modulatory effect of semantic content complexity on emotional prosody perception in Mandarin-speaking children with ASD.

