Background: Many people benefit from the additional visual information provided by a speaker's lip movements, but lip reading is very error prone. Lip-reading algorithms based on artificial neural networks significantly improve word recognition but are not available for the German language.
Materials and Methods: A total of 1806 video clips, each showing a single German-speaking person, were selected, split into word segments, and assigned to word classes using speech-recognition software. A neural network was trained and validated on 18 polysyllabic, visually distinguishable words in 38,391 video segments from 32 speakers. A 3D convolutional neural network, a gated recurrent unit (GRU) model, and a combination of both (GRUConv) were compared, as were different image sections and color spaces of the videos. Accuracy was determined over 5000 training epochs.
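The comparison of image sections and color spaces can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the toy frame, the crop coordinates, and the choice of grayscale and HSV as example color spaces are all assumptions, since the abstract does not name the color spaces or crop regions used.

```python
import colorsys

def crop(frame, top, left, height, width):
    """Return the rectangular image section starting at (top, left)."""
    return [row[left:left + width] for row in frame[top:top + height]]

def to_gray(frame):
    """Convert RGB pixels (floats in [0, 1]) to luma with ITU-R BT.601 weights."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in frame]

def to_hsv(frame):
    """Convert RGB pixels (floats in [0, 1]) to HSV tuples."""
    return [[colorsys.rgb_to_hsv(r, g, b) for (r, g, b) in row]
            for row in frame]

# Hypothetical 4x4 RGB frame; a real frame would come from a video decoder.
frame = [[(x / 3, y / 3, 0.5) for x in range(4)] for y in range(4)]

# Hypothetical lip region: the lower-middle 2x2 block of the frame.
lips = crop(frame, top=2, left=1, height=2, width=2)
lips_gray = to_gray(lips)
lips_hsv = to_hsv(lips)
```

Each variant (full frame or lip crop, in each color space) would then be fed to the same network so that the resulting accuracies can be compared.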
Results: Comparison of the color spaces revealed no relevant differences in correct classification rates, which ranged from 69% to 72%. Cropping the video to the lips yielded a significantly higher accuracy (70%) than cropping to the speaker's entire face (34%). With the GRUConv model, the maximum accuracies were 87% with known speakers and 63% in the validation with unknown speakers.
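The distinction between known and unknown speakers implies that some speakers are withheld from training entirely. The sketch below shows one way such a speaker-disjoint split and the accuracy calculation could look. The sample layout, the `held_out_fraction` parameter, and the function names are hypothetical; the abstract does not describe how the authors partitioned their 32 speakers.

```python
import random

def split_by_speaker(samples, held_out_fraction=0.25, seed=0):
    """Split samples so that held-out speakers never appear in training."""
    speakers = sorted({s["speaker"] for s in samples})
    rng = random.Random(seed)
    rng.shuffle(speakers)
    n_held = max(1, round(len(speakers) * held_out_fraction))
    held_out = set(speakers[:n_held])
    train = [s for s in samples if s["speaker"] not in held_out]
    unknown = [s for s in samples if s["speaker"] in held_out]
    return train, unknown

def accuracy(predicted, actual):
    """Fraction of word segments assigned to the correct word class."""
    if not actual or len(predicted) != len(actual):
        raise ValueError("need two non-empty label lists of equal length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

# Hypothetical data: 8 speakers with 3 labelled word segments each.
samples = [{"speaker": f"spk{i}", "word": f"word{j}"}
           for i in range(8) for j in range(3)]
train, unknown = split_by_speaker(samples)
```

Accuracy on `unknown` corresponds to validation with unknown speakers, while accuracy on further segments from the training speakers corresponds to the known-speaker case.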
Conclusion: This neural network for lip reading, the first developed for the German language, shows a very high level of accuracy, comparable to that of English-language algorithms. It also works with unknown speakers and can be generalized to more word classes.
Full text (PMC): http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9160146
DOI: http://dx.doi.org/10.1007/s00106-021-01143-9