Background: Medical texts present significant domain-specific challenges, and manually curating these texts is a time-consuming and labor-intensive process. To address this, natural language processing (NLP) algorithms have been developed to automate text processing. In the biomedical field, various text-processing toolkits exist and have greatly improved the efficiency of handling unstructured text. However, these toolkits emphasize different perspectives, and none of them offers generation capabilities, leaving a significant gap in the current offerings.
Objective: This study aims to describe the development and preliminary evaluation of Ascle, a toolkit that offers biomedical researchers and clinical staff an easy-to-use, all-in-one solution requiring minimal programming expertise. For the first time, Ascle provides 4 advanced and challenging generative functions: question-answering, text summarization, text simplification, and machine translation. In addition, Ascle integrates 12 essential NLP functions, along with query and search capabilities for clinical databases.
Methods: We fine-tuned 32 domain-specific language models and evaluated them thoroughly on 27 established benchmarks. For the question-answering task, we also developed a retrieval-augmented generation (RAG) framework for large language models that incorporated a medical knowledge graph with ranking techniques to enhance the reliability of generated answers. Finally, we conducted a physician validation to assess the quality of generated content beyond automated metrics.
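The retrieval step of a knowledge-graph RAG pipeline can be illustrated with a minimal sketch. The abstract does not specify the ranking technique, the graph schema, or the prompt format, so everything below is an assumption: the triples are toy data, the ranking is plain token overlap standing in for whatever ranking Ascle uses, and the function names `rank_triples` and `build_prompt` are hypothetical and not part of the Ascle API.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def rank_triples(question, triples, top_k=2):
    """Rank (head, relation, tail) triples by token overlap with the question.

    Token overlap is an illustrative stand-in for a real ranking method.
    Triples with no overlap are dropped; ties keep their input order.
    """
    question_tokens = tokenize(question)
    scored = []
    for head, relation, tail in triples:
        fact = f"{head} {relation} {tail}"
        overlap = len(question_tokens & tokenize(fact))
        scored.append((overlap, fact))
    # sorted() is stable, so equally scored facts stay in input order.
    scored = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [fact for score, fact in scored[:top_k] if score > 0]


def build_prompt(question, facts):
    """Place the retrieved facts ahead of the question in a single prompt."""
    context = "\n".join(f"- {fact}" for fact in facts)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


# Toy knowledge graph (hypothetical data for illustration only).
toy_graph = [
    ("metformin", "treats", "type 2 diabetes"),
    ("aspirin", "treats", "headache"),
    ("metformin", "may cause", "nausea"),
]

question = "What does metformin treat?"
facts = rank_triples(question, toy_graph)
prompt = build_prompt(question, facts)
```

The resulting `prompt` would then be passed to a language model; grounding the model in retrieved facts is the general idea behind RAG, although the actual framework may differ in every detail.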
Results: The fine-tuned models and RAG framework consistently enhanced text generation tasks. For example, the fine-tuned models improved the machine translation task by 20.27 in terms of BLEU score. In the question-answering task, the RAG framework raised the ROUGE-L score by 18% over the vanilla models. Physician validation of generated answers showed high scores for readability (4.95/5) and relevancy (4.43/5), with lower scores for accuracy (3.90/5) and completeness (3.31/5).
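For readers unfamiliar with ROUGE-L, the metric is based on the longest common subsequence (LCS) between a reference text and a generated text. The sketch below is a simplified sentence-level version that assumes whitespace tokenization, lowercasing, and a balanced F1 score; it is not the evaluation script used in the study, and published ROUGE implementations differ in tokenization, stemming, and weighting.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    previous = [0] * (len(b) + 1)
    for token_a in a:
        current = [0]
        for j, token_b in enumerate(b, start=1):
            if token_a == token_b:
                current.append(previous[j - 1] + 1)
            else:
                current.append(max(previous[j], current[j - 1]))
        previous = current
    return previous[-1]


def rouge_l_f1(reference, candidate):
    """Simplified ROUGE-L: F1 of LCS-based precision and recall."""
    ref_tokens = reference.lower().split()
    cand_tokens = candidate.lower().split()
    if not ref_tokens or not cand_tokens:
        return 0.0
    lcs = lcs_length(ref_tokens, cand_tokens)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand_tokens)
    recall = lcs / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


score = rouge_l_f1("the cat sat on the mat", "the cat is on the mat")
```

In this toy pair the LCS is "the cat on the mat" (5 tokens out of 6 on each side), so precision and recall are both 5/6.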
Conclusions: This study introduces the development and evaluation of Ascle, a user-friendly NLP toolkit designed for medical text generation. All code is publicly available through the Ascle GitHub repository. All fine-tuned language models can be accessed through Hugging Face.
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11487205 | PMC |
| http://dx.doi.org/10.2196/60601 | DOI Listing |
PLOS Digit Health
January 2025
Rwanda Ministry of Health, Kigali, Rwanda.
Community isolation of patients with communicable infectious diseases limits spread of pathogens but our understanding of isolated patients' needs and challenges is incomplete. Rwanda deployed a digital health service nationally to assist public health clinicians to remotely monitor and support SARS-CoV-2 cases via their mobile phones using daily interactive short message service (SMS) check-ins. We aimed to assess the texting patterns and communicated topics to better understand patient experiences.
JAMA Netw Open
January 2025
Department of Psychiatry, School of Clinical Medicine, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Pokfulam, Hong Kong.
Importance: Mental health issues among young people are increasingly concerning. Conventional psychological interventions face challenges, including limited staffing, time commitment, and low completion rates.
Objective: To evaluate the effect of a low-intensity online intervention on young people in Hong Kong experiencing moderate or greater mental distress.
Pediatr Cardiol
January 2025
Department of Pediatric Cardiology, Seattle Children's Hospital, Seattle, WA, USA.
Fetal echocardiography (FE) is recommended for parents with congenital heart disease (pCHD) due to a 3-6% recurrence risk of congenital heart disease (CHD). This study aimed to evaluate the cost of FE for detecting neonatal CHD in pCHD. FE data were collected between 12/2015 and 12/2022.
J Am Med Inform Assoc
January 2025
Sinclair School of Nursing, University of Missouri, Columbia, MO 65211, United States.
Objective: This study aimed to explore the utilization of a fine-tuned language model to extract expressions related to the Age-Friendly Health Systems 4M Framework (What Matters, Medication, Mentation, and Mobility) from nursing home worker text messages, deploy automated mapping of these expressions to a taxonomy, and explore the created expressions and relationships.
Materials And Methods: The dataset included 21 357 text messages from healthcare workers in 12 Missouri nursing homes. A sample of 860 messages was annotated by clinical experts to form a "Gold Standard" dataset.
Eur Radiol
January 2025
Department of Radiology, Seoul National University College of Medicine, Seoul National University Hospital, Seoul, Republic of Korea.
Objective: This study aimed to develop an open-source multimodal large language model (CXR-LLaVA) for interpreting chest X-ray images (CXRs), leveraging recent advances in large language models (LLMs) to potentially replicate the image interpretation skills of human radiologists.
Materials And Methods: For training, we collected 592,580 publicly available CXRs, of which 374,881 had labels for certain radiographic abnormalities (Dataset 1) and 217,699 provided free-text radiology reports (Dataset 2). After pre-training a vision transformer with Dataset 1, we integrated it with an LLM influenced by the LLaVA network.