AI Article Synopsis

  • Medical texts are difficult to manage and time-consuming to curate manually, prompting the development of NLP algorithms to automate this process for improved efficiency in the biomedical field.
  • The study introduces Ascle, a user-friendly tool designed for biomedical researchers that offers generative functions like question-answering and text summarization, along with 12 essential NLP functions and search capabilities.
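  • After fine-tuning 32 language models and building a RAG framework for question-answering, the authors report improved text generation: the machine translation BLEU score rose by 20.27 and the question-answering ROUGE-L score rose by 18% over vanilla models. Physician validation rated generated answers high on readability and relevancy but lower on accuracy and completeness.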

Article Abstract

Background: Medical texts present significant domain-specific challenges, and manually curating these texts is a time-consuming and labor-intensive process. To address this, natural language processing (NLP) algorithms have been developed to automate text processing. In the biomedical field, various toolkits for text processing exist, which have greatly improved the efficiency of handling unstructured text. However, these existing toolkits tend to emphasize different perspectives, and none of them offer generation capabilities, leaving a significant gap in the current offerings.

Objective: This study aims to describe the development and preliminary evaluation of Ascle. Ascle is tailored for biomedical researchers and clinical staff with an easy-to-use, all-in-one solution that requires minimal programming expertise. For the first time, Ascle provides 4 advanced and challenging generative functions: question-answering, text summarization, text simplification, and machine translation. In addition, Ascle integrates 12 essential NLP functions, along with query and search capabilities for clinical databases.

Methods: We fine-tuned 32 domain-specific language models and evaluated them thoroughly on 27 established benchmarks. For the question-answering task, we developed a retrieval-augmented generation (RAG) framework for large language models that incorporated a medical knowledge graph with ranking techniques to enhance the reliability of generated answers. We also conducted a physician validation to assess the quality of generated content beyond automated metrics.
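
The retrieve-then-rank idea behind a knowledge-graph RAG pipeline can be illustrated with a minimal Python sketch. This is not Ascle's implementation: the triples, the bag-of-words cosine ranking, the prompt wording, and the names `TRIPLES`, `score`, `retrieve`, and `build_prompt` are all assumptions made for illustration. The abstract does not specify which knowledge graph or ranking techniques were used.

```python
import math
from collections import Counter

# Hypothetical toy knowledge graph of (head, relation, tail) triples.
# A real system would query a curated medical knowledge graph instead.
TRIPLES = [
    ("metformin", "treats", "type 2 diabetes"),
    ("metformin", "may cause", "lactic acidosis"),
    ("aspirin", "treats", "fever"),
    ("insulin", "treats", "type 1 diabetes"),
]


def tokenize(text):
    """Lowercase, split on whitespace, and strip simple punctuation."""
    tokens = (t.strip(".,?!").lower() for t in text.split())
    return [t for t in tokens if t]


def score(question, triple):
    """Cosine similarity between bag-of-words vectors (assumed ranking rule)."""
    q = Counter(tokenize(question))
    d = Counter(tokenize(" ".join(triple)))
    dot = sum(q[t] * d[t] for t in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(
        sum(v * v for v in d.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, triples, k=2):
    """Return the k triples ranked most similar to the question."""
    return sorted(triples, key=lambda t: score(question, t), reverse=True)[:k]


def build_prompt(question, triples, k=2):
    """Prepend the top-ranked facts to the question before calling an LLM."""
    facts = "\n".join(f"- {h} {r} {t}" for h, r, t in retrieve(question, triples, k))
    return f"Use the facts below to answer.\nFacts:\n{facts}\nQuestion: {question}\nAnswer:"


if __name__ == "__main__":
    print(build_prompt("Which drug treats type 2 diabetes?", TRIPLES))
```

The resulting prompt would then be passed to a language model. Grounding the model in ranked facts, rather than letting it answer from its parameters alone, is what a RAG design relies on to make answers more reliable.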

Results: The fine-tuned models and RAG framework consistently enhanced text generation tasks. For example, the fine-tuned models improved the machine translation task by 20.27 in terms of BLEU score. In the question-answering task, the RAG framework raised the ROUGE-L score by 18% over the vanilla models. Physician validation of generated answers showed high scores for readability (4.95/5) and relevancy (4.43/5), with lower scores for accuracy (3.90/5) and completeness (3.31/5).
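
For readers unfamiliar with the ROUGE-L metric cited above, the following Python sketch shows how it is commonly computed: an F-measure over the longest common subsequence (LCS) of candidate and reference tokens. It is a simplified illustration that assumes whitespace tokenization and a single reference. The study's exact evaluation script and settings are not given in the abstract, and the function names here are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference, beta=1.0):
    """ROUGE-L F-measure; beta=1 weights precision and recall equally."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)


if __name__ == "__main__":
    generated = "metformin is used to treat type 2 diabetes"
    gold = "metformin treats type 2 diabetes"
    print(f"ROUGE-L: {rouge_l(generated, gold):.3f}")
```

Because ROUGE-L rewards token overlap with a reference answer rather than clinical correctness, a high score does not guarantee an accurate answer, which is one reason a physician validation is a useful complement to automated metrics.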

Conclusions: This study introduces the development and evaluation of Ascle, a user-friendly NLP toolkit designed for medical text generation. All code is publicly available through the Ascle GitHub repository. All fine-tuned language models can be accessed through Hugging Face.

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11487205
http://dx.doi.org/10.2196/60601

Publication Analysis

Top Keywords

text generation: 12
language models: 12
rag framework: 12
natural language: 8
language processing: 8
text: 8
medical text: 8
development evaluation: 8
text processing: 8
evaluation ascle: 8

