Encapsulating a patient's clinical narrative into a condensed, informative summary is indispensable to clinical coding. The intricate nature of clinical text makes the summarisation process challenging for clinical coders. Recent developments in large language models (LLMs) have shown promising performance in clinical text summarisation, particularly for radiology and echocardiographic reports, after adaptation to the clinical domain. To explore the summarisation potential of clinical domain adaptation of LLMs, a clinical text dataset was curated, consisting of electronic medical records paired with "Brief Hospital Course" summaries from the MIMIC-III database. Two open-source LLMs, one pre-trained on biomedical datasets and the other on general-domain content, were then fine-tuned on the curated clinical dataset. The performance of the fine-tuned models was evaluated against that of their base models. The model pre-trained on biomedical data demonstrated superior performance after clinical domain adaptation. This finding highlights the potential benefit of adapting LLMs pre-trained on a related domain rather than a more generalised one, and suggests a possible role for clinically adapted LLMs as an assistive tool for clinical coders. Future work should explore adapting more advanced models and higher-quality clinical datasets to further enhance performance.
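The adaptation step described in the abstract corresponds to standard supervised fine-tuning of a causal LLM on paired note–summary examples. The sketch below is a minimal illustration using the Hugging Face Trainer API; the base-model path, file names, prompt format, and hyperparameters are illustrative assumptions (MIMIC-III requires credentialed access and the study's exact models and training setup are not specified here).

```python
# Minimal sketch (not the authors' code): supervised fine-tuning of an
# open-source causal LLM on EMR-notes -> "Brief Hospital Course" pairs.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

MODEL_NAME = "path/to/biomedical-base-model"  # placeholder; substitute a real checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # many causal LMs ship without a pad token

# Assumed JSONL layout: {"notes": "<EMR text>", "summary": "<Brief Hospital Course>"}
dataset = load_dataset(
    "json", data_files={"train": "train.jsonl", "validation": "val.jsonl"}
)

def format_and_tokenize(example):
    # Prompt-completion format: the model learns to continue the clinical
    # notes with the Brief Hospital Course summary.
    text = (
        "### Clinical notes:\n" + example["notes"]
        + "\n\n### Brief Hospital Course:\n" + example["summary"]
        + tokenizer.eos_token
    )
    return tokenizer(text, truncation=True, max_length=1024)

tokenized = dataset.map(
    format_and_tokenize, remove_columns=dataset["train"].column_names
)

model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

args = TrainingArguments(
    output_dir="bhc-summariser",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,
    num_train_epochs=3,
    learning_rate=2e-5,
    logging_steps=50,
    save_strategy="epoch",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    # mlm=False gives next-token (causal) labels rather than masked-LM labels.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)

trainer.train()
print(trainer.evaluate())  # held-out loss; summary-quality metrics would be computed separately
```

The same recipe would be run once per base model (biomedical and general-domain) so the fine-tuned checkpoints can be compared against their respective unadapted baselines.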
DOI: http://dx.doi.org/10.3233/SHTI240886