Automatic Diagnosis Labeling of Cardiovascular MRI by Using Semisupervised Natural Language Processing of Text Reports.

Sameer Zaman, Camille Petri, Kavitha Vimalesvaran, James Howard, Anil Bharath, Darrel Francis, Nicholas Peters, Graham D Cole, Nick Linton

Radiol Artif Intell

National Heart and Lung Institute, Imperial College London, Hammersmith Hospital, Du Cane Road, Second Floor B Block, London W12 0HS, England (S.Z., C.P., K.V., J.H., D.F., N.P., G.D.C.); Imperial College Healthcare National Health Service Trust, London, England (J.H., D.F., N.P., G.D.C., N.L.); and Department of Bioengineering, Imperial College London, London, England (A.B., N.L.).

Published: January 2022

Purpose: To assess whether the semisupervised natural language processing (NLP) of text from clinical radiology reports could provide useful automated diagnosis categorization for ground truth labeling to overcome manual labeling bottlenecks in the machine learning pipeline.

Materials and Methods: In this retrospective study, 1503 cardiac MRI text reports from 2016 to 2019 were manually annotated by clinicians for five diagnoses: normal, dilated cardiomyopathy (DCM), hypertrophic cardiomyopathy, myocardial infarction (MI), and myocarditis. A semisupervised method that uses bidirectional encoder representations from transformers (BERT) pretrained on 1.14 million scientific publications was fine-tuned by using the manually extracted labels, with the report dataset split into 801 reports for training, 302 for validation, and 400 for testing. The model's performance was compared with that of two traditional NLP models: a rule-based model and a support vector machine (SVM) model. The models' F1 scores and receiver operating characteristic curves were used to analyze performance.
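The 801/302/400 partition described above can be illustrated with a minimal sketch. The abstract does not say how reports were assigned to each group, so the seeded random shuffle, the function name `split_reports`, and the placeholder report identifiers are all assumptions made for illustration only.

```python
import random


def split_reports(reports, n_train=801, n_val=302, n_test=400, seed=0):
    """Partition reports into disjoint training, validation, and test groups.

    The group sizes come from the abstract; the seeded random shuffle is an
    assumption, since the assignment procedure is not described there.
    """
    if len(reports) != n_train + n_val + n_test:
        raise ValueError("group sizes must sum to the number of reports")
    shuffled = list(reports)  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


# Hypothetical placeholder identifiers standing in for the 1503 reports.
reports = [f"report_{i}" for i in range(1503)]
train, val, test = split_reports(reports)
```

Holding the 400 test reports out of both fine-tuning and validation is what allows the same test set to be reused for the rule-based and SVM comparisons.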

Results: After 15 epochs, the F1 scores on the test set of 400 reports were as follows: normal, 84%; DCM, 79%; hypertrophic cardiomyopathy, 86%; MI, 91%; and myocarditis, 86%. The pooled F1 score and area under the receiver operating characteristic curve were 86% and 0.96, respectively. On the same test set, the BERT model outperformed the rule-based model (F1 score, 42%) and the SVM model (F1 score, 82%). Using the diagnosis categories classified by the BERT model, 1000 MR images were labeled in 0.2 second.
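For readers unfamiliar with the metric, the following sketch shows one way per-diagnosis and pooled F1 scores could be computed. It assumes a single label per report and treats "pooled" as a micro-average over all classes; the abstract specifies neither, so both are assumptions. The toy labels are invented for illustration and are not study data, and `HCM` is used here only as shorthand for hypertrophic cardiomyopathy.

```python
from collections import Counter

# Diagnosis categories from the abstract ("HCM" is shorthand introduced here).
LABELS = ["normal", "DCM", "HCM", "MI", "myocarditis"]


def f1_from_counts(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def f1_scores(y_true, y_pred, labels=LABELS):
    """Return (per-class F1 dict, pooled F1), assuming one label per report."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    per_class = {c: f1_from_counts(tp[c], fp[c], fn[c]) for c in labels}
    # Assumption: "pooled" means micro-averaging the counts across classes.
    pooled = f1_from_counts(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return per_class, pooled


# Invented toy example, not data from the study.
y_true = ["normal", "DCM", "MI", "MI", "myocarditis", "HCM"]
y_pred = ["normal", "MI", "MI", "MI", "myocarditis", "normal"]
per_class, pooled = f1_scores(y_true, y_pred)
```

Under the single-label assumption, the micro-averaged F1 equals overall accuracy, which is one reason a pooled figure alone can hide weaker classes such as DCM in the results above.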

Conclusion: The developed model used labels extracted from radiology reports to provide automated diagnosis categorization of MR images with a high level of performance.

Keywords: Semisupervised Learning, Diagnosis/Classification/Application Domain, Named Entity Recognition, MRI

© RSNA, 2021.

Full text (PMC): http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8823679
DOI: http://dx.doi.org/10.1148/ryai.210085
