Identification of an ANCA-Associated Vasculitis Cohort Using Deep Learning and Electronic Health Records.

medRxiv

Rheumatology and Allergy Clinical Epidemiology Research Center and Division of Rheumatology, Allergy, and Immunology, and Mongan Institute, Department of Medicine, Massachusetts General Hospital, Boston, MA, USA.

Published: June 2024

Background: ANCA-associated vasculitis (AAV) is a rare but serious disease. Traditional case-identification methods using claims data can be time-intensive and may miss important subgroups. We hypothesized that a deep learning model analyzing electronic health records (EHR) could identify AAV cases more accurately.

Methods: We examined the Mass General Brigham (MGB) repository of clinical documentation from 12/1/1979 to 5/11/2021, using expert-curated keywords and ICD codes to identify a large cohort of potential AAV cases. Three labeled datasets (I, II, III) were created, each containing note sections. We trained and evaluated a range of machine learning and deep learning algorithms for note-level classification, using positive predictive value (PPV), sensitivity, F-score, area under the receiver operating characteristic curve (AUROC), and area under the precision-recall curve (AUPRC) as metrics. The deep learning model was further evaluated for its ability to classify AAV cases at the patient level, compared with rule-based algorithms, in 2,000 randomly chosen samples.
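The note-level metrics named in the Methods can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the function names, the 0.5 threshold, and the toy labels and scores are all hypothetical, chosen only to show how PPV, sensitivity, F-score, and AUROC are defined. AUPRC is omitted for brevity.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (TP, FP, FN, TN) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def ppv(tp: int, fp: int) -> float:
    """Positive predictive value (precision): TP / (TP + FP)."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def sensitivity(tp: int, fn: int) -> float:
    """Sensitivity (recall): TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score; beta = 1 gives the harmonic mean of precision and recall."""
    denom = beta**2 * precision + recall
    return (1 + beta**2) * precision * recall / denom if denom else 0.0


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive outscores a random
    negative, counting ties as one half (O(n^2), fine for small examples)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = note section supports AAV, 0 = it does not.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.6]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print(ppv(tp, fp), sensitivity(tp, fn), auroc(y_true, scores))
```

PPV and sensitivity depend on the chosen threshold, whereas AUROC summarizes ranking quality across all thresholds, which is one reason a study might report both kinds of metric.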

Results: Datasets I, II, and III comprised 6,000, 3,008, and 7,500 note sections, respectively. Deep learning achieved the highest AUROC in all three datasets, with scores of 0.983, 0.991, and 0.991. The deep learning approach also had among the highest PPVs across the three datasets (0.941, 0.954, and 0.800, respectively). In a test cohort of 2,000 cases, the deep learning model achieved a PPV of 0.262 and an estimated sensitivity of 0.975. Compared to the best rule-based algorithm, the deep learning model identified six additional AAV cases, representing 13% of the total.
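The abstract does not say how note-level predictions were combined into patient-level classifications. The sketch below assumes one simple rule, purely for illustration: a patient is flagged when any of their note sections scores at or above a threshold. The function names, threshold, patient IDs, and scores are hypothetical and are not taken from the study.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def flag_patients(note_scores: Iterable[Tuple[str, float]], threshold: float = 0.5) -> Set[str]:
    """Flag a patient if their highest-scoring note section reaches the threshold.

    note_scores holds (patient_id, score) pairs, with scores assumed to lie in [0, 1].
    """
    best: Dict[str, float] = defaultdict(float)
    for patient_id, score in note_scores:
        best[patient_id] = max(best[patient_id], score)
    return {pid for pid, s in best.items() if s >= threshold}


def patient_level_ppv(flagged: Set[str], confirmed: Set[str]) -> float:
    """Fraction of flagged patients who are confirmed cases."""
    return len(flagged & confirmed) / len(flagged) if flagged else 0.0


def patient_level_sensitivity(flagged: Set[str], confirmed: Set[str]) -> float:
    """Fraction of confirmed cases that were flagged."""
    return len(flagged & confirmed) / len(confirmed) if confirmed else 0.0


# Hypothetical toy data: several note sections per patient.
notes = [("p1", 0.2), ("p1", 0.9), ("p2", 0.4), ("p3", 0.7), ("p4", 0.1), ("p4", 0.6)]
confirmed_cases = {"p1"}  # assumed chart-review result

flagged = flag_patients(notes)
print(sorted(flagged))
print(patient_level_ppv(flagged, confirmed_cases))
print(patient_level_sensitivity(flagged, confirmed_cases))
```

Under an "any note" rule, a single high-scoring section is enough to flag a patient. That tends to favor sensitivity over PPV when true cases are rare, which is at least consistent in direction with the high sensitivity and low PPV reported for the 2,000-case test cohort.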

Conclusion: The deep learning model effectively classifies clinical note sections for AAV diagnosis. Its application to EHR notes can potentially uncover additional cases missed by traditional rule-based methods.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11213085
DOI: http://dx.doi.org/10.1101/2024.06.09.24308603

Publication Analysis

Top Keywords

deep learning: 36
learning model: 20
aav cases: 12
note sections: 12
learning: 10
deep: 9
anca-associated vasculitis: 8
electronic health: 8
health records: 8
datasets iii: 8

Similar Publications

Background: Comprehensive clinical data regarding factors influencing the individual disease course of patients with movement disorders treated with deep brain stimulation might help to better understand disease progression and to develop individualized treatment approaches.

Methods: The clinical core data set was developed by a multidisciplinary working group within the German transregional collaborative research network ReTune. The development followed standardized methodology comprising review of available evidence, a consensus process and performance of the first phase of the study.

AiGPro: a multi-tasks model for profiling of GPCRs for agonist and antagonist.

J Cheminform

January 2025

School of Systems Biomedical Science, Soongsil University, 369 Sangdo-ro, Dongjak-gu, 06978, Seoul, Republic of Korea.

G protein-coupled receptors (GPCRs) play vital roles in various physiological processes, making them attractive drug discovery targets. Meanwhile, deep learning techniques have revolutionized drug discovery by facilitating efficient tools for expediting the identification and optimization of ligands. However, existing models for the GPCRs often focus on single-target or a small subset of GPCRs or employ binary classification, constraining their applicability for high throughput virtual screening.

Enhancing furcation involvement classification on panoramic radiographs with vision transformers.

BMC Oral Health

January 2025

Department of Periodontics, Affiliated Hospital of Medical School, Nanjing Stomatological Hospital, Research Institute of Stomatology, Nanjing University, Nanjing, China.

Background: The severity of furcation involvement (FI) directly affected tooth prognosis and influenced treatment approaches. However, assessing, diagnosing, and treating molars with FI was complicated by anatomical and morphological variations. Cone-beam computed tomography (CBCT) enhanced diagnostic accuracy for detecting FI and measuring furcation defects.

Learning by making - student-made models and creative projects for medical education: systematic review with qualitative synthesis.

BMC Med Educ

January 2025

Department of Anatomy, Clinical Sciences Building, Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore, 308323, Singapore.

Study Objective: Student-centered learning and unconventional teaching modalities are gaining popularity in medical education. One notable approach involves engaging students in producing creative projects to complement the learning of preclinical topics. A systematic review was conducted to characterize the impact of creative project-based learning on metacognition and knowledge gains in medical students.

scSMD: a deep learning method for accurate clustering of single cells based on auto-encoder.

BMC Bioinformatics

January 2025

Department of Surgery, Shanghai Key Laboratory of Gastric Neoplasms, Shanghai Institute of Digestive Surgery, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China.

Background: Single-cell RNA sequencing (scRNA-seq) has transformed biological research by offering new insights into cellular heterogeneity, developmental processes, and disease mechanisms. As scRNA-seq technology advances, its role in modern biology has become increasingly vital. This study explores the application of deep learning to single-cell data clustering, with a particular focus on managing sparse, high-dimensional data.
