From Text to Translation: Using Language Models to Prioritize Variants for Clinical Review.

medRxiv

Division of Genetics, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, 02115, MA, United States.

Published: December 2024

Despite rapid advances in genomic sequencing, most rare genetic variants remain insufficiently characterized for clinical use, limiting the potential of personalized medicine. When classifying whether a variant is pathogenic, clinical labs adhere to diagnostic guidelines that comprehensively evaluate many forms of evidence, including case data, computational predictions, and functional screening. While a substantial amount of clinical evidence has been developed for these variants, the majority cannot be definitively classified as 'pathogenic' or 'benign', and thus persist as 'Variants of Uncertain Significance' (VUS). We processed over 2.4 million plaintext variant summaries from ClinVar, using sentence-level classification to remove content that does not contain evidence and discarding uninformative summaries. We developed ClinVar-BERT to discern clinical evidence within these summaries by fine-tuning a BioBERT-based model with labeled records. When we validated classifications from this model against orthogonal functional screening data, ClinVar-BERT significantly separated estimates of functional impact in clinically actionable genes, including (p = × ), (p = × ), and (p = × ). Additionally, ClinVar-BERT achieved an AUROC of 0.927 in classifying ClinVar VUS against this functional screening data. This suggests that ClinVar-BERT is capable of discerning evidence from diagnostic reports and can be used to prioritize variants for re-assessment by diagnostic labs and expert curation panels.
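
For readers unfamiliar with the AUROC metric reported above, the following is a minimal sketch of how such a value can be computed from model scores and binary reference labels. It is not the authors' evaluation code, which the abstract does not show. The function name `auroc`, the variable names, and the toy numbers are all hypothetical and are not data from the study.

```python
def auroc(scores, labels):
    """Rank-based AUROC: the probability that a randomly chosen positive
    example receives a higher score than a randomly chosen negative one.
    Tied scores count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): model scores for six variants
# and binary labels standing in for functional screening outcomes
# (1 = functionally abnormal, 0 = functionally normal).
model_scores = [0.91, 0.80, 0.65, 0.40, 0.30, 0.10]
functional_labels = [1, 1, 0, 1, 0, 0]

print(auroc(model_scores, functional_labels))
```

An AUROC of 0.5 corresponds to random ranking and 1.0 to perfect separation. In the toy data, 8 of the 9 positive-negative pairs are ranked correctly, so the sketch returns 8/9.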

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11722495
DOI: http://dx.doi.org/10.1101/2024.12.31.24319792
