AI Article Synopsis

  • Effective genetic diagnosis relies on linking genetic data to detailed clinical information, but manual data entry is time-consuming and prone to bias.
  • Natural language processing (NLP) can streamline this process, but variations in physician notes pose challenges; our methods improve NLP outputs for more accurate automatic diagnosis.
  • We developed a filtering system that improves gene prioritization by optimizing the NLP-extracted terms. In 92% of prospectively evaluated cases, NLP-based extraction was a sufficient replacement for manual extraction, and in 75% of cases the correct causal gene ranked higher with the filters applied than without them.

Article Abstract

Effective genetic diagnosis requires the correlation of genetic variant data with detailed phenotypic information. However, manual encoding of clinical data into machine-readable forms is laborious and subject to observer bias. Natural language processing (NLP) of electronic health records has great potential to enhance reproducibility at scale but suffers from idiosyncrasies in physician notes and other medical records. We developed methods to optimize NLP outputs for automated diagnosis. We filtered NLP-extracted Human Phenotype Ontology (HPO) terms to more closely resemble manually extracted terms and identified filter parameters across a three-dimensional space for optimal gene prioritization. We then developed a tiered pipeline that reduces manual effort by prioritizing smaller subsets of genes to consider for genetic diagnosis. Our filtering pipeline enabled NLP-based extraction of HPO terms to serve as a sufficient replacement for manual extraction in 92% of prospectively evaluated cases. In 75% of cases, the correct causal gene was ranked higher with our applied filters than without any filters. We describe a framework that can maximize the utility of NLP-based phenotype extraction for gene prioritization and diagnosis. The framework is implemented within a cloud-based modular architecture that can be deployed across health and research institutions.
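The filtering idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not name the three filter dimensions, so the ones used here (mention count, ontology depth, NLP confidence) are assumptions chosen for illustration. The Jaccard-based prioritizer, the gene names, and the term identifiers are hypothetical placeholders as well.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class ExtractedTerm:
    """One NLP-extracted phenotype term (all fields are illustrative assumptions)."""
    term_id: str        # placeholder identifier, not a real HPO ID
    mentions: int       # how often the term was found in the notes
    depth: int          # depth in the ontology, used as a specificity proxy
    confidence: float   # extractor confidence in [0, 1]


def filter_terms(terms, min_mentions, min_depth, min_confidence):
    """Keep only terms that pass all three (assumed) thresholds."""
    return [
        t for t in terms
        if t.mentions >= min_mentions
        and t.depth >= min_depth
        and t.confidence >= min_confidence
    ]


def rank_of_gene(terms, gene_to_terms, causal_gene):
    """Toy prioritizer: rank genes by Jaccard similarity to the patient terms."""
    patient = {t.term_id for t in terms}
    scores = {
        gene: len(patient & annotated) / len(patient | annotated)
        for gene, annotated in gene_to_terms.items()
    }
    # Higher score first; ties broken alphabetically for determinism.
    ordered = sorted(scores, key=lambda g: (-scores[g], g))
    return ordered.index(causal_gene) + 1


def grid_search(terms, gene_to_terms, causal_gene, grid):
    """Scan the 3-D threshold grid and return (best_rank, best_params)."""
    best = None
    for params in product(*grid):
        kept = filter_terms(terms, *params)
        if not kept:
            continue  # a filter that removes every term is not usable
        rank = rank_of_gene(kept, gene_to_terms, causal_gene)
        if best is None or rank < best[0]:
            best = (rank, params)
    return best


# Hypothetical data: two noisy terms (T4, T5) point at the wrong gene.
gene_to_terms = {"GENE_A": {"T1", "T2", "T3"}, "GENE_B": {"T4", "T5"}}
terms = [
    ExtractedTerm("T1", mentions=3, depth=5, confidence=0.9),
    ExtractedTerm("T2", mentions=2, depth=4, confidence=0.8),
    ExtractedTerm("T4", mentions=1, depth=2, confidence=0.4),
    ExtractedTerm("T5", mentions=1, depth=1, confidence=0.5),
]

unfiltered_rank = rank_of_gene(terms, gene_to_terms, "GENE_A")
best_rank, best_params = grid_search(
    terms, gene_to_terms, "GENE_A", grid=([1, 2], [1, 4], [0.0, 0.6])
)
print(unfiltered_rank, best_rank, best_params)
```

In this toy case the noisy terms push the wrong gene to the top when no filter is applied, and a stricter threshold restores the causal gene to first place. A real evaluation would tune the thresholds over many solved cases rather than a single patient.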

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8432593
DOI: http://dx.doi.org/10.1016/j.xhgg.2021.100035

Publication Analysis

Top Keywords

natural language: 8
language processing: 8
genetic diagnosis: 8
hpo terms: 8
gene prioritization: 8
data-driven architecture: 4
architecture natural: 4
processing improve: 4
improve phenotyping: 4
phenotyping efficiency: 4

Similar Publications

Purpose: This scoping review aims to summarize online health information seeking (OHIS) behavior among breast cancer patients and survivors, identify research gaps, and offer insights for future studies.

Methods: Following Arksey and O'Malley's framework, we conducted a review across PubMed, Web of Science, CINAHL, MEDLINE, Cochrane, Embase, CNKI, Wanfang Data, and SinoMed, covering literature from 1 January 2014 to 13 August 2023. A total of 1,368 articles were identified, with 33 meeting the inclusion criteria.

Introduction: Mental disorders, such as anxiety and depression, significantly impacted global populations in 2019 and 2020, with COVID-19 causing a surge in prevalence. They affect 13.4% of people worldwide, and 21% of Iranians have experienced them.

Genomic language models: opportunities and challenges.

Trends Genet

January 2025

Computer Science Division, University of California, Berkeley, CA, USA; Department of Statistics, University of California, Berkeley, CA, USA; Center for Computational Biology, University of California, Berkeley, CA, USA.

Large language models (LLMs) are having transformative impacts across a wide range of scientific fields, particularly in the biomedical sciences. Just as the goal of natural language processing is to understand sequences of words, a major objective in biology is to understand biological sequences. Genomic language models (gLMs), which are LLMs trained on DNA sequences, have the potential to significantly advance our understanding of genomes and how DNA elements at various scales interact to give rise to complex functions.

Perspectives surrounding robotic total hip arthroplasty: a cross-sectional analysis using natural language processing.

Can J Surg

January 2025

From the Faculty of Medicine, Université de Montréal, Montréal, Que. (Levett); the Department of Neurology and Neurosurgery, McGill University, Montréal, Que. (Elkaim); the Department of Orthopaedic Surgery, McGill University, Jewish General Hospital, Montréal, Que. (Zukor, Huk, Antoniou)

Background: Robotic technology has been used in total hip arthroplasty (THA) for several years. Despite the advances in this field, perspectives surrounding robotic THA are not fully understood. This study aimed to characterize the landscape of robotic THA on social media.

Applying AI to Structured Real-World Data for Pharmacovigilance Purposes: Scoping Review.

J Med Internet Res

December 2024

Laboratoire d'Informatique Médicale et d'Ingénierie des Connaissances en e-Santé - LIMICS, Inserm, Université Sorbonne Paris-Nord, Sorbonne Université, Paris, France.

Background: Artificial intelligence (AI) applied to real-world data (RWD; eg, electronic health care records) has been identified as a potentially promising technical paradigm for the pharmacovigilance field. There are several instances of AI approaches applied to RWD; however, most studies focus on unstructured RWD (conducting natural language processing on various data sources, eg, clinical notes, social media, and blogs). Hence, it is essential to investigate how AI is currently applied to structured RWD in pharmacovigilance and how new approaches could enrich the existing methodology.
