PhenoID, a language model normalizer of physical examinations from genetics clinical notes.

AI Article Synopsis

  • Accurate documentation of phenotypes in electronic health records (EHR) is crucial for genetic diagnosis, but variation in free-text reporting hinders computational analysis, and existing NLP methods are trained largely on biomedical literature or case vignettes rather than actual EHR data.
  • At the Children's Hospital of Philadelphia, the authors manually annotated a corpus of 3,136 organ system observations with Human Phenotype Ontology (HPO) terms and trained a transformer-based pipeline, PhenoID, to extract phenotypes from dysmorphology exam notes.
  • PhenoID scored 0.717 on the test set, ahead of the nearest baseline (Pheno-Tagger) at 0.633, highlighting the potential of transformer-based models for extracting genetic phenotypes. Error analysis also pointed to possible imperfections in the HPO terminology and a lack of semantic understanding by the models.

Article Abstract

Background: Phenotypes identified during dysmorphology physical examinations are critical to genetic diagnosis and nearly universally documented as free-text in the electronic health record (EHR). Variation in how phenotypes are recorded in free-text makes large-scale computational analysis extremely challenging. Existing natural language processing (NLP) approaches to address phenotype extraction are trained largely on the biomedical literature or on case vignettes rather than actual EHR data.

Methods: We implemented a tailored system at the Children's Hospital of Philadelphia that allows clinicians to document dysmorphology physical exam findings. From the underlying data, we manually annotated a corpus of 3136 organ system observations using the Human Phenotype Ontology (HPO). We provide this corpus publicly. We trained a transformer-based NLP system to identify HPO terms from exam observations. The pipeline includes an extractor, which identifies tokens in the sentence expected to contain an HPO term, and a normalizer, which uses those tokens together with the original observation to determine the specific term mentioned.
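The two-stage design described in the Methods (an extractor followed by a normalizer) can be illustrated with a minimal sketch. This is not the authors' implementation: plain string matching is swapped in for both transformer models, and the names `TOY_LEXICON`, `Finding`, `extract_spans`, `normalize`, and `run_pipeline` are hypothetical, as is the three-entry lexicon, which stands in for the full ontology.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional

# Hypothetical mini-lexicon: surface phrase -> (HPO ID, HPO label).
# Assumption: a real normalizer would choose among all HPO terms with a trained model.
TOY_LEXICON = {
    "hypertelorism": ("HP:0000316", "Hypertelorism"),
    "low-set ears": ("HP:0000369", "Low-set ears"),
    "microcephaly": ("HP:0000252", "Microcephaly"),
}


@dataclass
class Finding:
    span: str         # text the extractor flagged
    hpo_id: str       # normalized HPO identifier
    label: str        # HPO label for that identifier
    observation: str  # original observation, kept as context


def extract_spans(observation: str) -> list[str]:
    """Stage 1 (extractor stand-in): return phrases likely to contain an HPO term."""
    text = observation.lower()
    return [phrase for phrase in TOY_LEXICON if phrase in text]


def normalize(span: str, observation: str) -> Optional[Finding]:
    """Stage 2 (normalizer stand-in): map a span, with its observation, to one HPO term."""
    entry = TOY_LEXICON.get(span.lower())
    if entry is None:
        return None
    hpo_id, label = entry
    return Finding(span=span, hpo_id=hpo_id, label=label, observation=observation)


def run_pipeline(observation: str) -> list[Finding]:
    """Run the extractor, then normalize each extracted span."""
    findings = []
    for span in extract_spans(observation):
        finding = normalize(span, observation)
        if finding is not None:
            findings.append(finding)
    return findings


if __name__ == "__main__":
    for f in run_pipeline("Hypertelorism noted; low-set ears bilaterally."):
        print(f.hpo_id, f.label)
```

Keeping the stages separate means each one can be swapped out or evaluated independently, which is the property the sketch is meant to show.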

Findings: We find that our labeler and normalizer NLP pipeline, which we call PhenoID, achieves state-of-the-art performance for the dysmorphology physical exam phenotype extraction task. PhenoID's performance on the test set was 0.717, compared to the nearest baseline system (Pheno-Tagger) performance of 0.633. An analysis of our system's normalization errors shows possible imperfections in the HPO terminology itself but also reveals a lack of semantic understanding by our transformer models.
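To put the two reported scores side by side, the short calculation below computes the absolute and relative gap between them. The abstract does not name the metric, so the values are treated as plain numbers; the variable and function names are illustrative only.

```python
# Scores as reported in the abstract (metric not named there).
PHENOID_SCORE = 0.717
BASELINE_SCORE = 0.633  # nearest baseline system (Pheno-Tagger)


def score_gap(system: float, baseline: float) -> tuple[float, float]:
    """Return (absolute difference, difference relative to the baseline in percent)."""
    absolute = system - baseline
    relative_pct = 100.0 * absolute / baseline
    return round(absolute, 3), round(relative_pct, 1)


if __name__ == "__main__":
    absolute, relative_pct = score_gap(PHENOID_SCORE, BASELINE_SCORE)
    print(f"absolute gain: {absolute}, relative gain: {relative_pct}%")
```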

Interpretation: Transformer-based NLP models are a promising approach to genetic phenotype extraction and, with the recent development of larger pre-trained causal language models, may improve semantic understanding in the future. We believe our results also have direct applicability to more general extraction of medical signs and symptoms.

Funding: US National Institutes of Health.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10614999
DOI: http://dx.doi.org/10.1101/2023.10.16.23296894

Publication Analysis

Top Keywords (keyword: count)

dysmorphology physical: 12
phenotype extraction: 12
physical examinations: 8
physical exam: 8
semantic understanding: 8
phenoid language: 4
language model: 4
model normalizer: 4
physical: 4
normalizer physical: 4
