J Am Med Inform Assoc
December 2021
Objective: Clinical registries (structured databases of demographic, diagnosis, and treatment information) play vital roles in retrospective studies, operational planning, and assessment of patient eligibility for research, including clinical trials. Registry curation, a manual and time-intensive process, is always costly and often impossible for rare or underfunded diseases. Our goal was to evaluate the feasibility of natural language inference (NLI) as a scalable solution for registry curation.
Electronic health records (EHRs) are becoming a vital source of data for healthcare quality improvement, research, and operations. However, much of the most valuable information contained in EHRs remains buried in unstructured text. The field of clinical text mining has advanced rapidly in recent years, transitioning from rule-based approaches to machine learning and, more recently, deep learning.
Acute hypoxemic respiratory failure is the major complication of coronavirus disease 2019, yet optimal respiratory support strategies are uncertain. We aimed to describe outcomes with high-flow oxygen delivered through nasal cannula and noninvasive positive pressure ventilation in coronavirus disease 2019 acute hypoxemic respiratory failure and identify individual factors associated with noninvasive respiratory support failure.
Design: Retrospective cohort study to describe rates of high-flow oxygen delivered through nasal cannula and/or noninvasive positive pressure ventilation success (live discharge without endotracheal intubation).
Purpose: Rare genetic conditions like Down syndrome (DS) are historically understudied. Infection is a leading cause of mortality in DS, along with cardiac anomalies. Currently, it is unknown how the COVID-19 pandemic affects individuals with DS.
Sleep quality has been directly linked to cognitive function, quality of life, and a variety of serious diseases across many clinical domains. Standard methods for assessing sleep involve overnight studies in hospital settings, which are uncomfortable, expensive, not representative of real sleep, and difficult to conduct on a large scale. Recently, numerous commercial digital devices have been developed that record physiological data, such as movement, heart rate, and respiratory rate, which can act as a proxy for sleep quality in lieu of standard electroencephalogram recording equipment.
Background: Acute and chronic low back pain (LBP) are different conditions with different treatments. However, they are coded in electronic health records with the same International Classification of Diseases, 10th revision (ICD-10) code (M54.5) and can be differentiated only by retrospective chart reviews.
Background: N-of-1 trials promise to help individuals make more informed decisions about treatment selection through structured experiments that compare treatment effectiveness by alternating treatments and measuring their impacts in a single individual. We created a digital platform that automates the design, administration, and analysis of N-of-1 trials. Our first N-of-1 trial, the app-based Brain Boost Study, invited individuals to compare the impacts of two commonly consumed substances (caffeine and L-theanine) on their cognitive performance.
Background: Recent advances in molecular biology, sensors, and digital medicine have led to an explosion of products and services for high-resolution monitoring of individual health. The N-of-1 study has emerged as an important methodological tool for harnessing these new data sources, enabling researchers to compare the effectiveness of health interventions at the level of a single individual.
Objective: N-of-1 studies are susceptible to several design flaws.
Motivation: The biomedical community's collective understanding of how chemicals, genes and phenotypes interact is distributed across the text of over 24 million research articles. These interactions offer insights into the mechanisms behind higher order biochemical phenomena, such as drug-drug interactions and variations in drug response across individuals. To assist their curation at scale, we must understand what relationship types are possible and map unstructured natural language descriptions onto these structured classes.
J Am Med Inform Assoc
June 2018
Objective: Distributional semantics algorithms, which learn vector space representations of words and phrases from large corpora, identify related terms based on contextual usage patterns. We hypothesize that distributional semantics can speed up lexicon expansion in a clinical domain, radiology, by unearthing synonyms from the corpus.
Materials And Methods: We apply word2vec, a distributional semantics software package, to the text of radiology notes to identify synonyms for RadLex, a structured lexicon of radiology terms.
PLoS Comput Biol
July 2015
The published biomedical research literature encompasses most of our understanding of how drugs interact with gene products to produce physiological responses (phenotypes). Unfortunately, this information is distributed throughout the unstructured text of over 23 million articles. The creation of structured resources that catalog the relationships between drugs and genes would accelerate the translation of basic molecular knowledge into discoveries of genomic biomarkers for drug response and prediction of unexpected drug-drug interactions.
J Am Med Inform Assoc
January 2015
Objective: The trade-off between the speed and simplicity of dictionary-based term recognition and the richer linguistic information provided by more advanced natural language processing (NLP) is an area of active discussion in clinical informatics. In this paper, we quantify this trade-off among text processing systems that make different trade-offs between speed and linguistic understanding. We tested both types of systems in three clinical research tasks: phase IV safety profiling of a drug, learning adverse drug-drug interactions, and learning used-to-treat relationships between drugs and indications.
The anticoagulant warfarin has >30 million prescriptions per year in the United States. Doses can vary 20-fold between patients, and incorrect dosing can result in serious adverse events. Variation in warfarin pharmacokinetic and pharmacodynamic genes, such as CYP2C9 and VKORC1, does not fully explain the dose variability in African Americans.
The biomedical literature presents a uniquely challenging text mining problem. Sentences are long and complex, the subject matter is highly specialized with a distinct vocabulary, and producing annotated training data for this domain is time consuming and expensive. In this environment, unsupervised text mining methods that do not rely on annotated training data are valuable.
Objective: Mental illness is the leading cause of disability in the USA, but boundaries between different mental illnesses are notoriously difficult to define. Electronic medical records (EMRs) have recently emerged as a powerful new source of information for defining the phenotypic signatures of specific diseases. We investigated how EMR-based text mining and statistical analysis could elucidate the phenotypic boundaries of three important neuropsychiatric illnesses: autism, bipolar disorder, and schizophrenia.
Health-care associated infections are a major problem in our society, accounting for tens of thousands of patient deaths and millions of dollars in wasted health care expenditures each year. Many of these infections are caused by bacteria that are transmitted from patient to patient either through direct contact or via the hands or clothing of health care workers. Because of the complexity of bacterial transmission routes in health care settings, computational approaches are essential, though often analytically intractable.
Drug-drug interactions (DDIs) are an emerging threat to public health. Recent estimates indicate that DDIs cause nearly 74000 emergency room visits and 195000 hospitalizations each year in the USA. Current approaches to DDI discovery, which include Phase IV clinical trials and post-marketing surveillance, are insufficient for detecting many DDIs and do not alert the public to potentially dangerous DDIs before a drug enters the market.
Because breast tissue composition partially predicts breast cancer risk, classification of mammography reports by breast tissue composition is important from both a scientific and clinical perspective. A method is presented for using the unstructured text of mammography reports to classify them into BI-RADS breast tissue composition categories. An algorithm that uses regular expressions to automatically determine BI-RADS breast tissue composition classes for unstructured mammography reports was developed.
Pac Symp Biocomput
December 2013
Drug-drug interactions (DDIs) can occur when two drugs interact with the same gene product. Most available information about gene-drug relationships is contained within the scientific literature, but is dispersed over a large number of publications, with thousands of new publications added each month. In this setting, automated text mining is an attractive solution for identifying gene-drug relationships and aggregating them to predict novel DDIs.
The evolution of antibiotic resistance (AR) increases treatment cost and probability of failure, threatening human health worldwide. The relative importance of individual antibiotic use, environmental transmission and rates of introduction of resistant bacteria in explaining community AR patterns is poorly understood. Evaluating their relative importance requires studying a region where they vary.
Group B Streptococcus (GBS) remains a major cause of neonatal sepsis and is an emerging cause of invasive bacterial infections. The 9 known serotypes vary in virulence, and there is little cross-immunity. Key parameters for planning an effective vaccination strategy, such as average length of immunity and transmission probabilities by serotype, are unknown.