Background: Large language models (LLMs) such as GPT-4 hold great promise as transformative tools in health care, ranging from automating administrative tasks to augmenting clinical decision making. However, these models also pose a danger of perpetuating biases and delivering incorrect medical diagnoses, which can have a direct, harmful impact on medical care. We aimed to assess whether GPT-4 encodes racial and gender biases that impact its use in health care.
J Am Med Inform Assoc
January 2024
Objective: Reflex testing protocols allow clinical laboratories to perform second line diagnostic tests on existing specimens based on the results of initially ordered tests. Reflex testing can support optimal clinical laboratory test ordering and diagnosis. In current clinical practice, reflex testing typically relies on simple "if-then" rules; however, this limits the opportunities for reflex testing since most test ordering decisions involve more complexity than traditional rule-based approaches would allow.
Despite recent methodology advancements in clinical natural language processing (NLP), the adoption of clinical NLP models within the translational research community remains hindered by process heterogeneity and human factor variations. Concurrently, these factors also dramatically increase the difficulty in developing NLP models in multi-site settings, which is necessary for algorithm robustness and generalizability. Here, we report on our experience developing an NLP solution for Coronavirus Disease 2019 (COVID-19) signs and symptom extraction in an open NLP framework from a subset of sites participating in the National COVID Cohort Collaborative (N3C).
Objective: Reflex testing protocols allow clinical laboratories to perform second line diagnostic tests on existing specimens based on the results of initially ordered tests. Reflex testing can support optimal clinical laboratory test ordering and diagnosis. In current clinical practice, reflex testing typically relies on simple "if-then" rules; however, this limits their scope since most test ordering decisions involve more complexity than a simple rule will allow.
Clinical artificial intelligence (AI)/machine learning (ML) is anticipated to offer new abilities in clinical decision support, diagnostic reasoning, precision medicine, clinical operational support, and clinical research, but careful concern is needed to ensure these technologies work effectively in the clinic. Here, we detail the clinical ML/AI design process, identifying several key questions and detailing several common forms of issues that arise with ML tools, as motivated by real-world examples, such that clinicians and researchers can better anticipate and correct for such issues in their own use of ML/AI techniques.
Background: Many U.S. institutions have adopted postsurgical opioid-prescribing guidelines to standardize prescribing practices, and yet there is inherent variability in patients' opioid consumption after surgery.
The increasing availability of large collections of electronic health record (EHR) data and unprecedented technical advances in deep learning (DL) have sparked a surge of research interest in developing DL based clinical decision support systems for diagnosis, prognosis, and treatment. Despite the recognition of the value of deep learning in healthcare, impediments to further adoption in real healthcare settings remain due to the black-box nature of DL. Therefore, there is an emerging need for interpretable DL, which allows end users to evaluate the model decision making to know whether to accept or reject predictions and recommendations before an action is taken.
Objective: Large amounts of health data are becoming available for biomedical research. Synthesizing information across databases may capture more comprehensive pictures of patient health and enable novel research studies. When no gold standard mappings between patient records are available, researchers may probabilistically link records from separate databases and analyze the linked data.
Background: Obesity has been linked to severe clinical outcomes among people who are hospitalized with coronavirus disease 2019 (COVID-19). We tested the hypothesis that visceral adipose tissue (VAT) is associated with severe outcomes in patients hospitalized with COVID-19, independent of body mass index (BMI).
Methods: We analyzed data from the Massachusetts General Hospital COVID-19 Data Registry, which included patients admitted with polymerase chain reaction-confirmed severe acute respiratory syndrome coronavirus 2 infection from March 11 to May 4, 2020.
Background: Body composition is associated with mortality; however, its routine assessment is too time-consuming.
Purpose: To demonstrate the value of artificial intelligence (AI) to extract body composition measures from routine studies, we aimed to develop a fully automated AI approach to measure fat and muscle masses, to validate its clinical discriminatory value, and to provide the code, training data and workflow solutions to facilitate its integration into local practice.
Methods: We developed a neural network that quantified the tissue components at the L3 vertebral body level using data from the Liver Tumor Challenge (LiTS) and a pancreatic cancer cohort.
Med Image Comput Comput Assist Interv
October 2020
We propose and demonstrate a novel machine learning algorithm that assesses pulmonary edema severity from chest radiographs. While large publicly available datasets of chest radiographs and free-text radiology reports exist, only limited numerical edema severity labels can be extracted from radiology reports. This is a significant challenge in learning such models for image classification.
Machine learning has been suggested as a means of identifying individuals at greatest risk for hospital readmission, including psychiatric readmission. We sought to compare the performance of predictive models that use interpretable representations derived via topic modeling to the performance of human experts and nonexperts. We examined all 5076 admissions to a general psychiatry inpatient unit between 2009 and 2016 using electronic health records.
The promise of precision medicine lies in data diversity. More than the sheer size of biomedical data, it is the layering of multiple data modalities, offering complementary perspectives, that is thought to enable the identification of patient subgroups with shared pathophysiology. In the present study, we use autism to test this notion.
Background: Tumor response to therapy is often assessed by measuring change in liver lesion size between consecutive MRIs. However, these evaluations are both tedious and time-consuming for clinical radiologists.
Purpose: In this study, we sought to develop a convolutional neural network to detect liver metastases on MRI and applied this algorithm to assess change in tumor size on consecutive examinations.
Bioinformatics
June 2020
Motivation: In evidence-based medicine, defining a clinical question in terms of the specific patient problem helps physicians efficiently identify appropriate resources and search for the best available evidence for medical treatment. In order to formulate a well-defined, focused clinical question, a framework called PICO is widely used, which identifies the sentences in a given medical text that belong to the four components typically reported in clinical trials: Participants/Problem (P), Intervention (I), Comparison (C) and Outcome (O). In this work, we propose a novel deep learning model for recognizing PICO elements in biomedical abstracts.
Phenotypes are the foundation for clinical and genetic studies of disease risk and outcomes. The growth of biobanks linked to electronic medical record (EMR) data has both facilitated and increased the demand for efficient, accurate, and robust approaches for phenotyping millions of patients. Challenges to phenotyping with EMR data include variation in the accuracy of codes, as well as the high level of manual input required to identify features for the algorithm and to obtain gold standard labels.
J Am Med Inform Assoc
November 2019
Objective: Electronic health records linked with biorepositories are a powerful platform for translational studies. A major bottleneck exists in the ability to phenotype patients accurately and efficiently. The objective of this study was to develop an automated high-throughput phenotyping method integrating International Classification of Diseases (ICD) codes and narrative data extracted using natural language processing (NLP).
The onset of acute kidney injury (AKI) during an intensive care unit (ICU) admission is associated with increased morbidity and mortality. Developing novel methods to identify early AKI onset is of critical importance in preventing or reducing AKI complications. We built and applied multiple machine learning models to integrate clinical notes and structured physiological measurements and estimate the risk of new AKI onset using the MIMIC-III database.
IEEE/ACM Trans Comput Biol Bioinform
December 2021
Gene expression data can offer deep, physiological insights beyond the static coding of the genome alone. We believe that realizing this potential requires specialized, high-capacity machine learning methods capable of using underlying biological structure, but the development of such models is hampered by the lack of published benchmark tasks and well characterized baselines. In this work, we establish such benchmarks and baselines by profiling many classifiers against biologically motivated tasks on two curated views of a large, public gene expression dataset (the LINCS corpus) and one privately produced dataset.
Background: We re-analyzed data from the Systolic Blood Pressure Intervention Trial (SPRINT) to identify features of systolic blood pressure (SBP) variability that portend poor cardiovascular outcomes using a nonlinear machine-learning algorithm.
Methods: We included all patients who completed 1 year of the study without reaching any primary endpoint during the first year, specifically: myocardial infarction, other acute coronary syndromes, stroke, heart failure or death from a cardiovascular event (n = 8799; 94%). In addition to clinical variables, features representing longitudinal SBP trends and variability were determined and combined in a random forest algorithm, optimized using cross-validation, using 70% of patients in the training set.
AMA J Ethics
February 2019
Background: As machine learning becomes increasingly common in health care applications, concerns have been raised about bias in these systems' data, algorithms, and recommendations. Simply put, as health care improves for some, it might not improve for all.
Methods: Two case studies are examined using a machine learning algorithm on unstructured clinical and psychiatric notes to predict intensive care unit (ICU) mortality and 30-day psychiatric readmission with respect to race, gender, and insurance payer type as a proxy for socioeconomic status.
We develop an algorithm for probabilistic linkage of de-identified research datasets at the patient level, when only diagnosis codes with discrepancies and no personal health identifiers such as name or date of birth are available. It relies on Bayesian modelling of binarized diagnosis codes, and provides a posterior probability of matching for each patient pair, while considering all the data at once. Both in our simulation study (using an administrative claims dataset for data generation) and in two real use-cases linking patient electronic health records from a large tertiary care network, our method exhibits good performance and compares favourably to the standard baseline Fellegi-Sunter algorithm.
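To make the idea of a per-pair posterior matching probability concrete, here is a minimal sketch of the Fellegi-Sunter-style baseline mentioned above, applied to binarized diagnosis codes. It is not the authors' Bayesian method: it scores each patient pair independently rather than considering all the data at once, and it assumes codes are conditionally independent. The function name `match_posterior` and the parameters `m`, `u`, and `prior` are hypothetical names chosen for illustration.

```python
import math


def match_posterior(codes_a, codes_b, m, u, prior):
    """Posterior probability that two records belong to the same patient.

    codes_a, codes_b: equal-length sequences of 0/1 diagnosis-code indicators.
    m[k]: assumed P(records agree on code k | same patient).
    u[k]: assumed P(records agree on code k | different patients).
    prior: assumed prior probability that a random pair is a true match.
    All probabilities must lie strictly between 0 and 1.
    """
    # Start from the prior log-odds of a match.
    log_odds = math.log(prior / (1.0 - prior))
    for a, b, mk, uk in zip(codes_a, codes_b, m, u):
        if a == b:
            # Agreement on this code adds the agreement weight.
            log_odds += math.log(mk / uk)
        else:
            # Disagreement adds the (usually negative) disagreement weight.
            log_odds += math.log((1.0 - mk) / (1.0 - uk))
    # Convert log-odds back to a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))


# Illustrative values only; they are not taken from the paper.
m = [0.9, 0.9, 0.9]
u = [0.5, 0.5, 0.5]
p_agree = match_posterior([1, 0, 1], [1, 0, 1], m, u, prior=0.5)
p_disagree = match_posterior([1, 0, 1], [0, 1, 0], m, u, prior=0.5)
```

With these made-up parameters, a pair that agrees on every code receives a posterior above the prior and a pair that disagrees on every code receives one below it. In practice `m` and `u` would have to be estimated from data, for example with an EM procedure.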
Proceedings (IEEE Int Conf Bioinformatics Biomed)
December 2018
This paper presents a Lisp architecture for a portable NLP system, termed LAPNLP, for processing clinical notes. LAPNLP integrates multiple standard, customized and in-house developed NLP tools. Our system facilitates portability across different institutions and data systems by incorporating an enriched Common Data Model (CDM) to standardize necessary data elements.
AMIA Jt Summits Transl Sci Proc
May 2018
Electronic Health Records (EHRs) have seen a rapid increase in adoption during the last decade. The narrative prose contained in clinical notes is unstructured and unlocking its full potential has proved challenging. Many studies incorporating clinical notes have applied simple information extraction models to build representations that enhance a downstream clinical prediction task, such as mortality or readmission.