Large language models (LLMs) are rapidly being adopted in healthcare, necessitating standardized reporting guidelines. We present transparent reporting of a multivariable model for individual prognosis or diagnosis (TRIPOD)-LLM, an extension of the TRIPOD + artificial intelligence statement, addressing the unique challenges of LLMs in biomedical applications. TRIPOD-LLM provides a comprehensive checklist of 19 main items and 50 subitems, covering key aspects from title to discussion.
Objective: Intracranial aneurysms (IA) and aortic aneurysms (AA) are both abnormal dilations of arteries with familial predisposition and have been proposed to share co-prevalence and pathophysiology. Associations of IA and non-aortic peripheral aneurysms are less well-studied. The goal of the study was to understand the patterns of aortic and peripheral (extracranial) aneurysms in patients with IA, and risk factors associated with the development of these aneurysms.
J Am Med Inform Assoc
April 2024
Objective: Large language models (LLMs) have shown impressive ability in biomedical question-answering, but have not been adequately investigated for more specific biomedical applications. This study investigates the ChatGPT family of models (GPT-3.5, GPT-4) in biomedical tasks beyond question-answering.
Social determinants of health (SDoH) play a critical role in patient outcomes, yet their documentation is often missing or incomplete in the structured data of electronic health records (EHRs). Large language models (LLMs) could enable high-throughput extraction of SDoH from the EHR to support research and clinical care. However, class imbalance and data limitations present challenges for this sparsely documented yet critical information.
Purpose: Manual extraction of case details from patient records for cancer surveillance is a resource-intensive task. Natural Language Processing (NLP) techniques have been proposed for automating the identification of key details in clinical notes. Our goal was to develop NLP application programming interfaces (APIs) for integration into cancer registry data abstraction tools in a computer-assisted abstraction setting.
Proc Conf Assoc Comput Linguist Meet
July 2023
Understanding temporal relationships in text from electronic health records can be valuable for many important downstream clinical applications. Since Clinical TempEval 2017, there has been little work on end-to-end systems for temporal relation extraction, with most work focused on the setting where gold standard events and time expressions are given. In this work, we make use of a novel multi-headed attention mechanism on top of a pre-trained transformer encoder to allow the learning process to attend to multiple aspects of the contextualized embeddings.
Objective: The classification of clinical note sections is a critical step before doing more fine-grained natural language processing tasks such as social determinants of health extraction and temporal information extraction. Often, clinical note section classification models that achieve high accuracy for 1 institution experience a large drop in accuracy when transferred to another institution. The objective of this study is to develop methods that classify clinical note sections under the SOAP ("Subjective," "Objective," "Assessment," and "Plan") framework with improved transferability.
Purpose: Radiotherapy (RT) toxicities can impair survival and quality of life, yet remain understudied. Real-world evidence holds potential to improve our understanding of toxicities, but toxicity information is often only in clinical notes. We developed natural language processing (NLP) models to identify the presence and severity of esophagitis from notes of patients treated with thoracic RT.
Purpose: There is an unmet need to empirically explore and understand drivers of cancer disparities, particularly social determinants of health. We explored natural language processing methods to automatically and empirically extract clinical documentation of social contexts and needs that may underlie disparities.
Methods: This was a retrospective analysis of 230,325 clinical notes from 5,285 patients treated with radiotherapy from 2007 to 2019.
Objective: The manual extraction of case details from patient records for cancer surveillance efforts is a resource-intensive task. Natural Language Processing (NLP) techniques have been proposed for automating the identification of key details in clinical notes. Our goal was to develop NLP application programming interfaces (APIs) for integration into cancer registry data abstraction tools in a computer-assisted abstraction setting.
Objective: The classification of clinical note sections is a critical step before doing more fine-grained natural language processing tasks such as social determinants of health extraction and temporal information extraction. Often, clinical note section classification models that achieve high accuracy for one institution experience a large drop in accuracy when transferred to another institution. The objective of this study is to develop methods that classify clinical note sections under the SOAP ("Subjective", "Objective", "Assessment" and "Plan") framework with improved transferability.
Purpose: Real-world evidence for radiation therapy (RT) is limited because it is often documented only in the clinical narrative. We developed a natural language processing system for automated extraction of detailed RT events from text to support clinical phenotyping.
Methods And Materials: A multi-institutional data set of 96 clinician notes, 129 North American Association of Central Cancer Registries cancer abstracts, and 270 RT prescriptions from HemOnc.
Objectives: The pathogenesis of intracranial aneurysms is multifactorial and includes genetic, environmental, and anatomic influences. We aimed to identify image-based morphological parameters that were associated with middle cerebral artery (MCA) bifurcation aneurysms.
Materials And Methods: We evaluated three-dimensional morphological parameters obtained from CT angiography (CTA) or digital subtraction angiography (DSA) from 317 patients with unilateral MCA bifurcation aneurysms diagnosed at the Brigham and Women's Hospital and Massachusetts General Hospital between 1990 and 2016.
Background: The National Cancer Institute Informatics Technology for Cancer Research (ITCR) program provides a series of funding mechanisms to create an ecosystem of open-source software (OSS) that serves the needs of cancer research. As the ITCR ecosystem substantially grows, it faces the challenge of the long-term sustainability of the software being developed by ITCR grantees. To address this challenge, the ITCR sustainability and industry partnership working group (SIP-WG) was convened in 2019.
We present a cohort of patients with anterior communicating artery (ACoA) aneurysms to investigate morphological characteristics and clinical factors associated with rupture of the aneurysms. A total of 505 patients with ACoA aneurysms and available CT angiography (CTA) were identified at the Brigham and Women's Hospital and Massachusetts General Hospital between 1990 and 2016. Three-dimensional (3D) reconstructions were performed to evaluate aneurysmal morphologic features, including location, projection, irregularity, the presence of a daughter dome, height, height/width ratio, and relationships between surrounding vessels.
Int J Radiat Oncol Biol Phys
July 2021
Natural language processing (NLP), which aims to convert human language into expressions that can be analyzed by computers, is one of the most rapidly developing and widely used technologies in the field of artificial intelligence. Natural language processing algorithms convert unstructured free text data into structured data that can be extracted and analyzed at scale. In medicine, this unlocking of the rich, expressive data within clinical free text in electronic medical records will help tap the full potential of big data for research and clinical purposes.
Morphological factors of intracranial aneurysms and the surrounding vasculature could affect aneurysm rupture risk in a location-specific manner. Our goal was to identify image-based morphological parameters that correlated with ruptured basilar tip aneurysms. Three-dimensional morphological parameters obtained from CT angiography (CTA) or digital subtraction angiography (DSA) from 200 patients with basilar tip aneurysms diagnosed at the Brigham and Women's Hospital and Massachusetts General Hospital between 1990 and 2016 were evaluated.
Background: Hemodynamic stress, conditioned by the morphology of the surrounding vasculature, plays an important role in aneurysm formation. Our goal was to identify image-based location-specific parameters that are associated with posterior communicating artery (PCoA) aneurysms.
Methods: Three-dimensional morphological parameters obtained from CT angiography or digital subtraction angiography from 187 patients with unilateral PCoA aneurysms, diagnosed at the Brigham and Women's Hospital and Massachusetts General Hospital between 1990 and 2016, were evaluated.
Objective: To identify clinical and morphologic risk factors correlated with anterior communicating artery (ACoA) aneurysm formation.
Methods: Three-dimensional morphologic parameters obtained from computed tomography angiography or digital subtraction angiography from 504 patients with ACoA aneurysms and 201 patients with aneurysms in other locations that were diagnosed at Brigham and Women's Hospital and Massachusetts General Hospital between 1990 and 2016 were evaluated. The presence of hypoplastic and aplastic A1 segments and diameters and angles of surrounding parent and daughter vessels were examined.
Objective: To advance use of real-world data (RWD) for pharmacovigilance, we sought to integrate a high-sensitivity natural language processing (NLP) pipeline for detecting potential adverse drug events (ADEs) with easily interpretable output for high-efficiency human review and adjudication of true ADEs.
Materials And Methods: The adverse drug event presentation and tracking (ADEPT) system employs an open source NLP pipeline to identify, in clinical notes, mentions of medications and of signs and symptoms potentially indicative of ADEs. ADEPT presents the output to human reviewers by highlighting these drug-event pairs within the context of the clinical note.