Background and Objectives: Medical notes are narratives that describe the health of the patient in free-text format. These notes can be more informative than structured data such as the history of medications or disease conditions. They are routinely collected and can be used to evaluate the patient's risk of developing chronic diseases such as dementia. This study investigates different methodologies for transforming routine care notes into dementia risk classifiers and evaluates the generalizability of these classifiers to new patients and new health care institutions.
Methods: The notes collected over the relevant history of the patient are lengthy. In this study, TF-ICF is used to select the keywords that best discriminate between patients at risk of dementia and healthy controls. The medical notes are then summarized as occurrences of the selected keywords. Two different encodings of the summary are compared. The first encoding consists of the average of the vector embeddings of each keyword occurrence, as produced by the BERT or Clinical BERT pre-trained language models. The second encoding aggregates the keywords according to UMLS concepts and uses each concept as an exposure variable. For both encodings, misspellings of the selected keywords are also considered in an effort to improve the predictive performance of the classifiers. A neural network is developed over the first encoding and a gradient boosted trees model is applied to the second encoding. Patients from a single health care institution are used to develop all the classifiers, which are then evaluated on held-out patients from the same institution as well as on test patients from two other health care institutions.
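The keyword selection and summarization steps described above can be sketched as follows. This is a minimal illustration, not the study's implementation: the abstract does not give the TF-ICF formula, so the sketch assumes one common formulation (class-level term frequency multiplied by a smoothed inverse class frequency). The function names `tf_icf`, `top_keywords`, and `keyword_counts`, as well as the toy notes, are hypothetical.

```python
import math
from collections import Counter

def tf_icf(docs_by_class):
    """Score terms per class. docs_by_class maps a class label to a list of
    tokenized notes. The formula is an assumption, not taken from the study."""
    # Term counts pooled over all notes of each class.
    counts = {label: Counter(tok for doc in docs for tok in doc)
              for label, docs in docs_by_class.items()}
    n_classes = len(counts)
    # Class frequency: number of classes in which each term appears.
    class_freq = Counter(tok for cnt in counts.values() for tok in cnt)
    scores = {}
    for label, cnt in counts.items():
        total = sum(cnt.values())
        scores[label] = {
            tok: (n / total) * math.log(1 + n_classes / class_freq[tok])
            for tok, n in cnt.items()
        }
    return scores

def top_keywords(scores, label, k):
    """Return the k highest-scoring terms for a class (ties broken alphabetically)."""
    ranked = sorted(scores[label].items(), key=lambda kv: (-kv[1], kv[0]))
    return [tok for tok, _ in ranked[:k]]

def keyword_counts(note_tokens, keywords):
    """Summarize one note as the occurrence count of each selected keyword."""
    cnt = Counter(note_tokens)
    return [cnt[kw] for kw in keywords]

# Toy, made-up notes for illustration only.
cases = [["memory", "loss", "confusion"], ["memory", "decline", "fall"]]
controls = [["fall", "fracture"], ["cough", "fall"]]
scores = tf_icf({"case": cases, "control": controls})
keywords = top_keywords(scores, "case", 2)
summary = keyword_counts(["memory", "memory", "fall"], keywords)
```

Under this formulation, a term that appears in both classes (here "fall") is down-weighted relative to a term that appears in only one class, which is the discriminative behavior the keyword selection relies on.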
Results: The results indicate that it is possible to identify patients at risk of dementia one year ahead of the onset of the disease using medical notes, with an AUC of 75%, when a gradient boosted trees model is used in conjunction with exposure variables derived from UMLS concepts. However, this performance is not maintained with an embedded feature space or when the classifier is applied to patients from other health care institutions. Moreover, an analysis of the top predictors of the gradient boosted trees model indicates that different features inform the classification depending on whether or not spelling variants of the keywords are included.
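For readers less familiar with the metric, the AUC reported above can be read as the probability that a randomly chosen at-risk patient receives a higher score than a randomly chosen control. The sketch below computes it directly from that pairwise definition. The function name `roc_auc` and the toy labels and scores are hypothetical and are not data from the study.

```python
def roc_auc(labels, scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.
    labels: 1 for at-risk patients, 0 for controls; scores: classifier outputs."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    # Count pairs where the positive outranks the negative; ties count as half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: three of the four case/control pairs are ranked correctly.
toy_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so a value of 75% indicates a moderate ability to rank at-risk patients above controls.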
Conclusion: The present study demonstrates that medical notes can enable risk prediction models for complex chronic diseases such as dementia. However, additional research efforts are needed to improve the generalizability of these models. These efforts should take into consideration the length and localization of the medical notes; the availability of sufficient training data for each disease condition; and the variabilities resulting from different feature engineering techniques.
Download full-text PDF |
Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10068125 | PMC |
http://dx.doi.org/10.1016/j.heliyon.2023.e14636 | DOI Listing |
Trials
January 2025
Internal Medicine (Rheumatology), Academic Hospital, Istanbul, Turkey.
Background: It was our impression that safety outcome trials were getting more frequent, raising ethical issues mainly related to patient autonomy. We and others had also proposed this autonomy would be best served if wording of the informed consents would be in the public domain.
Methods: Initially two observers and an arbiter tabulated the main aims of randomized controlled trials (RCTs) published in 1990-1991 vs.
Metabolomics
January 2025
Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA.
Background: Gestational exposure to non-persistent endocrine-disrupting chemicals (EDCs) may be associated with adverse pregnancy outcomes. While many EDCs affect the endocrine system, their effects on endocrine-related metabolic pathways remain unclear. This study aims to explore the global metabolome changes associated with EDC biomarkers at delivery.
Heart Lung Circ
January 2025
Department of Cardiology, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China.
Aim: Regulatory T cells (Tregs) play a crucial role in the development and progression of atherosclerosis. However, the specific association between Treg immune traits and atherosclerosis and related cardiovascular diseases remains unclear, impeding their potential for clinical therapeutic application.
Method: Fifty-eight Treg-related immune traits were obtained from the latest summary level genome-wide association study, which included 3,757 individuals from Sardinia.
J AAPOS
January 2025
University of Health Sciences, Department of Ophthalmology, Başakşehir Cam and Sakura City Hospital, Istanbul, Turkey.
Purpose: To use swept-source optical coherence tomography angiography (SS-OCTA) to investigate the alterations in retinal vascular density (VD) in patients presenting with congenital unilateral trochlear nerve palsy.
Methods: The medical records of patients diagnosed with congenital unilateral trochlear nerve palsy and those of a healthy control group were reviewed retrospectively. Comprehensive ocular examinations and SS-OCTA imaging were conducted.
J Biomed Inform
January 2025
Harvard T.H. Chan School of Public Health, 677 Huntington Ave, Boston, 02115, MA, USA; VA Boston Healthcare System, 150 S Huntington Ave, Boston, 02130, MA, USA.
Objective: Electronic health record (EHR) systems contain a wealth of clinical data stored as both codified data and free-text narrative notes (NLP). The complexity of EHR presents challenges in feature representation, information extraction, and uncertainty quantification. To address these challenges, we proposed an efficient Aggregated naRrative Codified Health (ARCH) records analysis to generate a large-scale knowledge graph (KG) for a comprehensive set of EHR codified and narrative features.