Electronic health records (EHR) are collected as a routine part of healthcare delivery and have great potential to improve patient health outcomes. They contain multiple years of health information that can be leveraged for risk prediction, disease detection, and treatment evaluation. However, they lack a consistent, standardized format across institutions, particularly in the United States, and present significant analytical challenges: they contain multi-scale data from heterogeneous domains and include both structured and unstructured data. Data for individual patients are collected at irregular time intervals and with varying frequencies. In addition to these analytical challenges, EHR can reflect inequity: patients belonging to different groups will have differing amounts of data in their health records. Many of these issues can contribute to biased data collection. As a consequence, the data for under-served groups may be less informative, partly due to more fragmented care, which can be viewed as a type of missing data problem. For EHR data in this complex form, there is currently no framework for introducing realistic missing values, and there has been little to no work assessing the impact of missing data in EHR. In this work, we first introduce terminology to define three levels of EHR data and then propose a novel framework for simulating realistic missing data scenarios in EHR to adequately assess their impact on predictive modeling. We use a medical knowledge graph to capture dependencies between medical events, which makes the simulated missingness more realistic. In an intensive care unit setting, we found that missing data have a greater negative impact on the performance of disease prediction models in groups that tend to have less access to healthcare or seek less healthcare. We also found that the impact of missing data on disease prediction models is stronger when using the knowledge graph framework to introduce realistic missing values as opposed to random event removal.
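The contrast between random event removal and graph-informed removal can be illustrated with a toy sketch. This is not the authors' implementation: the event names, the `toy_graph` dependency map, and both function names are hypothetical, and a real medical knowledge graph would be far larger and weighted. The sketch only shows the general idea that related events tend to go missing together when removal follows graph dependencies.

```python
import random

# Hypothetical toy record: two clusters of clinically related events.
events = [
    "diabetes_dx", "hba1c_test", "metformin_rx",
    "pneumonia_dx", "chest_xray", "amoxicillin_rx",
]

# Hypothetical dependency map standing in for a medical knowledge graph:
# each event points to the events it is assumed to be related to.
toy_graph = {
    "diabetes_dx": {"hba1c_test", "metformin_rx"},
    "hba1c_test": {"diabetes_dx", "metformin_rx"},
    "metformin_rx": {"diabetes_dx", "hba1c_test"},
    "pneumonia_dx": {"chest_xray", "amoxicillin_rx"},
    "chest_xray": {"pneumonia_dx", "amoxicillin_rx"},
    "amoxicillin_rx": {"pneumonia_dx", "chest_xray"},
}


def remove_random(events, frac, rng):
    """Baseline: drop a fraction of events uniformly at random."""
    n_drop = round(len(events) * frac)
    drop = set(rng.sample(range(len(events)), n_drop))
    return [e for i, e in enumerate(events) if i not in drop]


def remove_with_graph(events, frac, graph, rng):
    """Drop randomly chosen seed events together with their graph neighbors.

    Removal stops once the target number of events has been dropped, so the
    overall missing rate matches the random baseline.
    """
    n_drop = round(len(events) * frac)
    order = list(range(len(events)))
    rng.shuffle(order)
    drop = set()
    for i in order:
        if len(drop) >= n_drop:
            break
        drop.add(i)  # the seed event itself
        related = graph.get(events[i], set())
        for j, other in enumerate(events):
            if len(drop) >= n_drop:
                break
            if other in related:
                drop.add(j)  # events that depend on the seed
    return [e for i, e in enumerate(events) if i not in drop]


rng = random.Random(0)
print(remove_random(events, 0.5, rng))
print(remove_with_graph(events, 0.5, toy_graph, rng))
```

At the same 50% missing rate, the random baseline scatters removals across both clusters, while the graph-informed version removes one whole cluster of related events, which resembles a missed episode of care more than independent dropouts do.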
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10391553 | PMC |
| http://dx.doi.org/10.1016/j.jbi.2022.104269 | DOI Listing |
Adv Appl Bioinform Chem
January 2025
Department of Information Technology, Mutah University, Al-Karak, Jordan.
Purpose: The incidence of cancer, which is a serious public health concern, is increasing. A predictive analysis driven by machine learning was integrated with haematology parameters to create a method for the simultaneous diagnosis of several malignancies at different stages.
Patients And Methods: We analysed a newly collected dataset from various hospitals in Jordan comprising 19,537 laboratory reports (6,280 cancer and 13,257 noncancer cases).
Open Forum Infect Dis
January 2025
Harvard Medical School, Boston, Massachusetts, USA.
Background: Infections by Streptococcus pneumoniae and influenza viruses are vaccine-preventable diseases causing great morbidity and mortality. We evaluated pneumococcal and influenza vaccination practices during pre-international travel health consultations.
Methods: We evaluated data on pretravel visits over a 10-year period (1 July 2012 through 31 June 2022) from 31 sites in Global TravEpiNet (GTEN), a consortium of US healthcare facilities providing pretravel health consultations.
Front Antibiot
March 2024
Clinica Malattie Infettive, IRCCS Ospedale Policlinico San Martino, Genoa, Italy.
Antimicrobial resistance in bacteria has been associated with significant morbidity and mortality in hospitalized patients. In the era of big data and of the consequent frequent need for large study populations, manual collection of data for research studies on antimicrobial resistance and antibiotic use has become extremely time-consuming and sometimes impossible to be accomplished by overwhelmed healthcare personnel. In this review, we discuss relevant concepts pertaining to the automated extraction of antibiotic resistance and antibiotic prescription data from laboratory information systems and electronic health records to be used in clinical studies, starting from the currently available literature on the topic.
Transl Androl Urol
December 2024
Department of Urology and Andrology Laboratory, West China Hospital, Sichuan University, Chengdu, China.
Background: The global prevalence of lower urinary tract symptoms suggestive of benign prostate hyperplasia (LUTS/BPH) is escalating, with obesity recognized as a major contributing factor. However, the association between the relative fat mass (RFM) and LUTS/BPH remains unexplored. This 7-year follow-up study aimed to investigate the cross-sectional and longitudinal relationships between RFM and LUTS/BPH.
Narra J
December 2024
Master Program in Smart Healthcare Management (SHM), International College of Sustainability Innovations, National Taipei University, New Taipei City, Taiwan.
Cognitive decline poses a significant challenge for the elderly population globally. The aim of this study was to determine the prevalence of cognitive function and its associated factors among the elderly in the Indonesian family life survey's fifth wave (IFLS-5) conducted from 2014 to 2015. The study included elderly individuals aged 60 and above, excluding proxy respondents and those with missing data.