Publications by authors named "Michael Steinbach"

Objective: Hospital-acquired infections (HAIs) are associated with significant morbidity, mortality, and prolonged hospital length of stay. Risk prediction models based on pre- and intraoperative data have been proposed to assess the risk of HAIs at the end of surgery, but the performance of these models lags behind that of HAI detection models based on postoperative data. Postoperative data are more predictive than pre- or intraoperative data because they are closer in time to the outcomes, but they are unavailable when the risk models are applied (at the end of surgery).

Objective: The association between body mass index (BMI) and all-cause mortality is controversial and is frequently referred to as a paradox. Whether the cause is metabolic factors or statistical biases remains unresolved. We assessed the association between BMI and all-cause mortality, considering a wide range of comorbidities and baseline mortality risk.

Diseases can show different courses of progression even when patients share the same risk factors. Recent studies have revealed that the use of trajectories, the order in which diseases manifest throughout life, can be predictive of the course of progression. In this study, we propose a novel computational method for learning disease trajectories from EHR data.

Our aging population increasingly suffers from multiple chronic diseases simultaneously, necessitating the comprehensive treatment of these conditions. Finding the optimal set of drugs for a combinatorial set of diseases is a combinatorial pattern exploration problem. Association rule mining is a popular tool for such problems, but the requirement of health care for finding causal, rather than associative, patterns renders association rule mining unsuitable.

Importance: Clinical domain knowledge about diseases and their comorbidities, severity, treatment pathways, and outcomes can facilitate diagnosis, enhance preventive strategies, and help create smart evidence-based practice guidelines.

Objective: To introduce a new representation of patient data called disease severity hierarchy that leverages domain knowledge in a nested fashion to create subpopulations that share increasing amounts of clinical details suitable for risk prediction.

Design, Setting, and Participants: This retrospective cohort study included 51 969 patients aged 45 to 85 years, with 10 674 patients who received primary care at the Mayo Clinic between January 2004 and December 2015 in the training cohort and 41 295 patients who received primary care at Fairview Health Services from January 2010 to December 2017 in the validation cohort.

Background: The ubiquity of electronic health records (EHR) offers an opportunity to observe trajectories of laboratory results and vital signs over long periods of time. This study assessed the value of risk factor trajectories available in the electronic health record to predict incident type 2 diabetes.

Study Design and Methods: Analysis was based on a large 13-year retrospective cohort of 71,545 adult, non-diabetic patients with a baseline in 2005 and a median follow-up time of 8 years.

Genetic interactions have been reported to underlie phenotypes in a variety of systems, but the extent to which they contribute to complex disease in humans remains unclear. In principle, genome-wide association studies (GWAS) provide a platform for detecting genetic interactions, but existing methods for identifying them from GWAS data tend to focus on testing individual locus pairs, which undermines statistical power. Importantly, a global genetic network mapped for a model eukaryotic organism revealed that genetic interactions often connect genes between compensatory functional modules in a highly coherent manner.

Different analytic techniques operate optimally with different types of data. As the use of EHR-based analytics expands to newer tasks, data will have to be transformed into different representations, so the tasks can be optimally solved. We classified representations into broad categories based on their characteristics, and proposed a new knowledge-driven representation for clinical data mining as well as trajectory mining, called Severity Encoding Variables (SEVs).

The ability to assess data quality is essential for secondary use of EHR data, and an automated Healthcare Data Quality Framework (HDQF) can be used as a tool to support a healthcare organization's data quality initiatives. Use of a general-purpose HDQF provides a method to assess and visualize data quality to quickly identify areas for improvement. The value of the approach is illustrated for two analytics use cases: 1) predictive models and 2) clinical quality measures.

In recent years, the emerging field of computational psychiatry has impelled the use of machine learning models as a means to further understand the pathogenesis of multiple clinical disorders. In this paper, we discuss how autism spectrum disorder (ASD) was and continues to be diagnosed in the context of its complex neurodevelopmental heterogeneity. We review machine learning approaches to streamline ASD's diagnostic methods, to discern similarities and differences from comorbid diagnoses, and to follow developmentally variable outcomes.

The true onset time of a disease, particularly slow-onset diseases like Type 2 diabetes mellitus (T2DM), is rarely observable in electronic health records (EHRs). However, it is critical for analysis of time to events and for studying sequences of diseases. The aim of this study is to demonstrate a method for estimating the onset time of such diseases from intermittently observable laboratory results in the specific context of T2DM.

Objectives: To specify when delays of specific 3-hour bundle Surviving Sepsis Campaign guideline recommendations applied to severe sepsis or septic shock become harmful and impact mortality.

Design: Retrospective cohort study.

Setting: One health system composed of six hospitals and 45 clinics in a Midwest state from January 01, 2011, to July 31, 2015.

Unlabelled: To conduct an independent secondary analysis of a multi-focal intervention for early detection of sepsis that included implementation of change management strategies, electronic surveillance for sepsis, and evidence-based point-of-care alerting using the POC Advisor™ application.

Methods: Propensity score matching was used to select subsets of the cohorts with balanced covariates. Bootstrapping was performed to build distributions of the measured difference in rates/means.
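The bootstrapping step can be illustrated with a minimal sketch. The abstract does not give the study's exact procedure, so this assumes a simple percentile bootstrap applied to two already-matched cohorts; the function names, the resample count, and the outcome data are all hypothetical and made up for illustration.

```python
import random
from statistics import mean


def bootstrap_mean_difference(group_a, group_b, n_resamples=2000, seed=0):
    """Return sorted bootstrap replicates of mean(group_a) - mean(group_b).

    Each replicate resamples both groups with replacement at their
    original sizes. (Assumed procedure, not taken from the study.)
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_resamples):
        resample_a = rng.choices(group_a, k=len(group_a))
        resample_b = rng.choices(group_b, k=len(group_b))
        diffs.append(mean(resample_a) - mean(resample_b))
    diffs.sort()
    return diffs


def percentile_interval(sorted_values, alpha=0.05):
    """Return the (alpha/2, 1 - alpha/2) percentile interval of sorted values."""
    n = len(sorted_values)
    lower = sorted_values[int((alpha / 2) * n)]
    upper = sorted_values[min(n - 1, int((1 - alpha / 2) * n))]
    return lower, upper


# Made-up binary outcomes (1 = event, 0 = no event) for two matched cohorts.
intervention = [0] * 40 + [1] * 10   # observed event rate 0.20
comparison = [0] * 30 + [1] * 20     # observed event rate 0.40

diffs = bootstrap_mean_difference(intervention, comparison)
low, high = percentile_interval(diffs)
print(f"observed difference: {mean(intervention) - mean(comparison):.2f}")
print(f"95% percentile interval: [{low:.3f}, {high:.3f}]")
```

Because the outcomes are coded 0/1, the mean of each group is its event rate, so the same function covers both the "rates" and "means" cases mentioned above.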

Sepsis incidents doubled from 2000 through 2008, and hospitalizations for these diagnoses increased by 70%. The use of the Surviving Sepsis Campaign (SSC) guidelines can lead to earlier diagnosis and treatment; however, the effectiveness of the SSC guidelines in preventing complications for this population is unclear. The overall purpose of this study was to apply SSC guideline recommendations to EHR data for patients with severe sepsis or septic shock and determine guideline compliance as well as its impact on inpatient mortality and sepsis complications.

Disease progression models, statistical models that assess a patient's risk of diabetes progression, are popular tools in clinical practice for prevention and management of chronic conditions. Most, if not all, models currently in use are based on gold-standard clinical trial data. The relatively small sample sizes available from clinical trials limit these models to considering only the patient's state at the time of the assessment, ignoring the trajectory, the sequence of events, that led up to that state.

Pulmonary complications due to infection and idiopathic pneumonia syndrome (IPS), a noninfectious lung injury in hematopoietic stem cell transplant (HSCT) recipients, are frequent causes of transplantation-related mortality and morbidity. Our objective was to characterize the global bronchoalveolar lavage fluid (BALF) protein expression of IPS to identify proteins and pathways that differentiate IPS from infectious lung injury after HSCT. We studied 30 BALF samples from patients who developed lung injury within 180 days of HSCT or cellular therapy transfusion (natural killer cell transfusion).

Background: Mobility is critical for self-management. Understanding factors associated with improvement in mobility during home healthcare can help nurses tailor interventions to improve mobility outcomes and keep patients safely at home.

Objectives: The aims were to (a) identify patient and support system factors associated with mobility improvement during home care, (b) evaluate consistency of factors across groups defined by mobility status at the start of home care, and (c) identify patterns of factors associated with improvement and no improvement in mobility within each group.

Non-small cell lung cancers (NSCLC) harbor thousands of passenger events that hide genetic drivers. Even highly recurrent events in NSCLC, such as mutations in PTEN, EGFR, KRAS, and ALK, are detected, at most, in only 30% of patients. Thus, many unidentified low-penetrant events are causing a significant portion of lung cancers.

In this manuscript, we present connectivity cluster analysis (CoCA), a novel computational framework that takes advantage of the structure of brain networks to magnify reproducible signals and quash noise. Resting-state functional magnetic resonance imaging (fMRI) data used in estimating functional brain networks are often noisy, leading to reduced power and inconsistent findings across independent studies. There is a need for techniques that can unearth signals in noisy datasets while addressing redundancy in the functional connections that are used for testing association.

Acute Respiratory Distress Syndrome (ARDS) continues to have a high mortality. Currently, there are no biomarkers that provide reliable prognostic information to guide clinical management or stratify risk among clinical trial participants. The objective of this study was to probe the bronchoalveolar lavage fluid (BALF) proteome to identify proteins that differentiate survivors from non-survivors of ARDS.

Histiocytic sarcoma is a rare, aggressive neoplasm that responds poorly to therapy. Histiocytic sarcoma is thought to arise from macrophage precursor cells via genetic changes that are largely undefined. To improve our understanding of the etiology of histiocytic sarcoma we conducted a forward genetic screen in mice using the Sleeping Beauty transposon as a mutagen to identify genetic drivers of histiocytic sarcoma.

Neuroimaging research indicates that human intellectual ability is related to brain structure, including the thickness of the cerebral cortex. Most studies indicate that general intelligence is positively associated with cortical thickness in areas of association cortex distributed throughout both brain hemispheres. In this study, we performed a cortical thickness mapping analysis on data from 182 healthy, typically developing males and females ages 9 to 24 years to identify correlates of general intelligence (g) scores.

Background: Personal genome assembly is a critical process when studying tumor genomes and other highly divergent sequences. The accuracy of downstream analyses, such as RNA-seq and ChIP-seq, can be greatly enhanced by using personal genomic sequences rather than standard references. Unfortunately, reads sequenced from these types of samples often have a heterogeneous mix of various subpopulations with different variants, making assembly extremely difficult using existing assembly tools.

Collections of tumor genomes created by insertional mutagenesis experiments, e.g., the Retroviral Tagged Cancer Gene Database (RTCGD), can be analysed to find connections between mutations of specific genes and cancer.

Neuropsychiatric disorders such as schizophrenia, bipolar disorder and Alzheimer's disease are major public health problems. However, despite decades of research, we currently have no validated prognostic or diagnostic tests that can be applied at an individual patient level. Many neuropsychiatric diseases are due to a combination of alterations that occur in a human brain rather than the result of localized lesions.
