Publications by authors named "Ryan Urbanowicz"

We leveraged electronic health record (EHR) data from the Accelerating Data Value Across a National Community Health Center Network (ADVANCE) Clinical Research Network (CRN) to identify social risk factor clusters, assess their association with obstructive sleep apnea (OSA), and determine relevant clinical predictors of cardiovascular (CV) outcomes among those experiencing OSA. Geographically informed social indicators were used to define social risk factor clusters via latent class analysis. EHR-wide diagnoses were used as predictors of 5-year incidence of major adverse CV events (MACE) using STREAMLINE, an end-to-end rigorous and interpretable automated machine learning pipeline.


Background: Epistasis, the interaction between genetic loci where the effect of one locus is influenced by one or more other loci, plays a crucial role in the genetic architecture of complex traits. However, as the number of loci considered increases, the investigation of epistasis becomes exponentially more complex, making the selection of key features vital for effective downstream analyses. Relief-Based Algorithms (RBAs) are often employed for this purpose due to their reputation as "interaction-sensitive" algorithms and uniquely non-exhaustive approach.
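As a rough illustration of why Relief-style scoring is described as interaction-sensitive and non-exhaustive, the sketch below scores features on a toy XOR problem by comparing each instance with its nearest same-class and different-class neighbors, without enumerating feature pairs. It is a simplified Relief variant, not the implementation used in the study: the function name `relief_weights`, the toy data, and the choice to average over tied nearest neighbors are all assumptions made for this example.

```python
from itertools import product


def relief_weights(X, y):
    """Score each feature with a basic Relief-style update over all instances.

    Assumption for this sketch: features are discrete, distance is Hamming
    distance, and ties among nearest neighbors are averaged.
    """
    n, p = len(X), len(X[0])
    weights = [0.0] * p

    def hamming(a, b):
        return sum(u != v for u, v in zip(a, b))

    for i in range(n):
        hits = [j for j in range(n) if j != i and y[j] == y[i]]
        misses = [j for j in range(n) if y[j] != y[i]]
        # Differences from nearest hits lower a feature's score;
        # differences from nearest misses raise it.
        for group, sign in ((hits, -1.0), (misses, 1.0)):
            dmin = min(hamming(X[i], X[j]) for j in group)
            nearest = [j for j in group if hamming(X[i], X[j]) == dmin]
            for f in range(p):
                mean_diff = sum(X[i][f] != X[j][f] for j in nearest) / len(nearest)
                weights[f] += sign * mean_diff / n
    return weights


# Toy data: the class is the XOR of features 0 and 1; feature 2 is noise.
X = [list(row) for row in product([0, 1], repeat=3)]
y = [a ^ b for a, b, _ in X]

weights = relief_weights(X, y)
print(weights)  # prints [0.5, 0.5, -1.0]
```

Neither XOR feature is associated with the class on its own, yet both receive positive scores while the noise feature scores negative, which is the behavior the abstract refers to.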


Background: The investigation of epistasis becomes increasingly complex as more loci are considered due to the exponential expansion of possible interactions. Consequently, selecting key features that influence epistatic interactions is crucial for effective downstream analyses. Recognizing this challenge, this study investigates the efficiency of Relief-Based Algorithms (RBAs) in detecting higher-order epistatic interactions, which may be critical for understanding the genetic architecture of complex traits.

Article Synopsis
  • The mpox outbreak in the U.S. led to over 32,000 cases and 58 deaths from May 2022 to March 2024, raising concerns about stigma and access to healthcare for sexual minority men and gender-diverse (SMMGD) individuals.
  • To address the lack of SMMGD perspectives in existing literature, this study aimed to gather their views on public health communication regarding mpox, focusing on inclusivity and equity.
  • An analysis of 8,688 mpox-related tweets from SMMGD users identified 11 key discussion topics, with significant focus on health activism and vaccination discussions, as well as the impact of COVID-19 and public health responses.
Article Synopsis
  • The authors talk about how important it is to include everyone, especially LGBTQ+ people, in science and technology education and AI research.
  • They point out the problems that queer scientists face and how better educational resources can help them.
  • The authors want to create a supportive environment where everyone can work together respectfully, no matter their background.

Objectives: To synthesize discussions among sexual minority men and gender diverse (SMMGD) individuals on mpox, given limited representation of SMMGD voices in existing mpox literature.

Methods: BERTopic (a topic modeling technique) was employed with human validations to analyze mpox-related tweets (n = 8,688; October 2020-September 2022) from 2,326 self-identified SMMGD individuals in the U.S.


According to the World Stroke Organization, 12.2 million people worldwide will have their first stroke this year, and almost half of them will die as a result. Natural Language Processing (NLP) may improve stroke phenotyping; however, existing rule-based classifiers are rigid, resulting in inadequate performance.


Interrogating plasma cell-free DNA (cfDNA) to detect cancer offers promise; however, no current tests scan structural variants (SVs) throughout the genome. Here, we report a simple molecular workflow to enrich a tumorigenic SV (DNA palindromes/fold-back inversions) that often demarcates genomic amplification and its feasibility for cancer detection by combining low-throughput next-generation sequencing with automated machine learning (Genome-wide Analysis of Palindrome Formation, GAPF-seq). Tumor DNA signal manifested as skewed chromosomal distributions of high-coverage 1-kb bins (HCBs), differentiating 39 matched breast tumor DNA from normal DNA with an average AUC of 0.


Plasma cell-free DNA (cfDNA) is a promising source of gene mutations for cancer detection by liquid biopsy. However, no current tests interrogate chromosomal structural variants (SVs) genome-wide. Here, we report a simple molecular and sequencing workflow called Genome-wide Analysis of Palindrome Formation (GAPF-seq) to probe DNA palindromes, a type of SV that often demarcates gene amplification.


Supply-demand mismatch of ward resources ("ward capacity strain") alters care and outcomes. Narrow strain definitions and heterogeneous populations limit the strain literature. The objective was to evaluate the predictive utility of a large set of candidate strain variables for in-hospital mortality and discharge destination among acute respiratory failure (ARF) survivors.

Article Synopsis
  • The introduction of large language models (LLMs) represents a significant change in how we generate text, allowing for human-like chat interactions.
  • LLM-based chatbots can enhance academic efficiency, but ethical issues like fair use and biases need to be addressed.
  • The editorial emphasizes the importance of effective usage, distinguishes between LLM use and plagiarism, calls for addressing bias and accuracy concerns, and highlights a promising future for LLM applications in academia.

STREAMLINE is a simple, transparent, end-to-end automated machine learning (AutoML) pipeline for easily conducting rigorous machine learning (ML) modeling and analysis. The initial version is limited to binary classification. In this work, we extend STREAMLINE through implementing multiple regression-based ML models, including linear regression, elastic net, group lasso, and L21 norm.


Amyloid imaging has been widely used in Alzheimer's disease (AD) diagnosis and biomarker discovery by detecting regional amyloid plaque density. The regional measurements must be normalized by a reference region to reduce noise and artifacts. To explore an optimal normalization strategy, we employ an automated machine learning (AutoML) pipeline, STREAMLINE, to perform binary classification of AD diagnosis and permutation-based feature importance analysis with thirteen machine learning models.
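For readers unfamiliar with permutation-based feature importance, a minimal sketch of the general idea follows: shuffle one feature at a time and record how much a fitted model's accuracy drops. This is not STREAMLINE's code. The helper names, the toy data, and the hard-coded stand-in "model" are hypothetical and only illustrate the technique.

```python
import random


def accuracy(model, X, y):
    """Fraction of rows for which the model's prediction equals the label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for f in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[f] for row in X]
            rng.shuffle(column)  # break the link between feature f and the label
            X_perm = [row[:f] + [v] + row[f + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


# Toy data (assumption): feature 0 determines the label, feature 1 is noise.
X = [[i % 2, (i // 2) % 2] for i in range(20)]
y = [row[0] for row in X]


def model(row):
    """Stand-in for a trained classifier; it only looks at feature 0."""
    return row[0]


importances = permutation_importance(model, X, y)
print(importances)
```

A feature the model relies on shows a positive drop, while a feature the model ignores shows a drop of exactly zero.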

Article Synopsis
  • The study investigates whether short-term risk scores for pulmonary arterial hypertension (PAH) can predict long-term outcomes like clinical worsening and mortality in patients.
  • It uses data from three randomized clinical trials, analyzing risk assessments from various PAH risk score tools and their correlation with adverse clinical events.
  • Results indicated significant insights into the potential of these risk scores as surrogates for predicting long-term survival and health deterioration in PAH patients.

Background: It is currently unknown if disease severity modifies response to therapy in pulmonary arterial hypertension (PAH). We aimed to explore if disease severity, as defined by established risk-prediction algorithms, modified response to therapy in randomised clinical trials in PAH.

Methods: We performed a meta-analysis using individual participant data from 18 randomised clinical trials of therapy for PAH submitted to the United States Food and Drug Administration to determine if predicted risk of 1-year mortality at randomisation modified the treatment effect on three outcomes: change in 6-min walk distance (6MWD), clinical worsening at 12 weeks and time to clinical worsening.


Our objective was to detect common barriers to post-acute care (B2PAC) among hospitalized older adults using natural language processing (NLP) of clinical notes from patients discharged home when a clinical decision support system recommended post-acute care. We annotated B2PAC sentences from discharge planning notes and developed an NLP classifier to identify the highest-value B2PAC class (negative patient preferences). Thirteen machine learning models were compared with Amazon's AutoGluon deep learning model.

Article Synopsis
  • This study investigates the impact of HLA amino acid-level mismatches (AA-MM) on kidney transplant success, focusing on how they differ from traditional HLA antigen-level mismatches (Ag-MM).
  • Researchers developed a tool called Feature Inclusion Bin Evolver for Risk Stratification (FIBERS) to evaluate and classify donor-recipient pairs into risk categories based on AA-MMs.
  • Results showed that using FIBERS provided better predictive power for graft failure risk, identifying more patients as low-risk compared to traditional methods, particularly highlighting the importance of mismatches in the DRB1 locus.

Purpose: Predicting 30-day readmission risk is paramount to improving the quality of patient care. In this study, we compare sets of patient-, provider-, and community-level variables that are available at two different points of a patient's inpatient encounter (first 48 hours and the full encounter) to train readmission prediction models and identify possible targets for appropriate interventions that can potentially reduce avoidable readmissions.

Methods: Using electronic health record data from a retrospective cohort of 2,460 oncology patients and a comprehensive machine learning analysis pipeline, we trained and tested models predicting 30-day readmission on the basis of data available within the first 48 hours of admission and from the entire hospital encounter.


Sex-based differences in pulmonary arterial hypertension (PAH) are known, but the contribution to disease measures is understudied. We examined whether sex was associated with baseline 6-minute-walk distance (6MWD), hemodynamics, and functional class. We conducted a secondary analysis of participant-level data from randomized clinical trials of investigational PAH therapies conducted between 1998 and 2014 and provided by the U.


Genetic heterogeneity describes the occurrence of the same or similar phenotypes through different genetic mechanisms in different individuals. Robustly characterizing and accounting for genetic heterogeneity is crucial to pursuing the goals of precision medicine, for discovering novel disease biomarkers, and for identifying targets for treatments. Failure to account for genetic heterogeneity may lead to missed associations and incorrect inferences.


Background: Obesity is increasingly prevalent in pulmonary arterial hypertension (PAH) but is associated with improved survival, creating an "obesity paradox" in PAH. It is unknown if the improved outcomes could be attributable to obese patients deriving a greater benefit from PAH therapies.

Research Question: Does BMI modify treatment effectiveness in PAH?

Study Design And Methods: Using individual participant data, a meta-analysis was conducted of phase III, randomized, placebo-controlled trials of treatments for PAH submitted for approval to the U.


Background: Gene set enrichment analysis (GSEA) uses gene-level univariate associations to identify gene set-phenotype associations for hypothesis generation and interpretation. We propose that GSEA can be adapted to incorporate SNP and gene-level interactions. To this end, gene scores are derived by Relief-based feature importance algorithms that efficiently detect both univariate and interaction effects (MultiSURF) or exclusively interaction effects (MultiSURF*).


The population of patients with pulmonary arterial hypertension (PAH) has evolved over time from predominantly young White women to an older, more racially diverse and obese population. Whether these changes are reflected in clinical trials is not known. The objective was to determine secular and regional trends among PAH trial participants.


Objective: Data harmonization is essential to integrate individual participant data from multiple sites, time periods, and trials for meta-analysis. The process of mapping terms and phrases to an ontology is complicated by typographic errors, abbreviations, truncation, and plurality. We sought to harmonize medical history (MH) and adverse events (AE) term records across 21 randomized clinical trials in pulmonary arterial hypertension and chronic thromboembolic pulmonary hypertension.
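To make the mapping problem concrete, the sketch below shows one simple way free-text terms with typos or plural forms could be matched to a controlled vocabulary using Python's standard `difflib` module. It is not the method used in the study. The vocabulary `ONTOLOGY_TERMS`, the helper names, the naive plural stripping, and the 0.8 similarity cutoff are all assumptions chosen for illustration.

```python
import difflib

# Hypothetical target vocabulary; a real ontology would be far larger.
ONTOLOGY_TERMS = ["headache", "nausea", "peripheral edema", "dizziness"]


def normalize(term):
    """Lowercase, collapse whitespace, and strip a simple trailing plural 's'."""
    t = " ".join(term.lower().split())
    if t.endswith("s") and not t.endswith("ss"):
        t = t[:-1]
    return t


def harmonize(raw, vocabulary=ONTOLOGY_TERMS, cutoff=0.8):
    """Return the closest vocabulary term, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(normalize(raw), vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(harmonize("Headaches"))        # plural form
print(harmonize("nausia"))           # typographic error
print(harmonize("fractured wrist"))  # no sufficiently similar term
```

Terms that fall below the cutoff return `None` so they can be routed to manual review rather than silently mapped to the wrong concept.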


Growing demand for biomedical informaticists and expertise in areas related to this discipline has accentuated the need to integrate biomedical informatics training into high school curricula. The K-12 Bioinformatics professional development project educates high school teachers about data analysis, biomedical informatics and mobile learning, and partners with them to expose high school students to health and environment-related issues using biomedical informatics knowledge and current technologies. We designed low-cost pollution sensors and created interactive web applications that teachers from six Philadelphia public high schools used during the 2019-2020 school year to successfully implement a problem-based mobile learning unit that included collecting and interpreting air pollution data, as well as relating this data to asthma.
