We introduce a Bayesian approach for biclustering that accounts for the prior functional dependence between genes using hidden Markov models (HMMs). We utilize biological knowledge gathered from gene ontologies and the hidden Markov structure to capture the potential coexpression of neighboring genes. Our interpretable model-based clustering characterizes each cluster of samples by three groups of features: overexpressed, underexpressed, and irrelevant features.
Autism spectrum disorder (ASD) is a developmental disorder with a rising prevalence and unknown etiology presenting with deficits in cognition and abnormal behavior. We hypothesized that the investigation of the synaptic component of prefrontal cortex may provide proteomic signatures that may identify the biological underpinnings of cognitive deficits in childhood ASD. Subcellular fractions of synaptosomes from prefrontal cortices of age-, brain area-, and postmortem-interval-matched samples from children and adults with idiopathic ASD vs.
Wastewater-based surveillance has become an important tool for research groups and public health agencies investigating and monitoring the COVID-19 pandemic and other public health emergencies including other pathogens and drug abuse. While there is an emerging body of evidence exploring the possibility of predicting COVID-19 infections from wastewater signals, there remain significant challenges for statistical modeling. Longitudinal observations of viral copies in municipal wastewater can be influenced by noisy datasets and missing values with irregular and sparse samplings.
Brain imaging and genomics are critical tools enabling characterization of the genetic basis of brain disorders. However, imaging large cohorts is expensive and may be unavailable for legacy datasets used for genome-wide association studies (GWASs). Using an integrated feature selection/aggregation model, we developed an image-mediated association study (IMAS), which utilizes borrowed imaging/genomics data to conduct association mapping in legacy GWAS cohorts.
Wastewater-based surveillance (WBS) has been established as a powerful tool that can guide health policy at multiple levels of government. However, this approach has not been well assessed at more granular scales, including large work sites such as university campuses. Between August 2021 and April 2022, we explored the occurrence of SARS-CoV-2 RNA in wastewater using qPCR assays from multiple complementary sewer catchments and residential buildings spanning the University of Calgary's campus and how this compared to levels from the municipal wastewater treatment plant servicing the campus.
Risk prediction models for cancer stage at diagnosis may identify individuals at higher risk of late-stage cancer diagnoses. Partial proportional odds risk prediction models for cancer stage at diagnosis for males and females were developed using data from Alberta's Tomorrow Project (ATP). Prediction models were validated on the British Columbia Generations Project (BCGP) cohort using discrimination and calibration measures.
Wastewater-based surveillance (WBS) of infectious diseases is a powerful tool for understanding community COVID-19 disease burden and informing public health policy. The potential of WBS for understanding COVID-19's impact in non-healthcare settings has not been explored to the same degree. Here we examined how SARS-CoV-2 measured from municipal wastewater treatment plants (WWTPs) correlates with workforce absenteeism.
Coronary artery disease is one of the most common types of cardiovascular disease. Death from coronary heart disease is influenced by genetic factors in both women and men. In this article, we propose a novel Bayesian variable selection framework for the identification of important genetic variants associated with coronary artery disease status.
Background: There is still more to learn about the pathobiology of COVID-19. A multi-omic approach offers a holistic view to better understand the mechanisms of COVID-19. We used state-of-the-art statistical learning methods to integrate genomics, metabolomics, proteomics, and lipidomics data obtained from 123 patients experiencing COVID-19 or COVID-19-like symptoms for the purpose of identifying molecular signatures and corresponding pathways associated with the disease.
In this article, we propose a two-level copula joint model to analyze clinical data with multiple disparate continuous longitudinal outcomes and multiple event-times in the presence of competing risks. At the first level, we use a copula to model the dependence between competing latent event-times, in the process constructing the submodel for the observed event-time, and employ the Gaussian copula to construct the submodel for the longitudinal outcomes that accounts for their conditional dependence; these submodels are glued together at the second level via the Gaussian copula to construct a joint model that incorporates conditional dependence between the observed event-time and the longitudinal outcomes. To have the flexibility to accommodate skewed data and examine possibly different covariate effects on quantiles of a non-Gaussian outcome, we propose linear quantile mixed models for the continuous longitudinal data.
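As a rough illustration of how a Gaussian copula can tie an event-time to a continuous outcome, the sketch below draws dependent uniform pairs and maps them onto two different marginals. It is not the two-level model from the abstract: the function names, the exponential event-time marginal, the correlation of 0.8, and the rate of 0.5 are all assumptions chosen for illustration.

```python
import math
import random


def normal_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def sample_gaussian_copula(rho, n, seed=0):
    """Draw n pairs (u1, u2) of uniforms whose dependence follows a
    bivariate Gaussian copula with correlation parameter rho."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        # Build a second standard normal correlated with z1 at level rho.
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        pairs.append((normal_cdf(z1), normal_cdf(z2)))
    return pairs


def exponential_quantile(u, rate):
    """Inverse CDF of an exponential distribution (hypothetical event-time marginal)."""
    u = min(u, 1.0 - 1e-12)  # guard against log(0)
    return -math.log(1.0 - u) / rate


# Hypothetical settings: rho = 0.8 and rate = 0.5 are illustrative only.
pairs = sample_gaussian_copula(rho=0.8, n=2000, seed=1)
event_times = [exponential_quantile(u1, rate=0.5) for u1, _ in pairs]
```

The copula fixes only the dependence structure. Each uniform can then be pushed through any inverse CDF, which is what allows a skewed event-time marginal to sit next to a differently distributed longitudinal outcome.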
Objectives: This study aims to develop and validate a Bayesian risk prediction model that combines research cohort data with elicited expert knowledge to predict dementia progression in people with mild cognitive impairment (MCI).
Study Design and Setting: This is a prognostic risk prediction modeling study based on cohort data (Alzheimer's Disease Neuroimaging Initiative [ADNI]; n = 365) of research participants with MCI and elicited expert data. Bayesian Cox models were used to combine expert knowledge and ADNI data to predict dementia progression in people with MCI.
Wastewater-based SARS-CoV-2 surveillance enables unbiased and comprehensive monitoring of defined sewersheds. We performed real-time monitoring of hospital wastewater that differentiated Delta and Omicron variants within total SARS-CoV-2 RNA, enabling correlation to COVID-19 cases from three tertiary-care facilities with >2100 inpatient beds in Calgary, Canada. RNA was extracted from hospital wastewater between August 2021 and January 2022, and SARS-CoV-2 was quantified using RT-qPCR.
Background: Cox proportional hazards regression models and machine learning models are widely used for predicting the risk of dementia. Existing comparisons of these models have mostly been based on empirical datasets and have yielded mixed results. This study uses Monte Carlo simulation to examine the accuracy of various machine learning models and Cox regression models for predicting time-to-event outcomes in people with mild cognitive impairment (MCI).
The ribonucleic acid (RNA) of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is detectable in municipal wastewater because infected individuals can shed the virus in their feces. Viral concentration in wastewater can inform the severity of the COVID-19 pandemic, but observations can be noisy and sparse, which hampers epidemiological interpretation. In contrast to previous studies, and motivated by a Canadian nationwide wastewater surveillance data set, we propose a novel Bayesian statistical framework based on the theories of functional data analysis to tackle the challenges embedded in the longitudinal wastewater monitoring data.
Wastewater-based epidemiology (WBE) is an emerging surveillance tool that has been used to monitor the ongoing COVID-19 pandemic by tracking SARS-CoV-2 RNA shed into wastewater. WBE was performed to monitor the occurrence and spread of SARS-CoV-2 from three wastewater treatment plants (WWTPs) and six neighborhoods in the city of Calgary, Canada (population 1.44 million).
Introduction: This study aimed to develop and validate a 3-year dementia risk score in individuals with mild cognitive impairment (MCI) based on variables collected in routine clinical care.
Methods: The prediction score was trained and developed using data from the National Alzheimer's Coordinating Center (NACC). Selection criteria included being aged 55 years or older with MCI.
COVID-19 is a disease characterized by its seemingly unpredictable clinical outcomes. To better understand the molecular signature of the disease, a recent multi-omics study examined correlations between biomolecules and used a tree-based machine learning approach to predict clinical outcomes. This study specifically looked at patients admitted to the hospital experiencing COVID-19 or COVID-19-like symptoms.
Introduction: To date, there is no broadly accepted dementia risk score for use in individuals with mild cognitive impairment (MCI), partly because there are few large datasets available for model development. When evidence is limited, the knowledge and experience of experts become more crucial for risk stratification and for providing MCI patients with a prognosis. Structured expert elicitation (SEE) includes formal methods to quantify experts' beliefs and help experts to express their beliefs in a quantitative form, reducing biases in the process.
DNA methylations in critical regions are highly involved in cancer pathogenesis and drug response. However, identifying causal methylations among a large number of potential polymorphic DNA methylation sites is challenging. This high-dimensional data brings two obstacles: first, many established statistical models are not scalable to so many features; second, multiple testing and overfitting become serious problems.
The problem of associating data from multiple sources and predicting an outcome simultaneously is an important one in modern biomedical research. It has the potential to identify a multidimensional array of variables predictive of a clinical outcome and to enhance our understanding of the pathobiology of complex diseases. Incorporating functional knowledge in association and prediction models can reveal pathways contributing to disease risk.
In cancer radiomics, textural features evaluated from image intensity-derived gray-level co-occurrence matrices (GLCMs) have been studied to evaluate gray-level spatial dependence within the regions of interest in the brain. Most of these analyses work with summary statistics (or texture-based features) constructed using the GLCM entries, and potentially overlook other structural properties in the GLCM. In our proposed Bayesian framework, we treat each GLCM as a realization of a two-dimensional stochastic functional process observed with error at discrete time points.
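For readers unfamiliar with GLCMs, the following minimal sketch shows how one is counted from a small image and how a typical summary statistic (contrast) is derived from it. It illustrates only the standard GLCM construction, not the Bayesian functional framework described above; the function names, the 4x4 toy image, and the horizontal offset are assumptions made for illustration.

```python
def glcm(image, levels, dx=1, dy=0, symmetric=True):
    """Count how often gray level i occurs next to gray level j
    at pixel offset (dx, dy). `image` is a list of rows of ints in [0, levels)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                if symmetric:
                    # Also count the pair in the reverse direction.
                    counts[j][i] += 1
    return counts


def normalize(counts):
    """Convert co-occurrence counts into joint probabilities."""
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]


def contrast(p):
    """Contrast texture feature: sum over (i, j) of p[i][j] * (i - j)^2."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


# Hypothetical 4x4 image with four gray levels (0 to 3).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
counts = glcm(image, levels=4)
texture_contrast = contrast(normalize(counts))
```

A summary such as `texture_contrast` collapses the whole matrix into one number, which is the kind of reduction the abstract says may overlook other structure in the GLCM.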
Y-box binding protein 1 (YB-1) is a regulatory protein associated with oncogenesis and poor prognosis in patients with cancer. In the cell, YB-1 functions as a DNA and RNA binding protein that promotes or suppresses expression of target genes. The cancer-promoting activity of YB-1 is mediated through its activation of oncogenes and repression of tumor suppressor genes.
Human cancer cell line experiments are valuable for investigating drug sensitivity biomarkers. The number of biomarkers measured in these experiments is typically on the order of several thousand, whereas the number of samples is often limited to one or at most three replicates for each experimental condition. We have developed an innovative Bayesian approach that efficiently identifies clusters of proteins that exhibit similar patterns of expression.
Integration of genomic data from multiple platforms has the capability to increase precision, accuracy, and statistical power in the identification of prognostic biomarkers. A fundamental problem faced in many multi-platform studies is unbalanced sample sizes due to the inability to obtain measurements from all the platforms for all the patients in the study. We have developed a novel Bayesian approach that integrates multi-regression models to identify a small set of biomarkers that can accurately predict time-to-event outcomes.