With the emergence of single-cell RNA sequencing (scRNA-seq) technology, scientists are able to examine gene expression at single-cell resolution. Analysis of scRNA-seq data poses its own challenges, many of which stem from the high dimensionality of the data. Machine learning methods offer a way to perform gene (feature) selection on high-dimensional scRNA-seq data.
The number of children ever born to women of reproductive age forms a core component of fertility and is vital to the population dynamics of any country. Using Bangladesh Multiple Indicator Cluster Survey 2019 data, we fitted a novel weighted Bayesian Poisson regression model to identify multi-level individual, household, regional and societal factors associated with the number of children ever born among married women of reproductive age in Bangladesh. We explored the robustness of our results using multiple prior distributions, and presented the Metropolis algorithm for generating posterior realizations.
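As a rough illustration of the approach named in this abstract, the following is a minimal sketch of a random-walk Metropolis sampler for a weighted Poisson regression with a log link. It is not the authors' implementation: the independent normal priors, the weight normalisation, the step size, the function names, and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def log_posterior(beta, X, y, w, prior_sd=10.0):
    """Weighted Poisson log-likelihood (log link) plus N(0, prior_sd^2) priors.

    The prior choice is an assumption for this sketch, not taken from the paper.
    """
    eta = X @ beta
    # Constant term -log(y!) is dropped; it does not depend on beta.
    log_lik = np.sum(w * (y * eta - np.exp(eta)))
    log_prior = -0.5 * np.sum((beta / prior_sd) ** 2)
    return log_lik + log_prior

def metropolis(X, y, w, n_iter=5000, step=0.05, seed=0):
    """Random-walk Metropolis with a symmetric Gaussian proposal."""
    rng = np.random.default_rng(seed)
    beta = np.zeros(X.shape[1])
    current = log_posterior(beta, X, y, w)
    draws = np.empty((n_iter, X.shape[1]))
    accepted = 0
    for i in range(n_iter):
        proposal = beta + step * rng.normal(size=beta.size)
        candidate = log_posterior(proposal, X, y, w)
        # Accept with probability min(1, posterior ratio).
        if np.log(rng.uniform()) < candidate - current:
            beta, current = proposal, candidate
            accepted += 1
        draws[i] = beta
    return draws, accepted / n_iter

# Synthetic data standing in for survey responses (hypothetical values).
rng = np.random.default_rng(1)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])
true_beta = np.array([0.5, 0.3])
y = rng.poisson(np.exp(X @ true_beta))
w = rng.uniform(0.5, 1.5, size=n)
w = w * n / w.sum()  # normalise survey-style weights to sum to n

draws, acc_rate = metropolis(X, y, w)
posterior_mean = draws[1000:].mean(axis=0)  # discard the first 1000 draws as burn-in
```

Rerunning the sampler with a different `prior_sd` gives a simple check of how sensitive the posterior summaries are to the prior, in the spirit of the robustness analysis the abstract mentions.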
Statistical thresholds occur when the relationship between a response and a predictor variable changes abruptly, rather than linearly, at certain values of the predictor. In this paper, we defined a piecewise-linear regression model that can detect two thresholds in the relationship via changes in slope. We developed the corresponding Bayesian methodology for model estimation and inference by proposing prior distributions, deriving posterior distributions, and generating posterior values using Metropolis and Gibbs sampling algorithms.
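One common way to write a two-threshold piecewise-linear model of the kind described above is shown below. This is a sketch under assumptions: the symbols, the hinge parameterization, and the normal error term are illustrative and may differ from the formulation used in the paper.

```latex
y_i = \beta_0 + \beta_1 x_i + \beta_2 (x_i - \tau_1)_+ + \beta_3 (x_i - \tau_2)_+ + \varepsilon_i,
\qquad \varepsilon_i \sim N(0, \sigma^2), \qquad \tau_1 < \tau_2,
```

where $(u)_+ = \max(u, 0)$. Under this parameterization the slope is $\beta_1$ for $x_i \le \tau_1$, $\beta_1 + \beta_2$ for $\tau_1 < x_i \le \tau_2$, and $\beta_1 + \beta_2 + \beta_3$ for $x_i > \tau_2$, so a nonzero $\beta_2$ or $\beta_3$ corresponds to a change in slope at the respective threshold.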
Background and Objective: In binary classification problems with a rare class of interest, there is relatively little information available for the rare class to build a model. On the other hand, the number of potentially useful variables for developing a classification model can be very large. For example, in drug discovery, there are usually very few bioactive compounds in a large chemical library, whereas thousands of potentially useful explanatory variables characterize a compound's chemical structure.
Childhood stunting is a serious public health concern in Bangladesh. Earlier research used conventional statistical methods to identify the risk factors of stunting, and very little is known about the applications and usefulness of machine learning (ML) methods that can identify the risk factors of various health conditions based on complex data. This research evaluates the performance of ML methods in predicting stunting among children under 5 years of age using 2014 Bangladesh Demographic and Health Survey data.
Due to COVID-19, universities across Canada were forced to transition from classroom-based face-to-face learning and invigilated assessments to online learning and non-invigilated assessments. This study attempts to empirically measure the impact of COVID-19 on students' marks from eleven science, technology, engineering, and mathematics (STEM) courses using a Bayesian linear mixed effects model fitted to longitudinal data. The model is designed for this application and allows the error variance to differ from student to student.
The relationships between an environmental variable and an ecological response are usually estimated by models fitted through the conditional mean of the response given environmental stress. For example, nonparametric loess and parametric piecewise linear regression models (PLRM) are often used to represent simple to complex nonlinear relationships. In contrast, piecewise linear quantile regression models (PQRM) fitted across various quantiles of the response can reveal nonlinearities in its range of variation across the explanatory variable.
A quantitative structure-activity relationship (QSAR) is a model relating a specific biological response to the chemical structures of compounds. There are many descriptor sets available to characterize chemical structure, raising the question of how to choose among them or how to use all of them for training a QSAR model. Making efficient use of all sets of descriptors is particularly problematic when active compounds are rare among the assay response data.