In multiple instance learning (MIL), a bag is a sample consisting of a set of instances, each described by a vector of explanatory variables, while the bag as a whole carries a single label or response. Although many MIL methods have been developed, few have paid attention to the interpretability of models and results. The proposed Bayesian regression model has a two-level hierarchy that shows transparently how explanatory variables explain, and how instances contribute to, bag responses. It also addresses two selection problems simultaneously: instance selection, which identifies the instances in each bag responsible for the bag response, and variable selection, which searches for the important covariates. To explore the joint discrete space of indicator variables created for selecting both explanatory variables and instances, the shotgun stochastic search algorithm is modified to fit the MIL context. The proposed model also offers a natural and rigorous way to quantify uncertainty in coefficient estimation and outcome prediction, which many modern MIL applications call for. A simulation study shows that the proposed regression model selects variables and instances with high performance (AUC greater than 0.86) and therefore predicts responses well. The method is applied to the musk data to predict binding strengths (labels) between molecules (bags), which have different conformations (instances), and target receptors. It outperforms all existing methods and can identify the variables relevant to modeling the responses.
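
The search over the joint indicator space can be illustrated with a minimal sketch. It is not the authors' algorithm or model: the score is a stand-in (minus half the BIC of a least-squares fit) rather than a posterior model probability, each bag is summarised by the mean of its selected instances (an assumption), only add/delete moves are used, and all names (`bag_features`, `log_score`, `neighbours`, `shotgun_search`, `gamma`, `delta`) are hypothetical.

```python
import numpy as np

def bag_features(Z, bag_id, n_bags, gamma, delta):
    # Assumption: a bag is summarised by the mean of its selected instances,
    # restricted to the selected variables.
    F = np.zeros((n_bags, int(gamma.sum())))
    for b in range(n_bags):
        rows = (bag_id == b) & delta
        F[b] = Z[rows][:, gamma].mean(axis=0)
    return F

def log_score(Z, bag_id, y, gamma, delta):
    # Stand-in for a log posterior model probability: -BIC/2 of a
    # least-squares fit of bag responses on the bag features.
    n = len(y)
    A = np.column_stack([np.ones(n), bag_features(Z, bag_id, n, gamma, delta)])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(np.sum((y - A @ beta) ** 2))
    return -0.5 * (n * np.log(rss / n + 1e-12) + A.shape[1] * np.log(n))

def neighbours(gamma, delta, bag_id):
    # All states reachable by flipping one variable or one instance indicator.
    out = []
    for j in range(len(gamma)):
        g = gamma.copy()
        g[j] = not g[j]
        out.append((g, delta))
    for i in range(len(delta)):
        d = delta.copy()
        d[i] = not d[i]
        if d[bag_id == bag_id[i]].any():  # every bag keeps at least one instance
            out.append((gamma, d))
    return out

def shotgun_search(Z, bag_id, y, n_iter=30, seed=0):
    # Z: stacked instances (N x p); bag_id: bag index of each row; y: one response per bag.
    rng = np.random.default_rng(seed)
    gamma = np.zeros(Z.shape[1], dtype=bool)  # start with no variables
    delta = np.ones(Z.shape[0], dtype=bool)   # start with all instances
    best = (log_score(Z, bag_id, y, gamma, delta), gamma, delta)
    for _ in range(n_iter):
        cand = neighbours(gamma, delta, bag_id)
        scores = np.array([log_score(Z, bag_id, y, g, d) for g, d in cand])
        top = int(np.argmax(scores))
        if scores[top] > best[0]:
            best = (scores[top], cand[top][0], cand[top][1])
        # Move to a neighbour with probability proportional to exp(score).
        w = np.exp(scores - scores.max())
        gamma, delta = cand[rng.choice(len(cand), p=w / w.sum())]
    return best  # (best score seen, variable indicators, instance indicators)
```

Scoring the whole neighbourhood at each step and then sampling the next state, rather than always taking the best neighbour, is what lets this kind of search keep track of many high-scoring configurations instead of stalling in the first local mode.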

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11027161
DOI: http://dx.doi.org/10.1016/j.csda.2024.107954
