Publications by authors named "Ulisses M Braga-Neto"

Background: Production rates of the short-chain fatty acids (SCFA) acetate, propionate, and butyrate, which are beneficial metabolites of the intestinal microbiota, are difficult to measure in humans because the intestine is inaccessible for measurement and because SCFAs undergo high first-pass metabolism in colonocytes and the liver. We developed a stable tracer pulse approach to estimate SCFA whole-body production (WBP) in the accessible pool representing the systemic circulation and interstitial fluid. Compartmental modeling of plasma enrichment data allowed us to additionally calculate SCFA kinetics and pool sizes in the inaccessible pool likely representing the intestine with microbiota.

Observational case-control studies for biomarker discovery in cancer research often collect data that are sampled separately from the case and control populations. We present an analysis of the bias in the estimation of the precision of classifiers designed on separately sampled data. The analysis consists of both theoretical and numerical results, which show that classifier precision estimates can display strong bias under separate sampling, with the bias magnitude depending on the difference between the true case prevalence in the population and the sample prevalence in the data.
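
The dependence on prevalence can be illustrated with Bayes' rule. The sketch below is not the paper's analysis: it assumes a classifier with fixed sensitivity and specificity, and the function name `precision` and all numbers are hypothetical. It only shows how a precision computed at the sample prevalence can differ from the precision at the true population prevalence.

```python
def precision(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Precision (positive predictive value) from Bayes' rule."""
    true_pos = sensitivity * prevalence                   # P(predict case, is case)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)  # P(predict case, is control)
    return true_pos / (true_pos + false_pos)


# Hypothetical values, chosen only for illustration.
SENSITIVITY = 0.9
SPECIFICITY = 0.9
SAMPLE_PREVALENCE = 0.5   # balanced case-control design (assumed)
TRUE_PREVALENCE = 0.05    # rare disease in the population (assumed)

# Precision implied by the sample proportions versus the population prevalence.
naive = precision(SENSITIVITY, SPECIFICITY, SAMPLE_PREVALENCE)
actual = precision(SENSITIVITY, SPECIFICITY, TRUE_PREVALENCE)

print(f"precision at sample prevalence: {naive:.3f}")
print(f"precision at true prevalence:   {actual:.3f}")
```

With these assumed numbers, the value obtained at the sample prevalence is far more optimistic than the value at the true prevalence, and the gap shrinks as the two prevalences approach each other.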

We propose a novel methodology for fault detection and diagnosis in partially-observed Boolean dynamical systems (POBDS). These are stochastic, highly nonlinear, and derivativeless systems, rendering difficult the application of classical fault detection and diagnosis methods. The methodology comprises two main approaches.

Scientists are attempting to use models of ever-increasing complexity, especially in medicine, where gene-based diseases such as cancer require better modeling of cell regulation. Complex models suffer from uncertainty and experiments are needed to reduce this uncertainty. Because experiments can be costly and time-consuming, it is desirable to determine experiments providing the most useful information.

Control of gene regulatory networks (GRNs) to shift gene expression from undesirable states to desirable ones has received much attention in recent years. Most of the existing methods assume that the cost of intervention at each state and time point, referred to as the immediate cost function, is fully known. In this paper, we employ the Partially-Observed Boolean Dynamical System (POBDS) signal model for a time sequence of noisy expression measurements from a Boolean GRN and develop a Bayesian Inverse Reinforcement Learning (BIRL) approach to address the realistic case in which the only available knowledge regarding the immediate cost function is provided by the sequence of measurements and interventions recorded in an experimental setting by an expert.

We propose a methodology for model-based fault detection and diagnosis for stochastic Boolean dynamical systems indirectly observed through a single time series of transcriptomic measurements using Next Generation Sequencing (NGS) data. The fault detection consists of an innovations filter followed by a fault certification step, and requires no knowledge about the possible system faults. The innovations filter uses the optimal Boolean state estimator, called the Boolean Kalman Filter (BKF).

Background: Gene regulatory networks govern the function of key cellular processes, such as control of the cell cycle, response to stress, DNA repair mechanisms, and more. Boolean networks have been used successfully in modeling gene regulatory networks. In the Boolean network model, the transcriptional state of each gene is represented by 0 (inactive) or 1 (active), and the relationship among genes is represented by logical gates updated at discrete time points.
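
The Boolean network model described above can be sketched in a few lines. The three-gene network below is hypothetical (the genes A, B, C and their logic gates are invented for illustration and do not come from the paper), and it assumes synchronous updating, in which every gene is updated from the same current state.

```python
from typing import Callable, List, Sequence, Tuple

State = Tuple[int, ...]          # one 0/1 entry per gene
Rule = Callable[[State], int]    # logic gate giving a gene's next value


def step(state: State, rules: Sequence[Rule]) -> State:
    """Synchronously update every gene from the current state."""
    return tuple(rule(state) for rule in rules)


def trajectory(state: State, rules: Sequence[Rule], n_steps: int) -> List[State]:
    """Return the initial state followed by n_steps successive states."""
    states = [state]
    for _ in range(n_steps):
        state = step(state, rules)
        states.append(state)
    return states


# Hypothetical gates for genes (A, B, C), stored at indices (0, 1, 2).
RULES: List[Rule] = [
    lambda s: 1 - s[2],     # A is active when C is inactive (NOT)
    lambda s: s[0] & s[2],  # B requires both A and C (AND)
    lambda s: s[0] | s[1],  # C is activated by A or B (OR)
]

path = trajectory((1, 0, 0), RULES, 5)
for t, s in enumerate(path):
    print(t, s)
```

Starting from (1, 0, 0), this particular network returns to its initial state after five updates, so the trajectory is a cycle (an attractor of the deterministic dynamics).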

Proteomics promises to revolutionize cancer treatment and prevention by facilitating the discovery of molecular biomarkers. Progress has been impeded, however, by the small-sample, high-dimensional nature of proteomic data. We propose the application of a Bayesian approach to address this issue in classification of proteomic profiles generated by liquid chromatography-mass spectrometry (LC-MS).

The discrete coefficient of determination (CoD) measures the nonlinear interaction between discrete predictor and target variables and has had far-reaching applications in Genomic Signal Processing. Previous work has addressed the inference of the discrete CoD using classical parametric and nonparametric approaches. In this paper, we introduce a Bayesian framework for the inference of the discrete CoD.
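
For orientation, the sketch below computes a discrete CoD from a known joint distribution. It assumes the usual definition, CoD = (e0 - e) / e0, where e0 is the error of the best constant predictor of a binary target and e is the error of the best predictor that uses the predictor variables. That definition is not spelled out in the abstract, and the function name, the data layout, and the zero-CoD convention for a constant target are assumptions made for illustration. It does not implement the Bayesian inference the paper proposes.

```python
from typing import Dict, Tuple

# Maps each predictor pattern x to (P(X=x, Y=0), P(X=x, Y=1)).
Joint = Dict[Tuple[int, ...], Tuple[float, float]]


def discrete_cod(joint: Joint) -> float:
    """CoD = (e0 - e) / e0 for a binary target and discrete predictors."""
    p_y0 = sum(p0 for p0, _ in joint.values())
    p_y1 = sum(p1 for _, p1 in joint.values())
    err_without = min(p_y0, p_y1)  # best constant guess of the target
    if err_without == 0.0:
        return 0.0  # assumed convention when the target is constant
    # Best predictor picks the likelier target value for each pattern.
    err_with = sum(min(p0, p1) for p0, p1 in joint.values())
    return (err_without - err_with) / err_without


# Target is the XOR of two uniform predictors: fully determined by them.
xor_joint: Joint = {
    (0, 0): (0.25, 0.0),
    (0, 1): (0.0, 0.25),
    (1, 0): (0.0, 0.25),
    (1, 1): (0.25, 0.0),
}

# Target independent of the predictors: they give no improvement.
independent_joint: Joint = {
    (0, 0): (0.125, 0.125),
    (0, 1): (0.125, 0.125),
    (1, 0): (0.125, 0.125),
    (1, 1): (0.125, 0.125),
}

cod_xor = discrete_cod(xor_joint)
cod_independent = discrete_cod(independent_joint)
print(cod_xor, cod_independent)
```

The XOR case is a nonlinear relationship in which neither predictor alone is informative, yet the pair predicts the target perfectly, which is the kind of interaction the CoD is meant to capture.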

Canalizing genes possess broad regulatory power over a wide swath of regulatory processes. On the other hand, it has been hypothesized that the phenomenon of intrinsically multivariate prediction (IMP) is associated with canalization. However, applications have relied on user-selectable thresholds on the IMP score to decide on the presence of IMP.

Background: Discovery and validation of protein biomarkers with high specificity is the main challenge of current proteomics studies. Different mass spectrometry models are used as shotgun tools for the discovery of biomarkers. Validation of a set of selected biomarkers from a list of candidates is an important stage in the biomarker identification pipeline.

Convex bootstrap error estimation is a popular tool for classifier error estimation in gene expression studies. A basic question is how to determine the weight for the convex combination between the basic bootstrap estimator and the resubstitution estimator such that the resulting estimator is unbiased at finite sample sizes. The well-known 0.
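
The weight question can be made concrete with a small sketch. It assumes the convex estimator has the form (1 - w) * resubstitution + w * bootstrap, and that the expected values of both estimators and of the true error are known, which in practice they are not. The function names and all numbers are hypothetical, and the sketch is not the paper's derivation.

```python
def convex_estimate(resub: float, boot: float, w: float) -> float:
    """Convex combination of resubstitution and basic bootstrap estimates."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1] for a convex combination")
    return (1.0 - w) * resub + w * boot


def unbiased_weight(exp_resub: float, exp_boot: float, exp_true: float) -> float:
    """Weight w solving (1 - w) * E[resub] + w * E[boot] = E[true error]."""
    if exp_boot == exp_resub:
        raise ValueError("expected values coincide; the weight is not identifiable")
    return (exp_true - exp_resub) / (exp_boot - exp_resub)


# Hypothetical expectations at some fixed sample size (assumed values):
# resubstitution is optimistic, the basic bootstrap is pessimistic.
EXP_RESUB = 0.10
EXP_BOOT = 0.30
EXP_TRUE = 0.22

w = unbiased_weight(EXP_RESUB, EXP_BOOT, EXP_TRUE)
combined = convex_estimate(EXP_RESUB, EXP_BOOT, w)
print(f"w = {w:.3f}, combined expectation = {combined:.3f}")
```

Because the expectations depend on the classification rule, the distribution, and the sample size, a single fixed weight cannot be unbiased in every setting, which is why the choice of weight is a question worth studying.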

Motivation: It is commonly assumed in pattern recognition that cross-validation error estimation is 'almost unbiased' as long as the number of folds is not too small. While this is true for random sampling, it is not true with separate sampling, where the populations are independently sampled, which is a common situation in bioinformatics.

Results: We demonstrate, via analytical and numerical methods, that classical cross-validation can have strong bias under separate sampling, depending on the difference between the sampling ratios and the true population probabilities.

A statistical tool for the detection of multivariate Boolean relationships is presented, with applications in the inference of gene regulatory mechanisms. A statistical test is developed for the detection of a nonzero discrete coefficient of determination (CoD) between predictor and target variables. This is done by framing the problem in the context of a stochastic logic model that naturally allows the inclusion of prior knowledge if available.

Classification has emerged as a major area of investigation in bioinformatics owing to the desire to discriminate phenotypes, in particular, disease conditions, using high-throughput genomic data. While many classification rules have been posed, there is a paucity of error estimation rules and an even greater paucity of theory concerning error estimation accuracy. This is problematic because the worth of a classifier depends mainly on its error rate.

Background: RNA-Seq is a recently developed high-throughput sequencing technology for profiling the entire transcriptome in any organism. It has several major advantages over current hybridization-based approaches such as microarrays. However, the cost per sample of RNA-Seq is still prohibitive for most laboratories.

From September 2005 to March 2007, 238 individuals being vaccinated for the first time with the yellow fever (YF)-17DD vaccine were enrolled in a cohort established in Recife, Brazil. A prospective study indicated that, after immunization, anti-YF immunoglobulin M (IgM) and anti-YF IgG were present in 70.6% (IgM) and 98.

Motivation: In small-sample settings, bolstered error estimation has been shown to perform better than cross-validation and competitively with bootstrap with regard to various criteria. The key issue for bolstering performance is the variance setting for the bolstering kernel. Heretofore, this variance has been determined in a non-parametric manner from the data.

Discrete classification is common in Genomic Signal Processing applications, in particular in classification of discretized gene expression data, and in discrete gene expression prediction and the inference of Boolean genomic regulatory networks. Once a discrete classifier is obtained from sample data, its performance must be evaluated through its classification error. In practice, error estimation methods must then be employed to obtain reliable estimates of the classification error based on the available data.

Dengue disease can clinically evolve from an asymptomatic and mild disease, known as dengue fever (DF), to a severe disease known as dengue hemorrhagic fever (DHF). Recent evidence has shown how host genetic factors can be correlated with severe dengue susceptibility or protection. Many of these genes, such as CD209, TNF-α, vitamin D receptor, and FC gamma receptor IIA, are components of the innate immune system, suggesting that innate responses might have a role in dengue pathogenesis.

The development of DNA microarray technology a decade ago led to the establishment of functional genomics as one of the most active and successful scientific disciplines today. With the ongoing development of immunomic microarray technology (a spatially addressable, large-scale technology for measurement of specific immunological response), the new challenge of functional immunomics is emerging, which bears similarities to but is also significantly different from functional genomics. Immunomic data has been successfully used to identify biological markers involved in autoimmune diseases, allergies, viral infections such as human immunodeficiency virus (HIV), influenza, diabetes, and responses to cancer vaccines.

Motivation: Microarray classification typically possesses two striking attributes: (1) classifier design and error estimation are based on remarkably small samples and (2) cross-validation error estimation is employed in the majority of the papers. Thus, it is necessary to have a quantifiable understanding of the behavior of cross-validation in the context of very small samples.

Results: An extensive simulation study has been performed comparing cross-validation, resubstitution and bootstrap estimation for three popular classification rules, namely linear discriminant analysis, 3-nearest-neighbor and decision trees (CART), using both synthetic and real breast-cancer patient data.
