Smoking, alcohol consumption, obesity, and physical inactivity are key lifestyle risk factors for cancer. Previously, these have mostly been examined singly or combined into an index, which assumes independent and equivalent effects on cancer risk. The aim of our study was to systematically examine the joint pairwise and interactive effects of these lifestyle factors on the risk of a first solid primary cancer in a multi-cohort prospective setting.
Background: Several lifestyle factors are associated with an increased risk of colorectal cancer (CRC). Although lifestyle factors co-occur, most previous studies have focused on a single risk factor or assumed independent effects between risk factors.
Aim: To examine the pairwise effects and interactions of smoking, alcohol consumption, physical inactivity, and body mass index (BMI) on the risk of subsequent CRC.
Many applications, especially in physics and other sciences, call for easily interpretable and robust machine learning techniques. We propose a fully gradient-based technique for training radial basis function networks with an efficient and scalable open-source implementation. We derive novel closed-form optimization criteria for pruning the models for continuous as well as binary data which arise in a challenging real-world material physics problem.
The maximum parsimony (MP) method for inferring phylogenies is widely used, but little is known about its limitations in non-asymptotic situations. This study employs large-scale computations with simulated phylogenetic data to estimate the probability that MP succeeds in finding the true phylogeny for up to twelve taxa and 256 characters. The set of candidate phylogenies is taken to be unrooted binary trees; for each simulated data set, the tree lengths of all (2n - 5)!! candidates are computed to evaluate quantities related to the performance of MP, such as the probability of finding the true phylogeny, the probability that the tree with the shortest length is unique, the probability that the true phylogeny has the shortest tree length, and the expected inverse of the number of trees sharing the shortest length.
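To give a sense of the size of the exhaustive search described in this abstract, the following minimal sketch evaluates the count (2n - 5)!! of unrooted binary trees on n labelled taxa. The function name is hypothetical and not taken from the study; the code only illustrates the formula.

```python
def num_unrooted_binary_trees(n: int) -> int:
    """Return (2n - 5)!!, the number of unrooted binary trees on n labelled taxa.

    Hypothetical helper for illustration; defined for n >= 3.
    """
    if n < 3:
        raise ValueError("need at least three taxa")
    count = 1
    # Multiply the odd factors 3, 5, ..., 2n - 5 (empty product for n = 3).
    for k in range(3, 2 * n - 4, 2):
        count *= k
    return count


if __name__ == "__main__":
    for n in (4, 8, 12):
        print(n, num_unrooted_binary_trees(n))
```

For twelve taxa, the largest case mentioned in the abstract, the count is 654,729,075 candidate trees per simulated data set.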
BMC Bioinformatics
November 2015
Background: Statistical modeling of transcription factor binding sites is one of the classical fields in bioinformatics. The position weight matrix (PWM) model, which assumes statistical independence among all nucleotides in a binding site, was the standard model for this task for more than three decades, but its simple assumptions are increasingly called into question. Recent high-throughput sequencing methods have provided data sets of sufficient size and quality for studying the benefits of more complex models.
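The independence assumption of the PWM model means that the probability of a binding site factorizes over positions, so its log-likelihood is a sum of per-position terms. The sketch below illustrates this with a made-up two-position matrix; the function name and the probabilities are assumptions for illustration only, not values from the study.

```python
import math


def pwm_log_likelihood(site: str, pwm: list[dict[str, float]]) -> float:
    """Log-probability of `site` under a PWM (hypothetical helper).

    Each position contributes independently, so the terms simply add.
    """
    if len(site) != len(pwm):
        raise ValueError("site length must match PWM length")
    return sum(math.log(pwm[i][base]) for i, base in enumerate(site))


# Toy matrix (invented numbers): one nucleotide distribution per position.
toy_pwm = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.2, "C": 0.6, "G": 0.1, "T": 0.1},
]

if __name__ == "__main__":
    print(pwm_log_likelihood("AC", toy_pwm))
```

Because the score is a plain sum over positions, a PWM cannot express a dependency such as "C at position 2 is more likely when position 1 is A", which is the kind of structure more complex models aim to capture.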