Differential analysis of bulk RNA-seq data often suffers from a lack of good controls. Here, we present a generative model, trained solely on healthy tissues, that replaces controls. The unsupervised model learns a low-dimensional representation and can identify the closest normal representation for a given disease sample.
One way to better understand the structure in DNA is by learning to predict the sequence. Here, we trained a model to predict the missing base at any given position, given its left and right flanking contexts. Our best-performing model was a neural network that obtained an accuracy close to 54% on the human genome, which is 2 percentage points better than modelling the data using a Markov model.
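The prediction task described above, guessing a base from its left and right flanks, can be illustrated with a minimal count-based sketch. This is not the neural network from the abstract; it is a simple lookup baseline, closer in spirit to the Markov-style comparison, and the function names, the flank length `k`, and the toy sequence are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def train_context_model(sequence, k=2):
    """Count which base occurs between each (left, right) flank pair of length k."""
    counts = defaultdict(Counter)
    for i in range(k, len(sequence) - k):
        left = sequence[i - k:i]
        right = sequence[i + 1:i + 1 + k]
        counts[(left, right)][sequence[i]] += 1
    return counts


def predict_base(counts, left, right, fallback="A"):
    """Return the most frequent central base seen for these flanks.

    Falls back to a fixed base when the context was never observed.
    """
    context = counts.get((left, right))
    if not context:
        return fallback
    return context.most_common(1)[0][0]


# Toy sequence (hypothetical data, not a real genome).
toy = "ACGTACGTACGTACGT"
model = train_context_model(toy, k=2)
predicted = predict_base(model, "AC", "TA")
```

Accuracy on held-out positions would be measured by comparing `predict_base` against the true central base; a learned model would replace the lookup table while keeping the same input and output.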
The flourishing of the Internet of Things (IoT) and data-driven techniques provides new ideas for enhancing agricultural production, where evapotranspiration estimation is a crucial issue in crop irrigation systems. However, the large volume of unsynchronized data from agricultural cyber-physical systems incurs high computational costs and complicates the use of conventional machine learning methods. To estimate evapotranspiration precisely at acceptable computational cost in an IoT setting, we combine time granulation computing techniques and a gradient boosting decision tree (GBDT) with Bayesian optimization (BO) to propose a hybrid machine learning approach.
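The abstract does not say how time granulation is carried out. As a rough sketch of one common interpretation, unsynchronized sensor readings can be averaged into fixed-width time windows before being passed to a learner such as a GBDT. The function name, the window width, and the sample readings below are hypothetical, and the GBDT and Bayesian optimization steps are not shown.

```python
from collections import defaultdict


def granulate(readings, window_seconds):
    """Average (timestamp, value) readings within fixed-width time windows.

    Returns a dict mapping each window's start time to the mean value
    of the readings that fall inside it.
    """
    buckets = defaultdict(list)
    for timestamp, value in readings:
        window_start = (timestamp // window_seconds) * window_seconds
        buckets[window_start].append(value)
    return {start: sum(vals) / len(vals) for start, vals in sorted(buckets.items())}


# Irregularly timed readings (hypothetical values, timestamps in seconds).
readings = [(3, 20.0), (41, 22.0), (65, 30.0), (118, 34.0), (125, 40.0)]
granules = granulate(readings, window_seconds=60)
```

Coarser windows reduce the number of rows the downstream model has to process, at the cost of temporal resolution.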
Background: Genomic DNA has been shaped by mutational processes through evolution. The cellular machinery for error correction and repair has left its marks in the nucleotide composition along with structural and functional constraints. Therefore, the probability of observing a base in a certain position in the human genome is highly context-dependent.
BMC Microbiol
September 2018
Background: Asthma, one of the most common chronic respiratory disorders, is associated with the hyper-activation of the T-cell subset of adaptive immunity. The gut microbiota may be involved in the development of asthma through the production of short-chain fatty acids (SCFAs), exhibiting modulatory effects on Th. We therefore performed a metagenome-wide association study (MWAS) of the fecal microbiota from individuals with asthma and healthy controls.
The root nodule symbiosis of plants with nitrogen-fixing bacteria affects global nitrogen cycles and food production but is restricted to a subset of genera within a single clade of flowering plants. To explore the genetic basis for this scattered occurrence, we sequenced the genomes of 10 plant species covering the diversity of nodule morphotypes, bacterial symbionts, and infection strategies. In a genome-wide comparative analysis of a total of 37 plant species, we discovered signatures of multiple independent loss-of-function events in the indispensable symbiotic regulator in 10 of 13 genomes of nonnodulating species within this clade.