Genetic perturbation screens using RNA interference (RNAi) have been conducted successfully to identify host factors that are essential for the life cycle of bacteria or viruses. So far, most published studies identified host factors primarily for single pathogens. Furthermore, often only a small subset of genes, e.
BMC Bioinformatics
November 2019
Background: Analysing large and high-dimensional biological data sets poses significant computational difficulties for bioinformaticians due to a lack of accessible tools that scale to hundreds of millions of data points.
Results: We developed a novel machine learning command line tool called PyBDA for automated, distributed analysis of big biological data sets. By using Apache Spark in the backend, PyBDA scales to data sets beyond the size of current applications.
Summary: Modelling biological associations or dependencies using linear regression is often complicated when the analyzed data sets are high-dimensional and fewer observations than variables are available (n ≪ p). For genomic data sets, penalized regression methods have been applied to address this issue. Recently proposed regression models utilize prior knowledge on dependencies, e.
Existing machine-readable resources for large-scale gene regulatory networks usually do not provide context information characterizing the activating conditions for a regulation and how targeted genes are affected. Although this information is essential for data interpretation, available networks are often restricted to condition-independent, non-quantitative, plain binary interactions as derived from high-throughput screens. In this article, we present a comprehensive Petri net-based regulatory network that controls the diauxic shift in Saccharomyces cerevisiae.