Strong and shifting selective pressures of the Anthropocene are rapidly shaping phenomes and genomes of organisms worldwide. Crops expressing pesticidal proteins from Bacillus thuringiensis (Bt) represent one major selective force on insect genomes. Here we characterize a rapid response to selection by Bt crops in a major crop pest, Helicoverpa zea.
Background: Technological advances in sequencing and computation have allowed deep exploration of the molecular basis of diseases. Biological networks have proven to be a useful framework for interrogating omics data and modeling regulatory gene and protein interactions. Large collaborative projects, such as The Cancer Genome Atlas (TCGA), have provided a rich resource for building and validating new computational methods resulting in a plethora of open-source software for downloading, pre-processing, and analyzing those data.
There is increasing recognition that the sex chromosomes, X and Y, play an important role in health and disease that goes beyond the determination of biological sex. Loss of the Y chromosome (LOY) in blood, which occurs naturally in aging men, has been found to be a driver of cardiac fibrosis and heart failure mortality. LOY also occurs in most solid tumors in males and is often associated with worse survival, suggesting that LOY may give tumor cells a growth or survival advantage.
Advances in omics technologies have allowed spatially resolved molecular profiling of single cells, providing a window not only into the diversity and distribution of cell types within a tissue, but also into the effects of interactions between cells in shaping the transcriptional landscape. Cells send chemical and mechanical signals which are received by other cells, where they can subsequently initiate context-specific gene regulatory responses. These interactions and their responses shape the individual molecular phenotype of a cell in a given microenvironment.
Motivation: Systems biology analyses often use correlations in gene expression profiles to infer co-expression networks that are then used as input for gene regulatory network inference or to identify functional modules of co-expressed or putatively co-regulated genes. While systematic biases, including batch effects, are known to induce spurious associations and confound differential gene expression analyses (DE), the impact of batch effects on gene co-expression has not been fully explored. Methods have been developed to adjust expression values, ensuring conditional independence of mean and variance from batch or other covariates for each gene, resulting in improved fidelity of DE analysis.
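As a rough illustration of how a batch effect can induce a spurious association, the sketch below builds two toy genes that are uncorrelated within each batch but share a batch-level mean shift. The expression values, batch labels, and helper names are all hypothetical, and per-batch mean centering is used only as the simplest possible adjustment, not as the method of the abstract.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def center_by_batch(values, batches):
    """Subtract each batch's mean from the values measured in that batch."""
    means = {}
    for b in set(batches):
        in_batch = [v for v, lab in zip(values, batches) if lab == b]
        means[b] = sum(in_batch) / len(in_batch)
    return [v - means[lab] for v, lab in zip(values, batches)]

# Hypothetical toy data: batch "B" shifts both genes upward by 10 units.
BATCH = ["A"] * 4 + ["B"] * 4
GENE_1 = [1, 2, 3, 4, 11, 12, 13, 14]
GENE_2 = [2, 4, 1, 3, 12, 14, 11, 13]

# Pooled correlation is dominated by the shared batch shift.
pooled_r = pearson(GENE_1, GENE_2)

# After per-batch centering, only within-batch co-variation remains.
adjusted_r = pearson(center_by_batch(GENE_1, BATCH),
                     center_by_batch(GENE_2, BATCH))

print(f"pooled r = {pooled_r:.3f}, batch-centered r = {adjusted_r:.3f}")
```

Here the pooled correlation is high even though the genes are uncorrelated within either batch. Batch effects on real data can also alter variances and covariances, which mean centering alone would not address.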
Gene regulatory networks (GRNs) are effective tools for inferring complex interactions between molecules that regulate biological processes and hence can provide insights into drivers of biological systems. Inferring coexpression networks is a critical element of GRN inference, as the correlation between expression patterns may indicate that genes are coregulated by common factors. However, methods that estimate coexpression networks generally derive an aggregate network representing the mean regulatory properties of the population and so fail to fully capture population heterogeneity.
Background: Lung adenocarcinoma (LUAD) has been observed to have significant sex differences in incidence, prognosis, and response to therapy. However, the molecular mechanisms responsible for these disparities have not been investigated extensively.
Methods: Sample-specific gene regulatory network methods were used to analyze RNA sequencing data from non-cancerous human lung samples from The Genotype Tissue Expression Project (GTEx) and lung adenocarcinoma primary tumor samples from The Cancer Genome Atlas (TCGA); results were validated on independent data.
Compared to men, women often develop COPD at an earlier age with worse respiratory symptoms despite lower smoking exposure. However, most preventive and therapeutic strategies ignore biological sex differences in COPD. Our goal was to better understand sex-specific gene regulatory processes in lung tissue and the molecular basis for sex differences in COPD onset and severity.
Aging is the primary risk factor for many individual cancer types, including lung adenocarcinoma (LUAD). To understand how aging-related alterations in the regulation of key cellular processes might affect LUAD risk and survival outcomes, we built individual (person)-specific gene regulatory networks integrating gene expression, transcription factor protein-protein interaction, and sequence motif data, using PANDA/LIONESS algorithms, for both non-cancerous lung tissue samples from the Genotype Tissue Expression (GTEx) project and LUAD samples from The Cancer Genome Atlas (TCGA). In GTEx, we found that pathways involved in cell proliferation and immune response are increasingly targeted by regulatory transcription factors with age; these aging-associated alterations are accelerated by tobacco smoking and resemble oncogenic shifts in the regulatory landscape observed in LUAD, suggesting that dysregulation of aging pathways might be associated with an increased risk of LUAD.
Computational methods in biology can infer large molecular interaction networks from multiple data sources and at different resolutions, creating unprecedented opportunities to explore the mechanisms driving complex biological phenomena. Networks can be built to represent distinct conditions and compared to uncover graph-level differences, such as when comparing patterns of gene-gene interactions that change between biological states. Given the importance of the graph comparison problem, there is a clear and growing need for robust and scalable methods that can identify meaningful differences.
Background: Gene regulatory network (GRN) models that are formulated as ordinary differential equations (ODEs) can accurately explain temporal gene expression patterns and promise to yield new insights into important cellular processes, disease progression, and intervention design. Learning such gene regulatory ODEs is challenging, since we want to predict the evolution of gene expression in a way that accurately encodes the underlying GRN governing the dynamics and the nonlinear functional relationships between genes. Most widely used ODE estimation methods either impose too many parametric restrictions or are not guided by meaningful biological insights, both of which impede either scalability, explainability, or both.
Complex traits are determined by many loci, mostly regulatory elements, that, through combinatorial interactions, can affect multiple traits. Such high levels of epistasis and pleiotropy have been proposed in the omnigenic model and may explain why such a large part of complex trait heritability is usually missed by genome-wide association studies while raising questions about the possibility for such traits to evolve in response to environmental constraints. To explore the molecular bases of complex traits and understand how they can adapt, we systematically analyzed the distribution of SNP heritability for ten traits across 29 tissue-specific Expression Quantitative Trait Locus (eQTL) networks.
Gene regulatory networks (GRNs) are effective tools for inferring complex interactions between molecules that regulate biological processes and hence can provide insights into drivers of biological systems. Inferring co-expression networks is a critical element of GRN inference as the correlation between expression patterns may indicate that genes are coregulated by common factors. However, methods that estimate co-expression networks generally derive an aggregate network representing the mean regulatory properties of the population and so fail to fully capture population heterogeneity.
Lung adenocarcinoma (LUAD) has been observed to have significant sex differences in incidence, prognosis, and response to therapy. However, the molecular mechanisms responsible for these disparities have not been investigated extensively. Sample-specific gene regulatory network methods were used to analyze RNA sequencing data from non-cancerous human lung samples from The Genotype Tissue Expression Project (GTEx) and lung adenocarcinoma primary tumor samples from The Cancer Genome Atlas (TCGA); results were validated on independent data.
Purpose: To improve on current standards for breast cancer prognosis and prediction of chemotherapy benefit by developing a risk model that incorporates the gene expression-based "intrinsic" subtypes luminal A, luminal B, HER2-enriched, and basal-like.
Methods: A 50-gene subtype predictor was developed using microarray and quantitative reverse transcriptase polymerase chain reaction data from 189 prototype samples. Test sets from 761 patients (no systemic therapy) were evaluated for prognosis, and 133 patients were evaluated for prediction of pathologic complete response (pCR) to a taxane and anthracycline regimen.
Models that are formulated as ordinary differential equations (ODEs) can accurately explain temporal gene expression patterns and promise to yield new insights into important cellular processes, disease progression, and intervention design. Learning such ODEs is challenging, since we want to predict the evolution of gene expression in a way that accurately encodes the causal gene-regulatory network (GRN) governing the dynamics and the nonlinear functional relationships between genes. Most widely used ODE estimation methods either impose too many parametric restrictions or are not guided by meaningful biological insights, both of which impede scalability and/or explainability.
Modeling dynamics of gene regulatory networks using ordinary differential equations (ODEs) allows a deeper understanding of disease progression and response to therapy, thus aiding in intervention optimization. Although there exist methods to infer regulatory ODEs, these are generally limited to small networks, rely on dimensional reduction, or impose non-biological parametric restrictions, all impeding scalability and explainability. PHOENIX is a neural ODE framework incorporating prior domain knowledge as soft constraints to infer sparse, biologically interpretable dynamics.
Inference and analysis of gene regulatory networks (GRNs) require software that integrates multi-omic data from various sources. The Network Zoo (netZoo; netzoo.github.
The increasing quantity of multi-omic data, such as methylomic and transcriptomic profiles collected on the same specimen or even on the same cell, provides a unique opportunity to explore the complex interactions that define cell phenotype and govern cellular responses to perturbations. We propose a network approach based on Gaussian Graphical Models (GGMs) that facilitates the joint analysis of paired omics data. This method, called DRAGON (Determining Regulatory Associations using Graphical models on multi-Omic Networks), calibrates its parameters to achieve an optimal trade-off between the network's complexity and estimation accuracy, while explicitly accounting for the characteristics of each of the assessed omics 'layers.'
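For readers unfamiliar with Gaussian Graphical Models, the sketch below shows the generic idea behind them: edges correspond to partial correlations, which can be read off the inverse covariance (precision) matrix. This is not DRAGON itself, which additionally calibrates regularization per omics layer. The 3x3 covariance matrix and the function names are hypothetical, and the matrix inversion is a plain Gauss-Jordan routine written for self-containment.

```python
import math

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def partial_correlations(cov):
    """Partial correlation of i and j given all others: -P_ij / sqrt(P_ii * P_jj)."""
    prec = invert(cov)
    n = len(prec)
    return [[1.0 if i == j
             else -prec[i][j] / math.sqrt(prec[i][i] * prec[j][j])
             for j in range(n)] for i in range(n)]

# Hypothetical chain X - Y - Z: X and Z are correlated (0.25) only through Y.
COV = [
    [1.00, 0.50, 0.25],
    [0.50, 1.00, 0.50],
    [0.25, 0.50, 1.00],
]
PCOR = partial_correlations(COV)

for row in PCOR:
    print([round(v, 3) for v in row])
```

In this toy chain the marginal correlation between X and Z is 0.25, but their partial correlation given Y vanishes, so a GGM would place no direct edge between them.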
Cancer heterogeneities hold the key to a deeper understanding of cancer etiology and progression and the discovery of more precise cancer therapy. Modern pathological and molecular technologies offer a powerful set of tools to profile tumor heterogeneities at multiple levels in large patient populations, from DNA to RNA, protein and epigenetics, and from tumor tissues to tumor microenvironment and liquid biopsy. When coupled with well-validated epidemiologic methodology and well-characterized epidemiologic resources, the rich pathological and molecular tumor information provides new research opportunities at an unprecedented breadth and depth.
Background: Interstitial lung abnormalities (ILA) are radiologic findings that may progress to idiopathic pulmonary fibrosis (IPF). Blood gene expression profiles can predict IPF mortality, but whether these same genes associate with ILA and ILA outcomes is unknown. This study evaluated if a previously described blood gene expression profile associated with IPF mortality is associated with ILA and all-cause mortality.
Expression quantitative trait locus (eQTL) analysis associates SNPs with gene expression; these relationships can be represented as a bipartite network with association strength as "edge weights" between SNPs and genes. However, most eQTL networks use binary edge weights based on thresholded FDR estimates: definitions that influence reproducibility and downstream analyses. We constructed twenty-nine tissue-specific eQTL networks using GTEx data and evaluated a comprehensive set of network specifications based on false discovery rates, test statistics, and p values, focusing on the degree centrality, a metric of an SNP or gene node's potential network influence.
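To make the distinction between binary and weighted edge definitions concrete, the sketch below computes degree centrality on a toy bipartite SNP-gene network under two specifications. The association table, the 0.05 FDR threshold, and the function names are illustrative assumptions, not values or code from the study.

```python
from collections import defaultdict

# Hypothetical toy associations: (snp, gene, test_statistic, fdr).
ASSOCIATIONS = [
    ("rs1", "GENE_A", 5.2, 0.001),
    ("rs1", "GENE_B", 2.1, 0.20),
    ("rs2", "GENE_A", 3.4, 0.03),
    ("rs3", "GENE_B", 4.0, 0.01),
]

def binary_weight(stat, fdr, threshold=0.05):
    """Edge is present (weight 1) only if the FDR passes the threshold."""
    return 1.0 if fdr <= threshold else 0.0

def statistic_weight(stat, fdr):
    """Edge weight is the absolute test statistic, with no thresholding."""
    return abs(stat)

def degrees(associations, weight):
    """Degree of each SNP and gene node: the sum of its incident edge weights."""
    snp_degree = defaultdict(float)
    gene_degree = defaultdict(float)
    for snp, gene, stat, fdr in associations:
        w = weight(stat, fdr)
        snp_degree[snp] += w
        gene_degree[gene] += w
    return dict(snp_degree), dict(gene_degree)

binary_snp, binary_gene = degrees(ASSOCIATIONS, binary_weight)
weighted_snp, weighted_gene = degrees(ASSOCIATIONS, statistic_weight)

print("binary SNP degrees:  ", binary_snp)
print("weighted SNP degrees:", weighted_snp)
```

Under the binary specification all three SNPs have the same degree, whereas the weighted specification ranks rs1 highest because it retains the sub-threshold association and the strength of each edge.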
Understanding how each person's unique genotype influences their individual patterns of gene regulation has the potential to improve our understanding of human health and development, and to refine genotype-specific disease risk assessments and treatments. However, the effects of genetic variants are not typically considered when constructing gene regulatory networks, despite the fact that many disease-associated genetic variants are thought to have regulatory effects, including the disruption of transcription factor (TF) binding. We developed EGRET (Estimating the Genetic Regulatory Effect on TFs), which infers a genotype-specific gene regulatory network for each individual in a study population.
Gene regulatory network inference allows for the modeling of genome-scale regulatory processes that are altered during development, in disease, and in response to perturbations. Our group has developed a collection of tools to model various regulatory processes, including transcriptional (PANDA, SPIDER) and post-transcriptional (PUMA) gene regulation, as well as gene regulation in individual samples (LIONESS). These methods work by postulating a network structure and then optimizing that structure to be consistent with multiple lines of biological evidence through repeated operations on data matrices.