Publications by authors named "Alex Feltus"

Homo sapiens and Neanderthals underwent hybridization during the Middle/Upper Paleolithic age, culminating in retention of small amounts of Neanderthal-derived DNA in the modern human genome. In the current study, we address the potential roles Neanderthal single nucleotide polymorphisms (SNPs) may be playing in autism susceptibility in samples of black non-Hispanic, white Hispanic, and white non-Hispanic people using data from the Simons Foundation Powering Autism Research (SPARK), Genotype-Tissue Expression (GTEx), and 1000 Genomes (1000G) databases. We have discovered that rare variants are significantly enriched in autistic probands compared to race-matched controls.

We report a public resource for examining the spatiotemporal RNA expression of 54,893 genes during the first 72 h of response to rhizobial inoculation. Using a methodology that allows synchronous inoculation and growth of more than 100 plants in a single media container, we harvested the same segment of each root responding to rhizobia in the initial inoculation over a time course, collected individual tissues from these segments with laser capture microdissection, and created and sequenced RNA libraries generated from these tissues. We demonstrate the utility of the resource by examining the expression patterns of a set of genes induced very early in nodule signaling, as well as two gene families (CLE peptides and nodule specific PLAT-domain proteins) and show that despite similar whole-root expression patterns, there are tissue differences in expression between the genes.

Summary: Large-scale and whole-cell modeling has multiple challenges, including scalable model building and module communication bottlenecks (e.g. between metabolism, gene expression, signaling, etc.

Mechanistic models of how single cells respond to different perturbations can help integrate disparate big data sets or predict response to varied drug combinations. However, the construction and simulation of such models have proved challenging. Here, we developed a Python-based model creation and simulation pipeline that converts a few structured text files into an SBML standard and is high-performance- and cloud-computing ready.

Background: Thyroid cancer (THCA) is the most common endocrine malignancy, and its incidence is increasing. There is an urgent need to better understand the molecular differences between THCA tumors at different pathologic stages so that appropriate diagnostic, prognostic, and treatment strategies can be applied. Transcriptome State Perturbation Generator (TSPG) is a tool created to identify the changes in gene expression necessary to transform the transcriptional state of a source sample to mimic that of a target.

Background: Quantification of gene expression from RNA-seq data is a prerequisite for transcriptome analyses such as differential gene expression analysis and gene co-expression network construction. Individual RNA-seq experiments are growing larger, and combining multiple experiments from sequence repositories can result in datasets with thousands of samples. Processing hundreds to thousands of RNA-seq samples can result in challenges related to data management, access to sufficient computational resources, navigation of high-performance computing (HPC) systems, installation of required software dependencies, and reproducibility.

The mechanisms that coordinate cellular gene expression are highly complex and intricately interconnected. Thus, it is necessary to move beyond a fully reductionist approach to understanding genetic information flow and begin focusing on the networked connections between genes that organize cellular function. Continued advancements in computational hardware, coupled with the development of gene correlation network algorithms, provide the capacity to study networked interactions between genes rather than their isolated functions.

Gene co-expression networks (GCNs) provide multiple benefits to molecular research, including hypothesis generation and biomarker discovery. Transcriptome profiles serve as input for GCN construction and are derived from increasingly larger studies with samples across multiple experimental conditions, treatments, time points, genotypes, etc. Such experiments with larger numbers of variables confound discovery of true network edges, exclude edges, and inhibit discovery of context-specific (or condition-specific) network edges.

Advanced imaging and DNA sequencing technologies now enable the diverse biology community to routinely generate and analyze terabytes of high-resolution biological data. The community is rapidly heading toward the petascale in single-investigator laboratory settings. As evidence, the NCBI SRA central DNA sequence repository alone contains over 45 petabytes of biological data.

We introduce the Transcriptome State Perturbation Generator (TSPG) as a novel deep-learning method to identify changes in genomic expression that occur between tissue states using generative adversarial networks. TSPG learns the transcriptome perturbations from RNA-sequencing data required to shift from a source to a target class. We apply TSPG as an effective method of detecting biologically relevant alternate expression patterns between normal and tumor human tissue samples.

The human brain is a complex organ that consists of several regions, each with a unique gene expression pattern. Our intent in this study was to construct a gene co-expression network (GCN) for the normal brain using RNA expression profiles from the Genotype-Tissue Expression (GTEx) project. The brain GCN contains gene correlation relationships that are broadly present in the brain or specific to thirteen brain regions, which we later combined into six overarching brain mini-GCNs based on the brain's structure.

Urgent responses to the COVID-19 pandemic depend on increased collaboration and sharing of data, models, and resources among scientists and researchers. In many scientific fields and disciplines, institutional norms treat data, models, and resources as proprietary, emphasizing competition among scientists and researchers locally and internationally. Concurrently, long-standing norms of open data and collaboration exist in some scientific fields and have accelerated within the last two decades.

Bigenic expression relationships are conventionally defined based on metrics such as Pearson or Spearman correlation that cannot typically detect latent, non-linear dependencies or require the relationship to be monotonic. Further, the combination of intrinsic and extrinsic noise as well as embedded relationships between sample sub-populations reduces the probability of extracting biologically relevant edges during the construction of gene co-expression networks (GCNs). In this report, we address these problems via our NetExtractor algorithm.
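The limitation of correlation metrics described above can be illustrated with a small sketch. This is not the NetExtractor algorithm; it only shows, on made-up data, that a perfectly deterministic but non-monotonic relationship between two hypothetical genes (`y = x^2` over a symmetric range) yields Pearson and Spearman coefficients of essentially zero. The helper names `pearson`, `ranks`, and `spearman` are illustrative and written with the standard library only.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical expression values: gene Y depends fully on gene X,
# but the dependency is non-monotonic (a parabola).
x = [i / 10 for i in range(-20, 21)]
y = [v * v for v in x]

print(f"Pearson:  {pearson(x, y):.3f}")
print(f"Spearman: {spearman(x, y):.3f}")
```

Both coefficients come out at approximately zero here even though `y` is completely determined by `x`, which is the kind of dependency a correlation-only edge test would miss.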

Online biological databases housing genomics, genetic and breeding data can be constructed using the Tripal toolkit. Tripal is an open-source, internationally developed framework that implements FAIR data principles and is meant to ease the burden of constructing such websites for research communities. Use of a common, open framework improves the sustainability and manageability of such a site.

Premise: Plants synthesize information from multiple environmental stimuli when determining their direction of growth. Gravity, being ubiquitous on Earth, plays a major role in determining the direction of growth and overall architecture of the plant. Here, we utilized the microgravity environment on board the International Space Station (ISS) to identify genes involved in influencing growth and development of phototropically stimulated seedlings of Arabidopsis thaliana.

Gene co-expression networks (GCNs) are constructed from Gene Expression Matrices (GEMs) in a bottom-up approach where all gene pairs are tested for correlation within the context of the input sample set. This approach is computationally intensive for many current GEMs and may not be scalable to millions of samples. Further, traditional GCNs do not detect non-linear relationships missed by correlation tests and do not place genetic relationships in a gene expression intensity context.
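The bottom-up construction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' software: the toy GEM, the gene names, the `build_gcn` helper, and the 0.9 cutoff are all hypothetical. It tests every gene pair with Pearson correlation and keeps the pairs whose absolute coefficient meets the threshold.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def build_gcn(gem, threshold=0.9):
    """Test all gene pairs and return (gene1, gene2, r) edges with |r| >= threshold.

    `gem` maps a gene name to its expression values across the same samples.
    The threshold is an arbitrary illustrative cutoff.
    """
    edges = []
    for g1, g2 in combinations(sorted(gem), 2):
        r = pearson(gem[g1], gem[g2])
        if abs(r) >= threshold:
            edges.append((g1, g2, r))
    return edges


# Toy GEM: 4 hypothetical genes measured in 5 samples.
gem = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.9],
    "geneC": [5.0, 4.1, 2.9, 2.0, 1.1],
    "geneD": [3.0, 1.0, 4.0, 1.0, 5.0],
}

edges = build_gcn(gem)
for g1, g2, r in edges:
    print(f"{g1} -- {g2}  r = {r:.3f}")
```

With n genes the loop evaluates n(n-1)/2 pairs, so the work grows quadratically with the number of genes and each test also scales with the number of samples, which is the scalability concern raised in the abstract.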

Community biological databases provide an important online resource for both public and private data, analysis tools and community engagement. These sites house genomic, transcriptomic, genetic, breeding and ancillary data for specific species, families or clades. Due to the complexity and increasing quantities of these data, construction of online resources is increasingly difficult especially with limited funding and access to technical expertise.

Given the complex relationship between gene expression and phenotypic outcomes, computationally efficient approaches are needed to sift through large high-dimensional datasets in order to identify biologically relevant biomarkers. In this report, we describe a method of identifying the most salient biomarker genes in a dataset, which we call "candidate genes", by evaluating the ability of gene combinations to classify samples from a dataset, which we call "classification potential". Our algorithm, Gene Oracle, uses a neural network to test user-defined gene sets for polygenic classification potential and then uses a combinatorial approach to further decompose selected gene sets into candidate and non-candidate biomarker genes.

Motivation: As the size of high-throughput DNA sequence datasets continues to grow, the cost of transferring and storing the datasets may prevent their processing in all but the largest data centers or commercial cloud providers. To lower this cost, it should be possible to process only a subset of the original data while still preserving the biological information of interest.

Results: Using 4 high-throughput DNA sequence datasets of differing sequencing depth from 2 species as use cases, we demonstrate the effect of processing partial datasets on the number of detected RNA transcripts using an RNA-Seq workflow.
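The idea of processing only part of a dataset can be sketched with a toy simulation. This is not the RNA-Seq workflow used in the study: the read-to-transcript assignments, the transcript names, and the `detected_transcripts` helper are made up for illustration. The sketch draws a random fraction of reads and reports which transcripts are still detected.

```python
import random


def detected_transcripts(read_assignments, fraction, min_reads=1, seed=0):
    """Return the transcripts seen in a random subset of the reads.

    `read_assignments` holds one transcript ID per read (a stand-in for
    the output of a real alignment or quantification step).
    A transcript counts as detected if at least `min_reads` sampled reads
    were assigned to it.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    k = int(len(read_assignments) * fraction)
    subset = rng.sample(read_assignments, k)
    counts = {}
    for transcript in subset:
        counts[transcript] = counts.get(transcript, 0) + 1
    return {t for t, c in counts.items() if c >= min_reads}


# Hypothetical dataset: 1,000 reads from three transcripts of very
# different abundance.
reads = ["tx_high"] * 900 + ["tx_mid"] * 90 + ["tx_low"] * 10

for fraction in (0.01, 0.1, 0.5, 1.0):
    found = detected_transcripts(reads, fraction)
    print(f"{fraction:>5.0%} of reads -> {len(found)} transcripts detected")
```

In a simulation like this, abundant transcripts tend to remain detectable at small fractions while rare ones drop out first, so the acceptable fraction depends on which transcripts carry the biological information of interest.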

The objective of this study was to assess how exposure to ergot alkaloids during 2 stages of gestation alters fetal growth, muscle fiber formation, and miRNA expression. Pregnant ewes (n = 36; BW = 83.26 ± 8.

Renal cell carcinoma (RCC) subtypes are characterized by distinct molecular profiles. Using RNA expression profiles from 1,009 RCC samples, we constructed a condition-annotated gene coexpression network (GCN). The RCC GCN contains binary gene coexpression relationships (edges) specific to conditions including RCC subtype and tumor stage.

A gene co-expression network (GCN) describes associations between genes and points to genetic coordination of biochemical pathways. However, genetic correlations in a GCN are only detectable if they are present in the sampled conditions. With the increasing quantity of gene expression samples available in public repositories, there is greater potential for discovery of genetic correlations from a variety of biologically interesting conditions.

Article Synopsis
  • Introducing tumor-derived cells into normal mammary stem cell environments can lead these tumor cells to change and produce normal mammary cells, a process known as cancer cell redirection.
  • An in vitro model was created to simulate how the normal mammary microenvironment redirects cancer cells, allowing researchers to analyze the differences in gene expression between normal, redirected, and tumor cells using RNA profiling.
  • By analyzing mutual information relationships, researchers identified 20 key molecular signatures from over 35,000 gene expressions, refining these down to 120 core biomarker genes that can differentiate between normal/redirected and tumor states.
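As a rough illustration of the mutual information measure mentioned in the synopsis above, the sketch below computes mutual information (in bits) between a discretized gene expression level and a sample state label. The sample labels, the low/high discretization, and the `mutual_information` helper are assumptions made for this example and do not reproduce the study's analysis.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information in bits between two equal-length discrete sequences."""
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (a, b), count in pxy.items():
        p_ab = count / n
        mi += p_ab * math.log2(p_ab / ((px[a] / n) * (py[b] / n)))
    return mi


# Hypothetical data: 4 normal and 4 tumor samples.
states = ["normal"] * 4 + ["tumor"] * 4

# A gene whose discretized level tracks the state exactly ...
informative_gene = ["low"] * 4 + ["high"] * 4
# ... and one whose level is unrelated to the state.
uninformative_gene = ["low", "high"] * 4

print(mutual_information(informative_gene, states))    # 1.0
print(mutual_information(uninformative_gene, states))  # 0.0
```

A gene that separates the two states perfectly carries one full bit of information about the label in this balanced toy set, while a gene distributed identically in both states carries none.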
View Article and Find Full Text PDF
Article Synopsis
  • Tumorigenic cells can be transformed into normal cells when mixed with healthy epithelial cells and transplanted into juvenile mice, suggesting potential for reversing cancerous behavior.
  • Researchers created an in vitro model that mimics this transformation and showed it’s linked to changes in gene expression, particularly focusing on growth factors and cytokines.
  • Analysis of 251 genes revealed a shift in expression from cancerous patterns to normal patterns, providing insights into the mechanisms behind cancer cell redirection and identifying key growth factors involved.