Publications by Benjamin T Shealy

Identification of condition-specific regulatory mechanisms in normal and cancerous human lung tissue.

Yuqing Hang, Josh Burns, Benjamin T Shealy, Rini Pauly, Stephen P Ficklin

BMC Genomics

May 2022

Background: Lung cancer is the leading cause of cancer death in both men and women. The most common lung cancer subtype is non-small cell lung carcinoma (NSCLC), which accounts for about 85% of all cases. NSCLC can be further divided into three subtypes: adenocarcinoma (LUAD), squamous cell carcinoma (LUSC), and large cell lung carcinoma.

GEMmaker: process massive RNA-seq datasets on heterogeneous computational infrastructure.

John A Hadish, Tyler D Biggs, Benjamin T Shealy, M Reed Bender, Coleman B McKnight

BMC Bioinformatics

May 2022

Background: Quantification of gene expression from RNA-seq data is a prerequisite for transcriptome analyses such as differential gene expression analysis and gene co-expression network construction. Individual RNA-seq experiments are growing larger, and combining multiple experiments from sequence repositories can result in datasets with thousands of samples. Processing hundreds to thousands of RNA-seq samples can result in challenges related to data management, access to sufficient computational resources, navigation of high-performance computing (HPC) systems, installation of required software dependencies, and reproducibility.

Addressing noise in co-expression network construction.

Joshua J R Burns, Benjamin T Shealy, Mitchell S Greer, John A Hadish, Matthew T McGowan

Brief Bioinform

January 2022

Gene co-expression networks (GCNs) provide multiple benefits to molecular research, including hypothesis generation and biomarker discovery. Transcriptome profiles serve as input for GCN construction and are derived from increasingly large studies with samples across multiple experimental conditions, treatments, time points, genotypes, etc. Such experiments with larger numbers of variables confound discovery of true network edges, exclude edges, and inhibit discovery of context-specific (or condition-specific) network edges.

Cellular State Transformations Using Deep Learning for Precision Medicine Applications.

Colin Targonski, M Reed Bender, Benjamin T Shealy, Benafsh Husain, Bill Paseman

Patterns (N Y)

September 2020

We introduce the Transcriptome State Perturbation Generator (TSPG) as a novel deep-learning method to identify changes in genomic expression that occur between tissue states using generative adversarial networks. TSPG learns the transcriptome perturbations from RNA-sequencing data required to shift from a source to a target class. We apply TSPG as an effective method of detecting biologically relevant alternate expression patterns between normal and tumor human tissue samples.

NetExtractor: Extracting a Cerebellar Tissue Gene Regulatory Network Using Differentially Expressed High Mutual Information Binary RNA Profiles.

Benafsh Husain, Allison R Hickman, Yuqing Hang, Benjamin T Shealy, Karan Sapra

G3 (Bethesda)

September 2020

Bigenic expression relationships are conventionally defined based on metrics such as Pearson or Spearman correlation that cannot typically detect latent, non-linear dependencies or require the relationship to be monotonic. Further, the combination of intrinsic and extrinsic noise as well as embedded relationships between sample sub-populations reduces the probability of extracting biologically relevant edges during the construction of gene co-expression networks (GCNs). In this report, we address these problems via our NetExtractor algorithm.
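
The limitation of correlation metrics described above can be illustrated with a small, self-contained sketch. It is not the NetExtractor algorithm: the toy relationship y = x^2, the equal-width binning, the choice of three bins, and all function names are assumptions made purely for illustration. On this symmetric, non-monotonic relationship both Pearson and Spearman correlation come out at essentially zero, while a simple histogram-based mutual information estimate is clearly positive.

```python
import math
from collections import Counter


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sx = math.sqrt(sum((a - mx) ** 2 for a in xs))
    sy = math.sqrt(sum((b - my) ** 2 for b in ys))
    return cov / (sx * sy)


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the tie-averaged ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


def equal_width_bins(values, n_bins):
    """Map each value to a bin index in 0..n_bins-1 (assumed binning scheme)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def mutual_information(xs, ys, n_bins=3):
    """Histogram-based mutual information estimate, in bits."""
    bx, by = equal_width_bins(xs, n_bins), equal_width_bins(ys, n_bins)
    n = len(xs)
    joint, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    return sum(
        (c / n) * math.log2(c * n / (px[a] * py[b]))
        for (a, b), c in joint.items()
    )


# Toy, made-up profiles: y depends on x exactly, but not monotonically.
x = list(range(-10, 11))
y = [v * v for v in x]

print(f"Pearson:            {pearson(x, y):.3f}")
print(f"Spearman:           {spearman(x, y):.3f}")
print(f"Mutual information: {mutual_information(x, y):.3f} bits")
```

Binned estimates of this kind are sensitive to the number of bins and to sample size, so the value printed here should be read only as evidence of dependence, not as a calibrated score.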

Uncovering biomarker genes with enriched classification potential from Hallmark gene sets.

Colin A Targonski, Courtney A Shearer, Benjamin T Shealy, Melissa C Smith, F Alex Feltus

Sci Rep

July 2019

Given the complex relationship between gene expression and phenotypic outcomes, computationally efficient approaches are needed to sift through large, high-dimensional datasets in order to identify biologically relevant biomarkers. In this report, we describe a method of identifying the most salient biomarker genes in a dataset, which we call "candidate genes", by evaluating the ability of gene combinations to classify samples from a dataset, which we call "classification potential". Our algorithm, Gene Oracle, uses a neural network to test user-defined gene sets for polygenic classification potential and then uses a combinatorial approach to further decompose selected gene sets into candidate and non-candidate biomarker genes.
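
The idea of scoring gene combinations by how well they classify samples can be sketched roughly as follows. This is not the Gene Oracle implementation: a nearest-centroid classifier with leave-one-out evaluation stands in for the neural network, the expression values are made up, and the names `SAMPLES`, `loo_accuracy`, and `scores` are hypothetical. The sketch only shows the general shape of enumerating gene subsets and comparing their classification accuracy.

```python
from itertools import combinations

# Made-up toy data: (expression values for genes g0, g1, g2), class label.
# By construction g0 separates the classes; g1 and g2 are uninformative.
SAMPLES = [
    ((1.0, 3.0, 2.0), 0),
    ((1.2, 1.0, 4.0), 0),
    ((0.8, 4.0, 1.0), 0),
    ((1.1, 2.0, 3.0), 0),
    ((5.0, 2.0, 2.5), 1),
    ((4.8, 4.0, 1.5), 1),
    ((5.2, 1.0, 3.5), 1),
    ((4.9, 3.0, 2.0), 1),
]


def centroid(rows):
    """Column-wise mean of a list of equal-length rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def loo_accuracy(samples, genes):
    """Leave-one-out accuracy of a nearest-centroid classifier that sees
    only the gene indices in `genes` (a stand-in for a neural network)."""
    correct = 0
    for i, (held_out, label) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]
        point = [held_out[g] for g in genes]
        best_label, best_dist = None, float("inf")
        for cls in sorted({lab for _, lab in train}):
            rows = [[vals[g] for g in genes] for vals, lab in train if lab == cls]
            dist = sum((p - q) ** 2 for p, q in zip(point, centroid(rows)))
            if dist < best_dist:
                best_label, best_dist = cls, dist
        correct += best_label == label
    return correct / len(samples)


# Score every non-empty gene subset; higher accuracy suggests the subset
# carries more class information on this toy data.
scores = {
    genes: loo_accuracy(SAMPLES, genes)
    for k in range(1, 4)
    for genes in combinations(range(3), k)
}

for genes, acc in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(genes, f"{acc:.2f}")
```

Exhaustive enumeration grows exponentially with the number of genes, so a sketch like this is only practical for small sets; how the published method limits that search is not stated in the abstract above.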