Publications by Geo Pertea

Molecular impact of nicotine and smoking exposure on the developing and adult mouse brain.

Daianna Gonzalez-Padilla, Nicholas J Eagles, Marisol Cano, Geo Pertea, Andrew E Jaffe

bioRxiv

November 2024

Article Synopsis

- Maternal smoking during pregnancy has notable negative effects on the cognitive and behavioral development of offspring; this study compares the impacts of nicotine and cigarette smoke on gene expression in developing brains.
- Researchers found a significant number of differentially expressed genes (1,010 for nicotine and 4,165 for smoking) linked to prenatal exposure, indicating different neurodevelopmental pathways affected by each substance.
- The findings suggest that while both prenatal nicotine exposure and maternal smoking have specific and overlapping effects on the developing brain, these effects are not replicated in the adult brain, highlighting developmental-stage sensitivity to smoke-related changes.

Systems biology dissection of PTSD and MDD across brain regions, cell types, and blood.

Nikolaos P Daskalakis, Artemis Iatrou, Chris Chatzinakos, Aarti Jajoo, Clara Snijders, Geo Pertea

Science

May 2024

The molecular pathology of stress-related disorders remains elusive. Our brain multiregion, multiomic study of posttraumatic stress disorder (PTSD) and major depressive disorder (MDD) included the central nucleus of the amygdala, hippocampal dentate gyrus, and medial prefrontal cortex (mPFC). Genes and exons within the mPFC carried most disease signals replicated across two independent cohorts.

Analysis of gene expression in the postmortem brain of neurotypical Black Americans reveals contributions of genetic ancestry.

Kynon J M Benjamin, Qiang Chen, Nicholas J Eagles, Louise A Huuki-Myers, Leonardo Collado-Torres, Geo Pertea

Nat Neurosci

June 2024

Article Synopsis

- Ancestral differences in genomic variation influence gene expression, with most studies focusing on European samples or adjusting for ancestry, rather than specifically examining it.
- This study explored how genetic ancestry impacts gene expression and DNA methylation in brain tissue from admixed Black American individuals, revealing ancestry-related genes primarily involved in immune response and vascular tissue rather than neurons.
- The identified ancestry-associated differentially expressed genes (DEGs) contribute to heritability for various conditions like ischemic stroke, Parkinson's, and Alzheimer's, highlighting significant differences in gene expression based on genetic ancestry and its implications for brain-related illnesses.

Sex affects transcriptional associations with schizophrenia across the dorsolateral prefrontal cortex, hippocampus, and caudate nucleus.

Kynon J M Benjamin, Ria Arora, Arthur S Feltrin, Geo Pertea, Hunter H Giles

Nat Commun

May 2024

Schizophrenia is a complex neuropsychiatric disorder with sexually dimorphic features, including differential symptomatology, drug responsiveness, and male incidence rate. Prior large-scale transcriptome analyses for sex differences in schizophrenia have focused on the prefrontal cortex. Analyzing BrainSeq Consortium data (caudate nucleus: n = 399, dorsolateral prefrontal cortex: n = 377, and hippocampus: n = 394), we identified 831 unique genes that exhibit sex differences across brain regions, enriched for immune-related pathways.

Transcriptomic analysis of the human habenula in schizophrenia.

Ege A Yalcinbas, Bukola Ajanaku, Erik D Nelson, Renee Garcia-Flores, Nicholas J Eagles, Geo Pertea

bioRxiv

July 2024

Article Synopsis

- The study investigates the habenula (Hb) in relation to schizophrenia (SCZD) by examining its cell types and how their transcriptomic profiles differ in individuals with SCZD compared to healthy controls.
- Researchers used advanced techniques like single nucleus RNA-sequencing and fluorescent hybridization to identify 17 distinct cell types in the human Hb and validated these findings.
- They discovered 45 genes that are differentially expressed in the Hb of SCZD individuals, revealing significant genetic changes and providing new insights into the molecular basis of neuropsychiatric disorders.

Genetic and environmental contributions to ancestry differences in gene expression in the human brain.

Kynon J M Benjamin, Qiang Chen, Nicholas J Eagles, Louise A Huuki-Myers, Leonardo Collado-Torres, Geo Pertea

bioRxiv

October 2023

Article Synopsis

- Ancestry-associated differentially expressed genes (DEGs) are linked to immune response and vascular tissue, contributing to heritability for conditions like ischemic stroke, Parkinson's disease, and Alzheimer's disease, while showing less influence on psychiatric traits.
- The study reveals that both genetic variation and environmental factors (like DNA methylation) shape gene expression differences across ancestry, impacting the risk of brain illnesses in diverse populations.

Improved transcriptome assembly using a hybrid of long and short reads with StringTie.

Alaina Shumate, Brandon Wong, Geo Pertea, Mihaela Pertea

PLoS Comput Biol

June 2022

Short-read RNA sequencing and long-read RNA sequencing each have their strengths and weaknesses for transcriptome assembly. While short reads are highly accurate, they are rarely able to span multiple exons. Long-read technology can capture full-length transcripts, but its relatively high error rate often leads to mis-identified splice sites.

TieBrush: an efficient method for aggregating and summarizing mapped reads across large datasets.

Ales Varabyou, Geo Pertea, Christopher Pockrandt, Mihaela Pertea

Bioinformatics

October 2021

Summary: Although the ability to programmatically summarize and visually inspect sequencing data is an integral part of genome analysis, currently available methods are not capable of handling large numbers of samples. In particular, making a visual comparison of transcriptional landscapes between two sets of thousands of RNA-seq samples is limited by available computational resources, which can be overwhelmed due to the sheer size of the data. In this work, we present TieBrush, a software package designed to process very large sequencing datasets (RNA, whole-genome, exome, etc.

GFF Utilities: GffRead and GffCompare.

Geo Pertea, Mihaela Pertea

F1000Res

September 2020

Genome assembly and characterization of a complex zfBED-NLR gene-containing disease resistance locus in Carolina Gold Select rice with Nanopore sequencing.

Andrew C Read, Matthew J Moscou, Aleksey V Zimin, Geo Pertea, Rachel S Meyer

PLoS Genet

January 2020

Long-read sequencing facilitates assembly of complex genomic regions. In plants, loci containing nucleotide-binding, leucine-rich repeat (NLR) disease resistance genes are an important example of such regions. NLR genes constitute one of the largest gene families in plants and are often clustered, evolving via duplication, contraction, and transposition.

Transcriptome assembly from long-read RNA-seq alignments with StringTie2.

Sam Kovaka, Aleksey V Zimin, Geo M Pertea, Roham Razaghi, Steven L Salzberg, Mihaela Pertea

Genome Biol

December 2019

RNA sequencing using the latest single-molecule sequencing instruments produces reads that are thousands of nucleotides long. The ability to assemble these long reads can greatly improve the sensitivity of long-read analyses. Here we present StringTie2, a reference-guided transcriptome assembler that works with both short and long reads.

CHESS: a new human gene catalog curated from thousands of large-scale RNA sequencing experiments reveals extensive transcriptional noise.

Mihaela Pertea, Alaina Shumate, Geo Pertea, Ales Varabyou, Florian P Breitwieser

Genome Biol

November 2018

We assembled the sequences from deep RNA sequencing experiments by the Genotype-Tissue Expression (GTEx) project, to create a new catalog of human genes and transcripts, called CHESS. The new database contains 42,611 genes, of which 20,352 are potentially protein-coding and 22,259 are noncoding, and a total of 323,258 transcripts. These include 224 novel protein-coding genes and 116,156 novel transcripts.

Applying Rapid Whole-Genome Sequencing To Predict Phenotypic Antimicrobial Susceptibility Testing Results among Carbapenem-Resistant Klebsiella pneumoniae Clinical Isolates.

Pranita D Tamma, Yunfan Fan, Yehudit Bergman, Geo Pertea, Abida Q Kazmi

Antimicrob Agents Chemother

January 2019

Standard antimicrobial susceptibility testing (AST) approaches lead to delays in the selection of optimal antimicrobial therapy. Here, we sought to determine the accuracy of antimicrobial resistance (AMR) determinants identified by Nanopore whole-genome sequencing in predicting AST results. Using a cohort of 40 clinical isolates (21 carbapenemase-producing carbapenem-resistant, 10 non-carbapenemase-producing carbapenem-resistant, and 9 carbapenem-susceptible isolates), three separate sequencing and analysis pipelines were performed, as follows: (i) a real-time Nanopore analysis approach identifying acquired AMR genes, (ii) an assembly-based Nanopore approach identifying acquired AMR genes and chromosomal mutations, and (iii) an approach using short-read correction of Nanopore assemblies.

Genome sequence of the progenitor of the wheat D genome Aegilops tauschii.

Ming-Cheng Luo, Yong Q Gu, Daniela Puiu, Hao Wang, Sven O Twardziok, Geo Pertea

Nature

November 2017

Article Synopsis

- Aegilops tauschii is the diploid ancestor of the D genome in hexaploid wheat and serves as a crucial genetic resource for wheat development.
- Researchers utilized advanced sequencing techniques to create a high-quality reference genome for a specific strain of Ae. tauschii, revealing a genome rich in repetitive sequences and dispersed duplicated genes.
- The study indicates that the unique repetitive structure of the Ae. tauschii genome contributes to rapid evolutionary changes in its chromosomes, affecting gene duplication and recombination rates.

First Draft Genome Sequence of the Pathogenic Fungus Lomentospora prolificans (Formerly Scedosporium prolificans).

Ruibang Luo, Aleksey Zimin, Rachael Workman, Yunfan Fan, Geo Pertea

G3 (Bethesda)

November 2017

Here we describe the sequencing and assembly of the pathogenic fungus Lomentospora prolificans using a combination of short, highly accurate Illumina reads and additional coverage in very long Oxford Nanopore reads. The resulting assembly is highly contiguous, containing a total of 37,627,092 bp with over 98% of the sequence in just 26 scaffolds. Annotation identified 8896 protein-coding genes.

The Douglas-Fir Genome Sequence Reveals Specialization of the Photosynthetic Apparatus in Pinaceae.

David B Neale, Patrick E McGuire, Nicholas C Wheeler, Kristian A Stevens, Marc W Crepeau, Geo M Pertea

G3 (Bethesda)

September 2017

A reference genome sequence for Pseudotsuga menziesii var. menziesii (Mirb.) Franco (Coastal Douglas-fir) is reported, thus providing a reference sequence for a third genus of the family Pinaceae. The contiguity and quality of the genome assembly far exceeds that of other conifer reference genome sequences (contig N50 = 44,136 bp and scaffold N50 = 340,704 bp).

Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown.

Mihaela Pertea, Daehwan Kim, Geo M Pertea, Jeffrey T Leek, Steven L Salzberg

Nat Protoc

September 2016

High-throughput sequencing of mRNA (RNA-seq) has become the standard method for measuring and comparing the levels of gene expression in a wide variety of species and conditions. RNA-seq experiments generate very large, complex data sets that demand fast, accurate and flexible software to reduce the raw read data to comprehensible results. HISAT (hierarchical indexing for spliced alignment of transcripts), StringTie and Ballgown are free, open-source software tools for comprehensive analysis of RNA-seq experiments.

Ballgown bridges the gap between transcriptome assembly and expression analysis.

Alyssa C Frazee, Geo Pertea, Andrew E Jaffe, Ben Langmead, Steven L Salzberg

Nat Biotechnol

March 2015

StringTie enables improved reconstruction of a transcriptome from RNA-seq reads.

Mihaela Pertea, Geo M Pertea, Corina M Antonescu, Tsung-Cheng Chang, Joshua T Mendell

Nat Biotechnol

March 2015

Methods used to sequence the transcriptome often produce more than 200 million short sequences. We introduce StringTie, a computational method that applies a network flow algorithm originally developed in optimization theory, together with optional de novo assembly, to assemble these complex data sets into transcripts. When used to analyze both simulated and real data sets, StringTie produces more complete and accurate reconstructions of genes and better estimates of expression levels, compared with other leading transcript assembly programs including Cufflinks, IsoLasso, Scripture and Traph.

TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions.

Daehwan Kim, Geo Pertea, Cole Trapnell, Harold Pimentel, Ryan Kelley

Genome Biol

April 2013

TopHat is a popular spliced aligner for RNA-sequence (RNA-seq) experiments. In this paper, we describe TopHat2, which incorporates many significant enhancements to TopHat. TopHat2 can align reads of various lengths produced by the latest sequencing technologies, while allowing for variable-length indels with respect to the reference genome.

Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks.

Cole Trapnell, Adam Roberts, Loyal Goff, Geo Pertea, Daehwan Kim

Nat Protoc

March 2012

Recent advances in high-throughput cDNA sequencing (RNA-seq) can reveal new genes and splice variants and quantify expression genome-wide in a single assay. The volume and complexity of data from RNA-seq experiments necessitate scalable, fast and mathematically principled analysis software. TopHat and Cufflinks are free, open-source software tools for gene discovery and comprehensive expression analysis of high-throughput mRNA sequencing (RNA-seq) data.

Detection of lineage-specific evolutionary changes among primate species.

Mihaela Pertea, Geo M Pertea, Steven L Salzberg

BMC Bioinformatics

July 2011

Background: Comparison of the human genome with other primates offers the opportunity to detect evolutionary events that created the diverse phenotypes among the primate species. Because the primate genomes are highly similar to one another, methods developed for analysis of more divergent species do not always detect signs of evolutionary selection.

Results: We have developed a new method, called DivE, specifically designed to find regions that have evolved either more or less rapidly than expected, for any clade within a set of very closely related species.

Multi-platform next-generation sequencing of the domestic turkey (Meleagris gallopavo): genome assembly and analysis.

Rami A Dalloul, Julie A Long, Aleksey V Zimin, Luqman Aslam, Kathryn Beal, Geo Pertea

PLoS Biol

September 2010

A synergistic combination of two next-generation sequencing platforms with a detailed comparative BAC physical contig map provided a cost-effective assembly of the genome sequence of the domestic turkey (Meleagris gallopavo). Heterozygosity of the sequenced source genome allowed discovery of more than 600,000 high quality single nucleotide variants. Despite this heterozygosity, the current genome assembly (∼1.

Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation.

Cole Trapnell, Brian A Williams, Geo Pertea, Ali Mortazavi, Gordon Kwan

Nat Biotechnol

May 2010

High-throughput mRNA sequencing (RNA-Seq) promises simultaneous transcript discovery and abundance estimation. However, this would require algorithms that are not restricted by prior gene annotations and that account for alternative transcription and splicing. Here we introduce such algorithms in an open-source software program called Cufflinks.

A whole-genome assembly of the domestic cow, Bos taurus.

Aleksey V Zimin, Arthur L Delcher, Liliana Florea, David R Kelley, Michael C Schatz, Geo Pertea

Genome Biol

September 2009

Background: The genome of the domestic cow, Bos taurus, was sequenced using a mixture of hierarchical and whole-genome shotgun sequencing methods.

Results: We have assembled the 35 million sequence reads and applied a variety of assembly improvement techniques, creating an assembly of 2.86 billion base pairs that has multiple improvements over previous assemblies: it is more complete, covering more of the genome; thousands of gaps have been closed; many erroneous inversions, deletions, and translocations have been corrected; and thousands of single-nucleotide errors have been corrected.
