Publications by authors named "Pertea G"

Article Synopsis
  • Maternal smoking during pregnancy has notable negative effects on the cognitive and behavioral development of offspring, with this study specifically comparing the impacts of nicotine and cigarette smoke on gene expression in developing brains.
  • Researchers found a significant number of differentially expressed genes (1,010 for nicotine and 4,165 for smoking) linked to prenatal exposure, indicating different neurodevelopmental pathways affected by each substance.
  • The findings suggest that while both prenatal nicotine exposure and maternal smoking have specific and overlapping effects on the developing brain, these effects are not replicated in the adult brain, highlighting developmental-stage sensitivity to smoke-related changes.

The molecular pathology of stress-related disorders remains elusive. Our brain multiregion, multiomic study of posttraumatic stress disorder (PTSD) and major depressive disorder (MDD) included the central nucleus of the amygdala, hippocampal dentate gyrus, and medial prefrontal cortex (mPFC). Genes and exons within the mPFC carried most disease signals replicated across two independent cohorts.

Article Synopsis
  • Ancestral differences in genomic variation influence gene expression, with most studies focusing on European samples or adjusting for ancestry, rather than specifically examining it.
  • This study explored how genetic ancestry impacts gene expression and DNA methylation in brain tissue from admixed Black American individuals, revealing ancestry-related genes primarily involved in immune response and vascular tissue rather than neurons.
  • The identified ancestry-associated differentially expressed genes (DEGs) contribute to heritability for various conditions like ischemic stroke, Parkinson's, and Alzheimer's, highlighting significant differences in gene expression based on genetic ancestry and its implications for brain-related illnesses.

Schizophrenia is a complex neuropsychiatric disorder with sexually dimorphic features, including differential symptomatology, drug responsiveness, and male incidence rate. Prior large-scale transcriptome analyses for sex differences in schizophrenia have focused on the prefrontal cortex. Analyzing BrainSeq Consortium data (caudate nucleus: n = 399, dorsolateral prefrontal cortex: n = 377, and hippocampus: n = 394), we identified 831 unique genes that exhibit sex differences across brain regions, enriched for immune-related pathways.

Article Synopsis
  • The study investigates the habenula (Hb) in relation to schizophrenia (SCZD) by examining its cell types and how their transcriptomic profiles differ in individuals with SCZD compared to healthy controls.
  • Researchers used advanced techniques like single nucleus RNA-sequencing and fluorescent hybridization to identify 17 distinct cell types in the human Hb and validated these findings.
  • They discovered 45 genes that are differentially expressed in the Hb of SCZD individuals, revealing significant genetic changes and providing new insights into the molecular basis of neuropsychiatric disorders.
Article Synopsis
  • Ancestry-associated differentially expressed genes (DEGs) are linked to immune response and vascular tissue, contributing to heritability for conditions like ischemic stroke, Parkinson's disease, and Alzheimer's disease, while showing less influence on psychiatric traits.
  • The study reveals that both genetic variation and environmental factors (like DNA methylation) shape gene expression differences across ancestry, impacting the risk of brain illnesses in diverse populations.

Short-read RNA sequencing and long-read RNA sequencing each have their strengths and weaknesses for transcriptome assembly. While short reads are highly accurate, they are rarely able to span multiple exons. Long-read technology can capture full-length transcripts, but its relatively high error rate often leads to mis-identified splice sites.


Summary: Although the ability to programmatically summarize and visually inspect sequencing data is an integral part of genome analysis, currently available methods are not capable of handling large numbers of samples. In particular, making a visual comparison of transcriptional landscapes between two sets of thousands of RNA-seq samples is limited by available computational resources, which can be overwhelmed due to the sheer size of the data. In this work, we present TieBrush, a software package designed to process very large sequencing datasets (RNA, whole-genome, exome, etc.


Long-read sequencing facilitates assembly of complex genomic regions. In plants, loci containing nucleotide-binding, leucine-rich repeat (NLR) disease resistance genes are an important example of such regions. NLR genes constitute one of the largest gene families in plants and are often clustered, evolving via duplication, contraction, and transposition.


RNA sequencing using the latest single-molecule sequencing instruments produces reads that are thousands of nucleotides long. The ability to assemble these long reads can greatly improve the sensitivity of long-read analyses. Here we present StringTie2, a reference-guided transcriptome assembler that works with both short and long reads.


We assembled the sequences from deep RNA sequencing experiments by the Genotype-Tissue Expression (GTEx) project, to create a new catalog of human genes and transcripts, called CHESS. The new database contains 42,611 genes, of which 20,352 are potentially protein-coding and 22,259 are noncoding, and a total of 323,258 transcripts. These include 224 novel protein-coding genes and 116,156 novel transcripts.


Standard antimicrobial susceptibility testing (AST) approaches lead to delays in the selection of optimal antimicrobial therapy. Here, we sought to determine the accuracy of antimicrobial resistance (AMR) determinants identified by Nanopore whole-genome sequencing in predicting AST results. Using a cohort of 40 clinical isolates (21 carbapenemase-producing carbapenem-resistant, 10 non-carbapenemase-producing carbapenem-resistant, and 9 carbapenem-susceptible isolates), three separate sequencing and analysis pipelines were performed, as follows: (i) a real-time Nanopore analysis approach identifying acquired AMR genes, (ii) an assembly-based Nanopore approach identifying acquired AMR genes and chromosomal mutations, and (iii) an approach using short-read correction of Nanopore assemblies.

Article Synopsis
  • Aegilops tauschii is the diploid ancestor of the D genome in hexaploid wheat and serves as a crucial genetic resource for wheat development.
  • Researchers utilized advanced sequencing techniques to create a high-quality reference genome for a specific strain of Ae. tauschii, revealing a genome rich in repetitive sequences and dispersed duplicated genes.
  • The study indicates that the unique repetitive structure of the Ae. tauschii genome contributes to rapid evolutionary changes in its chromosomes, affecting gene duplication and recombination rates.

Here we describe the sequencing and assembly of the pathogenic fungus using a combination of short, highly accurate Illumina reads and additional coverage in very long Oxford Nanopore reads. The resulting assembly is highly contiguous, containing a total of 37,627,092 bp with over 98% of the sequence in just 26 scaffolds. Annotation identified 8896 protein-coding genes.

Article Synopsis
  • A reference genome for Coastal Douglas-fir has been established, marking a significant advancement in Pinaceae family genomics.
  • The genome assembly features exceptional quality, with contig N50 at 44,136 bp and scaffold N50 at 340,704 bp, surpassing other conifer genomes, partly due to improved sequencing technologies and lower repeat content.
  • Comparative analysis shows distinct gene-family dynamics in Douglas-fir compared to angiosperms, revealing insights into traits like shade tolerance and contributing to our understanding of plant evolutionary differences.

High-throughput sequencing of mRNA (RNA-seq) has become the standard method for measuring and comparing the levels of gene expression in a wide variety of species and conditions. RNA-seq experiments generate very large, complex data sets that demand fast, accurate and flexible software to reduce the raw read data to comprehensible results. HISAT (hierarchical indexing for spliced alignment of transcripts), StringTie and Ballgown are free, open-source software tools for comprehensive analysis of RNA-seq experiments.


Methods used to sequence the transcriptome often produce more than 200 million short sequences. We introduce StringTie, a computational method that applies a network flow algorithm originally developed in optimization theory, together with optional de novo assembly, to assemble these complex data sets into transcripts. When used to analyze both simulated and real data sets, StringTie produces more complete and accurate reconstructions of genes and better estimates of expression levels, compared with other leading transcript assembly programs including Cufflinks, IsoLasso, Scripture and Traph.


TopHat is a popular spliced aligner for RNA-sequence (RNA-seq) experiments. In this paper, we describe TopHat2, which incorporates many significant enhancements to TopHat. TopHat2 can align reads of various lengths produced by the latest sequencing technologies, while allowing for variable-length indels with respect to the reference genome.


Recent advances in high-throughput cDNA sequencing (RNA-seq) can reveal new genes and splice variants and quantify expression genome-wide in a single assay. The volume and complexity of data from RNA-seq experiments necessitate scalable, fast and mathematically principled analysis software. TopHat and Cufflinks are free, open-source software tools for gene discovery and comprehensive expression analysis of high-throughput mRNA sequencing (RNA-seq) data.


Background: Comparison of the human genome with other primates offers the opportunity to detect evolutionary events that created the diverse phenotypes among the primate species. Because the primate genomes are highly similar to one another, methods developed for analysis of more divergent species do not always detect signs of evolutionary selection.

Results: We have developed a new method, called DivE, specifically designed to find regions that have evolved either more or less rapidly than expected, for any clade within a set of very closely related species.


A synergistic combination of two next-generation sequencing platforms with a detailed comparative BAC physical contig map provided a cost-effective assembly of the genome sequence of the domestic turkey (Meleagris gallopavo). Heterozygosity of the sequenced source genome allowed discovery of more than 600,000 high quality single nucleotide variants. Despite this heterozygosity, the current genome assembly (∼1.


High-throughput mRNA sequencing (RNA-Seq) promises simultaneous transcript discovery and abundance estimation. However, this would require algorithms that are not restricted by prior gene annotations and that account for alternative transcription and splicing. Here we introduce such algorithms in an open-source software program called Cufflinks.


Background: The genome of the domestic cow, Bos taurus, was sequenced using a mixture of hierarchical and whole-genome shotgun sequencing methods.

Results: We have assembled the 35 million sequence reads and applied a variety of assembly improvement techniques, creating an assembly of 2.86 billion base pairs that has multiple improvements over previous assemblies: it is more complete, covering more of the genome; thousands of gaps have been closed; many erroneous inversions, deletions, and translocations have been corrected; and thousands of single-nucleotide errors have been corrected.
