AI Article Synopsis

  • The study optimizes a bioinformatics pipeline for annotating complex fungal genomes using RNA-seq data, focusing on pathogenic yeasts Cryptococcus neoformans and Cryptococcus deneoformans.
  • The quality of the annotation is heavily influenced by the quantity of RNA-seq reads, with optimal results achieved at 5-10 million reads per replicate; the number of predicted introns serves as an effective indicator of annotation quality.
  • Dynamic transcriptome analysis of the RNAi-deficient species, Cryptococcus deuterogattii, shows significant intron retention compared to its RNAi-proficient counterparts, while gene content analysis reveals the loss of key transcription factors and potential adaptive evolution in metabolite assimilation.

Article Abstract

Evaluating the quality of a de novo annotation of a complex fungal genome based on RNA-seq data remains a challenge. In this study, we sequentially optimized a Cufflinks-CodingQuarry-based bioinformatics pipeline fed with RNA-seq data using the manually annotated model pathogenic yeasts Cryptococcus neoformans and Cryptococcus deneoformans as test cases. Our results show that the quality of the annotation is sensitive to the quantity of RNA-seq data used and that the best quality is obtained with 5-10 million reads per RNA-seq replicate. We also showed that the number of introns predicted is an excellent a priori indicator of the quality of the final de novo annotation. We then used this pipeline to annotate the genome of the RNAi-deficient species Cryptococcus deuterogattii strain R265 using RNA-seq data. Dynamic transcriptome analysis revealed that intron retention is more prominent in C. deuterogattii than in the RNAi-proficient species C. neoformans and C. deneoformans. In contrast, we observed that antisense transcription was not higher in C. deuterogattii than in the two other Cryptococcus species. Comparative gene content analysis identified 21 clusters enriched in transcription factors and transporters that have been lost. Interestingly, analysis of the subtelomeric regions in these three annotated species identified a similar gene enrichment, reminiscent of the structure of primary metabolic clusters. Our data suggest that there is active exchange between subtelomeric regions, and that other chromosomal regions might participate in adaptive diversification of Cryptococcus metabolite assimilation potential.
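
The abstract reports that the number of predicted introns is a good a priori indicator of annotation quality. As a rough illustration of how such a count could be derived from an annotation file, the sketch below infers introns from the gaps between consecutive exons of each transcript in GFF3-style records. It is not the authors' pipeline: the function name `count_introns`, the toy records, and the assumption that every exon line carries a `Parent` attribute are all illustrative.

```python
from collections import defaultdict


def count_introns(gff_lines):
    """Count unique introns implied by exon features in GFF3-style lines.

    Assumption: exon records are tab-separated with 9 columns and name
    their transcript(s) in a `Parent=` attribute.
    """
    exons = defaultdict(list)  # transcript id -> [(seqid, strand, start, end)]
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments/directives
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        seqid, strand = fields[0], fields[6]
        start, end = int(fields[3]), int(fields[4])
        attrs = dict(kv.split("=", 1) for kv in fields[8].split(";") if "=" in kv)
        for parent in attrs.get("Parent", "").split(","):
            if parent:
                exons[parent].append((seqid, strand, start, end))

    introns = set()  # a set, so introns shared between isoforms count once
    for parts in exons.values():
        parts.sort(key=lambda exon: exon[2])  # order exons by start coordinate
        for left, right in zip(parts, parts[1:]):
            if right[2] - left[3] > 1:  # a gap between exons implies an intron
                introns.add((left[0], left[1], left[3] + 1, right[2] - 1))
    return len(introns)


# Hypothetical toy annotation: two isoforms sharing their first intron.
TOY_GFF = [
    "##gff-version 3",
    "chr1\ttoy\texon\t1\t100\t.\t+\t.\tID=e1;Parent=t1",
    "chr1\ttoy\texon\t201\t300\t.\t+\t.\tID=e2;Parent=t1",
    "chr1\ttoy\texon\t401\t500\t.\t+\t.\tID=e3;Parent=t1",
    "chr1\ttoy\texon\t1\t100\t.\t+\t.\tID=e4;Parent=t2",
    "chr1\ttoy\texon\t201\t350\t.\t+\t.\tID=e5;Parent=t2",
]

print(count_introns(TOY_GFF))  # 2
```

Comparing this count across annotations built from different read depths would be one simple way to track the trend the abstract describes, although the thresholds that signal a good annotation are not given here and would have to come from the full paper.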

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8022950
DOI: http://dx.doi.org/10.1093/g3journal/jkaa070

Publication Analysis

Top Keywords

rna-seq data: 16
annotation pipeline: 8
cryptococcus deuterogattii: 8
primary metabolic: 8
novo annotation: 8
subtelomeric regions: 8
cryptococcus: 6
rna-seq: 5
data: 5
application optimized: 4

Similar Publications

Background: Bioinformatics is fundamental to biomedical sciences, but its mastery presents a steep learning curve for bench biologists and clinicians. Learning to code while analyzing data is difficult. The curve may be flattened by separating these two aspects and providing intermediate steps for budding bioinformaticians.

Objective: Ovarian cancer significantly impacts women's reproductive health and remains challenging to diagnose and treat. Despite advancements in understanding DNA repair mechanisms and identifying novel therapeutic targets, additional strategies are still needed. Recently, a novel form of cell death called disulfidptosis, which is triggered by glucose deprivation, has been linked to treatment resistance and changes in the tumor microenvironment (TME).

SEMdag: Fast learning of Directed Acyclic Graphs via node or layer ordering.

PLoS One

January 2025

Department of Brain and Behavioral Sciences, University of Pavia, Pavia, Italy.

A Directed Acyclic Graph (DAG) offers an easy approach to define causal structures among gathered nodes: causal linkages are represented by arrows between the variables, leading from cause to effect. Recently, industry and academics have paid close attention to DAG structure learning from observable data, and many techniques have been put out to address the problem. We provide a two-step approach, named SEMdag(), that can be used to quickly learn high-dimensional linear SEMs.

The complex gene regulatory landscape underlying early flower development in Arabidopsis has been extensively studied through transcriptome profiling, and gene networks controlling floral organ development have been derived from the analyses of genome wide binding of key transcription factors. In contrast, the dynamic nature of the proteome during the flower development process is much less understood. In this study, we characterized the floral proteome at different stages during early flower development and correlated it with unbiased transcript expression data.

RNA tomography computationally reconstructs 3D spatial gene expression patterns genome-widely from 1D tomo-seq data, generated by RNA sequencing of cryosection samples along three orthogonal axes. We developed tomoseqr, an R package designed for RNA tomography analysis of tomo-seq data, to reconstruct and visualize 3D gene expression patterns through user-friendly graphical interfaces. We show the effectiveness of tomoseqr using simulated and real tomo-seq data, validating its utility for researchers.
