A data-driven genome annotation approach for cassava.

Plant J

Department of Plant and Environmental Sciences, Copenhagen Plant Science Centre, University of Copenhagen, Thorvaldsensvej 40, Frederiksberg C, 1871, Denmark.

Published: August 2024

Genome annotation files play a critical role in determining the quality of downstream analyses by providing essential predictions of gene positions and structures. These files are pivotal in decoding the complex information encoded within DNA sequences. Here, we generated experimental data resolving RNA 5'- and 3'-ends as well as full-length RNAs for cassava TME12 sticklings at ambient temperature and in the cold. We used these data to generate genome annotation files using the TranscriptomeReconstructoR (TR) tool. A careful comparison to high-quality genome annotations suggests that our new TR genome annotations identified additional genes, resolved transcript boundaries more accurately and identified additional RNA isoforms. We enhanced existing cassava genome annotation files with information from TR in a way that maintained the different transcript models as RNA isoforms. The resulting merged annotation was then used for comprehensive analysis. To examine the effects of genome annotation files on gene expression studies, we compared the detection of differentially expressed genes during cold using the same RNA-seq data but alternative genome annotation files. We found that our merged genome annotation, which included cold-specific TR gene models, identified about twice as many cold-induced genes. These data indicate that environmentally induced genes may be missing in off-the-shelf genome annotation files. In conclusion, TR offers the opportunity to enhance crop genome annotations, with implications for the discovery of differentially expressed candidate genes during plant-environment interactions.
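
The idea of merging an existing annotation with new transcript models while keeping every model as a separate isoform can be illustrated with a minimal sketch. This is not the authors' pipeline and does not use the TranscriptomeReconstructoR API; the abstract does not describe the actual merging procedure. The `Transcript` class, the `merge_annotations` function, the overlap rule (same chromosome and strand, overlapping coordinates) and the toy coordinates are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transcript:
    """Hypothetical minimal transcript model (illustration only)."""
    tx_id: str
    chrom: str
    strand: str
    start: int   # 1-based, inclusive
    end: int     # inclusive
    source: str  # "reference" or "TR"


def merge_annotations(reference, tr_models):
    """Group transcripts into loci by same-strand coordinate overlap.

    Every input model is kept, so differing transcript models at one
    locus are retained as isoforms rather than collapsed (assumed rule).
    """
    ordered = sorted(reference + tr_models,
                     key=lambda t: (t.chrom, t.strand, t.start, t.end))
    loci = []
    locus_end = None
    for tx in ordered:
        last = loci[-1] if loci else None
        same_locus = (
            last is not None
            and last[0].chrom == tx.chrom
            and last[0].strand == tx.strand
            and tx.start <= locus_end
        )
        if same_locus:
            last.append(tx)
            locus_end = max(locus_end, tx.end)
        else:
            loci.append([tx])
            locus_end = tx.end
    return loci


# Toy data (invented coordinates, not from the study).
reference = [Transcript("ref1.1", "chr1", "+", 100, 500, "reference")]
tr_models = [
    Transcript("tr1", "chr1", "+", 90, 520, "TR"),     # extends ref1.1 boundaries
    Transcript("tr2", "chr1", "+", 2000, 2600, "TR"),  # no reference overlap
]

loci = merge_annotations(reference, tr_models)

# Loci supported only by TR models are candidate additional genes.
novel_loci = [locus for locus in loci if all(t.source == "TR" for t in locus)]
```

In this toy case, `tr1` joins the existing locus as an extra isoform with different boundaries, while `tr2` forms a locus of its own. Under the stated assumptions, loci of the second kind are the ones a differential expression analysis would miss if only the reference annotation were used.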

Source
http://dx.doi.org/10.1111/tpj.16856
