Systematic analysis of 1298 RNA-Seq samples and construction of a comprehensive soybean (Glycine max) expression atlas.

Plant J

Laboratório de Química e Função de Proteínas e Peptídeos, Centro de Biociências e Biotecnologia, Universidade Estadual do Norte Fluminense Darcy Ribeiro, Campos dos Goytacazes, Brazil.

Published: August 2020

AI Article Synopsis

  • Soybean (Glycine max) is a major crop for animal feed and human nutrition because of its high protein and oil content, and the recent surge in soybean transcriptome studies has produced a large body of public RNA-seq data.
  • Researchers processed 1298 public soybean transcriptome samples and found that 94% of the annotated genes had detectable expression in at least one sample; unsupervised clustering separated the samples into three major groups (aerial, underground and seed/seed-related parts).
  • The resulting dataset, which includes 452 candidate housekeeping genes, 1349 genes with strongly tissue-biased expression and 3256 potentially novel splicing isoforms, can be downloaded or browsed through an online atlas.

Article Abstract

Soybean (Glycine max [L.] Merr.) is a major crop in animal feed and human nutrition, mainly for its rich protein and oil contents. The remarkable rise in soybean transcriptome studies over the past 5 years has generated an enormous amount of RNA-seq data, encompassing various tissues, developmental conditions and genotypes. In this study, we collected data from 1298 publicly available soybean transcriptome samples, processed the raw sequencing reads and mapped them to the soybean reference genome in a systematic fashion. We found that 94% of the annotated genes (52 737/56 044) had detectable expression in at least one sample. Unsupervised clustering revealed three major groups, comprising samples from aerial, underground and seed/seed-related parts. We found 452 genes with uniform and constant expression levels, supporting their roles as housekeeping genes. On the other hand, 1349 genes showed heavily biased expression patterns towards particular tissues. A transcript-level analysis revealed that 95% (70 963 of 74 490) of the assembled transcripts have intron chains exactly matching those from known transcripts, whereas 3256 assembled transcripts represent potentially novel splicing isoforms. The dataset compiled here constitutes a new resource for the community, which can be downloaded or accessed through a user-friendly web interface at http://venanciogroup.uenf.br/resources/. This comprehensive transcriptome atlas will likely accelerate research on soybean genetics and genomics.
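
The abstract names three gene-level summaries: detection in at least one sample, uniform (housekeeping-like) expression, and tissue-biased expression. It does not state the thresholds or statistics the authors used. The sketch below is therefore only one plausible way to compute such summaries, not the authors' pipeline. The toy expression values, the detection threshold of 1.0, the coefficient of variation as a uniformity measure and the Tau specificity index as a bias measure are all assumptions made for illustration.

```python
from statistics import mean, pstdev

# Hypothetical toy matrix: gene -> expression value per sample.
# These numbers are invented for illustration, not taken from the atlas.
expression = {
    "geneA": [50.0, 52.0, 49.0, 51.0],  # roughly constant across samples
    "geneB": [0.0, 0.0, 120.0, 0.5],    # concentrated in one sample
    "geneC": [0.0, 0.0, 0.0, 0.0],      # never detected
}


def is_expressed(values, threshold=1.0):
    """True if the gene reaches the (assumed) threshold in at least one sample."""
    return any(v >= threshold for v in values)


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean; low means uniform."""
    m = mean(values)
    return pstdev(values) / m if m > 0 else float("inf")


def tau(values):
    """Tau specificity index: near 0 for uniform, near 1 for single-sample expression."""
    top = max(values)
    if top == 0 or len(values) < 2:
        return 0.0
    return sum(1 - v / top for v in values) / (len(values) - 1)


expressed = [g for g, v in expression.items() if is_expressed(v)]
fraction_expressed = len(expressed) / len(expression)

# Arithmetic check of the percentages reported in the abstract.
pct_genes_detected = round(52737 / 56044 * 100)     # 94
pct_known_intron_chains = round(70963 / 74490 * 100)  # 95

for gene in expressed:
    values = expression[gene]
    print(gene, round(coefficient_of_variation(values), 3), round(tau(values), 3))
```

On real data, both measures are usually computed on per-tissue averages of normalized expression rather than on raw per-sample values, and the cutoffs for calling a gene "housekeeping" or "tissue-biased" would need to be chosen and justified separately.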

Source
http://dx.doi.org/10.1111/tpj.14850
