Background: The continuing increase in the volume and quality of raw "short read" data substantially improves the assemblies produced by bioinformatics tools. However, building a reference genome sequence for most plant species remains a significant challenge because the large number of repeated sequences hampers high-quality whole-genome de novo assembly. Furthermore, most SNP identification approaches in plant genetics and breeding consider only the "Gene-space" regions, which comprise the promoter, exon and intron sequences.
Results: We developed the iPEA protocol to produce a de novo Gene-space assembly by iteratively reconstructing the non-coding sequence flanking each Unigene cDNA sequence through the addition of next-generation DNA-seq data. The approach was developed on the large diploid genome of pea (Pisum sativum L.), which is rich in repetitive sequences. The final Gene-space assembly included 35,400 contigs (97 Mb), covering 88% of the 40,227 contigs (53.1 Mb) of the PsCam_low-copy Unigene set. Its accuracy was validated by the results obtained with the GenoPea 13.2 K SNP Array.
Conclusion: The iPEA protocol allows the reconstruction of a Gene-space from RNA-Seq and DNA-seq data with limited computing resources.
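The iterative idea described in the Results can be illustrated with a toy sketch. It is not the iPEA implementation: the abstract does not name the mapping or assembly tools used, so this example assumes exact string overlaps between error-free reads and a seed sequence, and the names `extend_once`, `iterative_extension`, `min_overlap` and `max_rounds` are hypothetical. It only shows how a cDNA-like seed could grow into its flanking sequence round by round as overlapping genomic reads are added.

```python
def extend_once(contig, reads, min_overlap):
    """Extend the contig on each side by the read that adds the most bases.

    Assumption: reads are error-free and on the same strand as the contig,
    so an exact suffix/prefix match is enough to recruit a read.
    """
    if min_overlap < 1:
        raise ValueError("min_overlap must be at least 1")
    best_left, best_right = "", ""
    for read in reads:
        longest = min(len(read) - 1, len(contig))
        # Right side: the end of the contig matches the start of the read.
        for k in range(longest, min_overlap - 1, -1):
            if contig.endswith(read[:k]):
                if len(read) - k > len(best_right):
                    best_right = read[k:]
                break
        # Left side: the start of the contig matches the end of the read.
        for k in range(longest, min_overlap - 1, -1):
            if contig.startswith(read[-k:]):
                if len(read) - k > len(best_left):
                    best_left = read[:-k]
                break
    return best_left + contig + best_right


def iterative_extension(seed, reads, min_overlap=4, max_rounds=10):
    """Repeat the extension step until the contig stops growing."""
    contig = seed
    for _ in range(max_rounds):
        extended = extend_once(contig, reads, min_overlap)
        if extended == contig:
            break  # no read overlaps either end any more
        contig = extended
    return contig


# Made-up toy data: a short "genome", a seed taken from its middle
# (standing in for a Unigene cDNA), and four overlapping reads.
genome = "TTGACCGTAGGCTAACGTTCAGGATCCA"
seed = genome[9:19]
reads = [genome[4:14], genome[15:25], genome[0:9], genome[20:28]]
print(iterative_extension(seed, reads))
```

In this toy case the first round recruits the two reads that overlap the seed, and the second round recruits reads that only overlap the newly added flanks, which is the point of iterating. Real data would also need handling of sequencing errors, reverse-complement reads and repeats, all of which this sketch ignores.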
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4750290 | PMC |
| http://dx.doi.org/10.1186/s13104-016-1903-z | DOI Listing |
BMC Res Notes
February 2016
INRA Institut National de la Recherche Agronomique, US1279 Etude du Polymorphisme des génomes Végétaux, CEA-IG/CNG Centre National de Génotypage, 2 rue Gaston Crémieux, 91057, Evry, France.