Background: The presence of homoeologous sequences and absence of a reference genome sequence make discovery and genotyping of single nucleotide polymorphisms (SNPs) more challenging in polyploid crops.

Results: To address this challenge, we constructed reduced representation libraries (RRLs) for two Brassica napus inbred lines and their 91 doubled haploid (DH) progeny using a modified ddRADseq technique. A bioinformatics pipeline termed RFAPtools was developed to discover and genotype SNPs and presence/absence variations (PAVs). Using this pipeline, a pseudo-reference sequence (PRF) containing 180,991 sequence tags was constructed. By aligning sequence reads to the pseudo-reference sequence, allelic SNPs as well as PAVs were identified and genotyped with RFAPtools. Two parallel linkage maps were constructed: an SNP bin map containing 8,780 SNP loci and a PAV linkage map containing 12,423 dominant loci. By aligning marker sequences to sequence scaffolds of B. rapa, for which a genome sequence is available, we assigned 44 unassembled sequence scaffolds comprising 8.15 Mb to the B. rapa chromosomes, and also identified 14 instances of misassembly and eight instances of mis-ordered sequence scaffolds.
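
The idea of scoring PAVs as dominant markers in a DH population can be illustrated with a minimal sketch. This is not the RFAPtools implementation: the function names `call_pav` and `segregation_test`, the `min_depth` threshold, and the example read counts are all hypothetical. The sketch assumes that a tag is called present in a line when enough reads align to it and absent when none do, and that a true marker should segregate roughly 1:1 among DH lines derived from two inbred parents, which is checked with a chi-square goodness-of-fit test (1 degree of freedom).

```python
import math


def call_pav(read_counts, min_depth=3):
    """Call presence (1) or absence (0) of a sequence tag in each line.

    Lines with some reads but fewer than min_depth (a hypothetical
    threshold) are treated as ambiguous and returned as None.
    """
    calls = []
    for depth in read_counts:
        if depth >= min_depth:
            calls.append(1)
        elif depth == 0:
            calls.append(0)
        else:
            calls.append(None)
    return calls


def segregation_test(calls):
    """Chi-square test of the calls against a 1:1 presence:absence ratio.

    Returns (chi2, p_value). For 1 degree of freedom the chi-square
    survival function equals erfc(sqrt(chi2 / 2)).
    """
    present = sum(1 for c in calls if c == 1)
    absent = sum(1 for c in calls if c == 0)
    n = present + absent
    if n == 0:
        raise ValueError("no informative calls")
    expected = n / 2
    chi2 = ((present - expected) ** 2 + (absent - expected) ** 2) / expected
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical read counts for one tag across ten DH lines.
    counts = [12, 0, 8, 0, 1, 15, 0, 9, 0, 7]
    calls = call_pav(counts)
    chi2, p = segregation_test(calls)
    print(calls, round(chi2, 3), round(p, 3))
```

Markers with a very small p-value would be flagged as distorted. In a polyploid such as B. napus, distortion may also indicate reads from homoeologous copies collapsing onto one tag, so a filter of this kind is only a first pass.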

Conclusions: These results indicate that the modified ddRADseq approach is a simple, cost-effective method for genotyping tens of thousands of SNP and PAV markers in a polyploid plant species. The results also demonstrate that RFAPtools, developed in this study, is effective at mining allelic SNPs from homoeologous sequences in polyploids; it should therefore be generally applicable to diploid or polyploid species, with or without a reference genome sequence.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3665465
DOI: http://dx.doi.org/10.1186/1471-2164-14-346

Similar Publications

Copy number variants (CNVs) are prevalent in both diploid and haploid genomes, with the latter containing a single copy of each gene. Studying CNVs in genomes from single or few cells is significantly advancing our knowledge of human disorders and disease susceptibility. Low-input sequencing data, including low-cell and single-cell data, for haploid and diploid organisms generally display shallow and highly non-uniform read counts because whole genome amplification introduces amplification biases.

Our goal was to assess the accuracy of next generation sequencing (NGS) compared with Sanger sequencing. We performed single genome amplification (SGA) of HIV-1 on extracted tissue DNA from two HIV+ individuals. Amplicons (n = 30) were sequenced with Sanger or reamplified with barcoded primers and pooled before sequencing using Oxford Nanopore Technologies (ONT) and Pacific Biosciences (PB).

Improving high-resolution copy number variation analysis from next generation sequencing using unique molecular identifiers.

BMC Bioinformatics

March 2021

INSERM U1245, Team Genomics and Biomarkers of Lymphoma and Solid Tumors, Normandie Univ, UNIROUEN, Rouen, France.

Article Synopsis
  • Copy number variations (CNV) are crucial genetic alterations linked to cancer, affecting oncogenes and tumor suppressors; new sequencing techniques using unique molecular identifiers (UMI) enhance CNV detection.
  • The study introduces a novel method called molecular Copy Number Alteration (mCNA), which employs UMI and a four-step algorithm to accurately identify copy number changes in cancer samples.
  • mCNA has shown strong correlation with existing genomic methods and is made publicly available, promising improved accuracy in detecting CNV changes in cancer research.

Background: The emergence of third generation sequencing technology, featuring longer read lengths, represents a great advance over next generation sequencing technology and has greatly promoted biological research. However, third generation sequencing data have high sequencing error rates, which inevitably affect downstream analysis. Although sequencing error rates have been improving in recent years, large amounts of data were produced at high error rates, and discarding them would be a huge waste.

Genotyping by sequencing for SNP marker development in onion.

Genome

December 2020

Vegetable Crop Research Unit, United States Department of Agriculture, Agricultural Research Service, 1575 Linden Drive, Madison, WI 53706, USA.

Onion (Allium cepa) is not highly tractable for development of molecular markers due to its large (16 gigabases per 1C) nuclear genome. Single nucleotide polymorphisms (SNPs) are useful for genetic characterization and marker-aided selection of onion because of codominance and common occurrence in elite germplasm. We completed genotyping by sequencing (GBS) to identify SNPs in onion using 46 F plants, parents of the F plants (Ailsa Craig 43 and Brigham Yellow Globe 15-23), two doubled haploid (DH) lines (DH2107 and DH2110), and plants from 94 accessions in the USDA National Plant Germplasm System (NPGS).
