Genotyping by sequencing for genomic prediction in a soybean breeding population.

Diego Jarquín, Kyle Kocak, Luis Posadas, Katie Hyma, Joseph Jedlicka, George Graef, Aaron Lorenz

BMC Genomics

Department of Agronomy and Horticulture, University of Nebraska, 363 Keim Hall, Lincoln, NE 68583, USA.

Published: August 2014

Advances in genotyping technology, particularly genotyping by sequencing (GBS), could make genomic prediction more attractive in soybean breeding by reducing breeding cycle times and phenotyping costs.
The study genotyped 301 elite soybean breeding lines, scoring 16,502 single nucleotide polymorphisms (SNPs) with minor-allele frequency (MAF) > 0.05 and ≤ 5% missing values, and estimated a cross-validated prediction accuracy of 0.64 for grain yield, indicating good potential for genomic selection in soybean.
Filtering SNPs on missing-data percentage had little to no effect on prediction accuracy; random forest imputation on all SNPs gave the highest accuracies, although the differences were not significant. Accuracy rose with training population size, and restricting to SNPs with MAF > 0.30 gave higher accuracies when training populations were smaller.

Background: Advances in genotyping technology, such as genotyping by sequencing (GBS), are making genomic prediction more attractive to reduce breeding cycle times and costs associated with phenotyping. Genomic prediction and selection have been studied in several crop species, but no reports exist in soybean. The objectives of this study were to (i) evaluate prospects for genomic selection using GBS in a typical soybean breeding program and (ii) evaluate the effect of GBS marker selection and imputation on genomic prediction accuracy. To achieve these objectives, a set of soybean lines sampled from the University of Nebraska Soybean Breeding Program was genotyped using GBS and evaluated for yield and other agronomic traits at multiple Nebraska locations.

Results: Genotyping by sequencing scored 16,502 single nucleotide polymorphisms (SNPs) with minor-allele frequency (MAF) > 0.05 and percentage of missing values ≤ 5% on 301 elite soybean breeding lines. When SNPs with up to 80% missing values were included, 52,349 SNPs were scored. Prediction accuracy for grain yield, assessed using cross validation, was estimated to be 0.64, indicating good potential for using genomic selection for grain yield in soybean. Filtering SNPs based on missing data percentage had little to no effect on prediction accuracy, especially when random forest imputation was used to impute missing values. The highest accuracies were observed when random forest imputation was used on all SNPs, but differences were not significant. A standard additive G-BLUP model was robust; modeling additive-by-additive epistasis did not provide any improvement in prediction accuracy. The effect of training population size on accuracy began to plateau around 100, but accuracy steadily climbed until the largest possible size was used in this analysis. Including only SNPs with MAF > 0.30 provided higher accuracies when training populations were smaller.
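
The two SNP filters named above (MAF and percentage of missing values) can be illustrated with a minimal sketch. It assumes genotypes coded as 0/1/2 copies of one allele with NaN for missing calls; the function name `filter_snps`, the coding, and the toy matrix are illustrative assumptions and are not taken from the study's pipeline.

```python
import numpy as np

def filter_snps(geno, min_maf=0.05, max_missing=0.05):
    """Return a boolean mask of SNP columns passing MAF and missing-rate filters.

    geno: (lines x SNPs) array coded 0/1/2, with NaN marking missing calls
    (an assumed coding, chosen for this illustration).
    """
    geno = np.asarray(geno, dtype=float)
    missing_rate = np.isnan(geno).mean(axis=0)      # fraction of missing calls per SNP
    called = (~np.isnan(geno)).sum(axis=0)          # number of non-missing calls per SNP
    with np.errstate(invalid="ignore", divide="ignore"):
        freq = np.nansum(geno, axis=0) / (2.0 * called)  # allele frequency among called lines
        maf = np.minimum(freq, 1.0 - freq)               # minor-allele frequency
        keep = (maf > min_maf) & (missing_rate <= max_missing)
    return keep

# Toy matrix: 4 lines x 4 SNPs (made-up values, for illustration only).
toy = np.array([
    [0, 2, 0, np.nan],
    [2, 2, 0, np.nan],
    [0, 2, 2, 0],
    [2, 2, np.nan, 2],
])
strict = filter_snps(toy, min_maf=0.05, max_missing=0.05)  # thresholds quoted in the abstract
relaxed = filter_snps(toy, min_maf=0.05, max_missing=0.80)
print(strict, relaxed)
```

The mask can then be applied as `toy[:, strict]`. Relaxing `max_missing` from 0.05 to 0.80 mirrors the two missing-data thresholds reported in the abstract; SNPs with no calls at all are always dropped because their MAF is undefined.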

Conclusions: Using GBS for genomic prediction in soybean holds good potential to expedite genetic gain. Our results suggest that standard additive G-BLUP models can be used on unfiltered, imputed GBS data without loss in accuracy.
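
As a rough illustration of what an additive G-BLUP model with cross-validated accuracy looks like, the sketch below builds a VanRaden-style genomic relationship matrix and predicts held-out lines. Several pieces are assumptions made for this example and not the study's procedure: missing calls are filled by simple column-mean imputation (not random forest imputation), the shrinkage parameter `lam` is fixed by hand where a real analysis would estimate variance components, and the genotypes and phenotypes are simulated.

```python
import numpy as np

def mean_impute(geno):
    """Replace NaN calls with the column mean (a simple stand-in, not random forest)."""
    geno = np.array(geno, dtype=float)
    col_means = np.nanmean(geno, axis=0)
    rows, cols = np.where(np.isnan(geno))
    geno[rows, cols] = col_means[cols]
    return geno

def vanraden_grm(geno):
    """Additive genomic relationship matrix from 0/1/2 genotype codes."""
    p = geno.mean(axis=0) / 2.0                 # allele frequency per SNP
    Z = geno - 2.0 * p                          # center each SNP column
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))

def gblup_predict(G, y, train, test, lam=1.0):
    """BLUP of held-out lines; lam is an assumed residual/genetic variance ratio."""
    mu = y[train].mean()
    K = G[np.ix_(train, train)] + lam * np.eye(len(train))
    alpha = np.linalg.solve(K, y[train] - mu)
    return mu + G[np.ix_(test, train)] @ alpha

def cv_accuracy(G, y, k=5, lam=1.0, seed=0):
    """k-fold cross validation; accuracy = correlation of predicted vs observed."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    pred = np.empty(len(y))
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)         # all lines not in the held-out fold
        pred[fold] = gblup_predict(G, y, train, fold, lam)
    return np.corrcoef(pred, y)[0, 1]

# Simulated data only: 120 lines, 500 SNPs, 10% of calls masked as missing.
rng = np.random.default_rng(1)
true_geno = rng.binomial(2, 0.3, size=(120, 500)).astype(float)
y = true_geno @ rng.normal(0.0, 1.0, 500) + rng.normal(0.0, 10.0, 120)
observed = true_geno.copy()
observed[rng.random(observed.shape) < 0.10] = np.nan

G = vanraden_grm(mean_impute(observed))
print("cross-validated accuracy on simulated data:", cv_accuracy(G, y))
```

The relationship matrix is computed from all lines, including the held-out ones, because only genotypes (not phenotypes) of the held-out lines are used. The accuracy printed here reflects the simulation settings and says nothing about the values reported in the paper.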

PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4176594
DOI: http://dx.doi.org/10.1186/1471-2164-15-740
