Genotyping complex multigene families in novel systems is particularly challenging. Target primers frequently amplify multiple loci simultaneously, leading to high rates of PCR and sequencing artefacts such as chimeras and allele amplification bias. Most genotyping pipelines have been validated in nonmodel systems, in which the true genotype is unknown and the generation of artefacts may be highly repeatable. Further hindering accurate genotyping, the relationship between artefacts and genotype complexity (i.e. the number of alleles per genotype) within a PCR remains poorly described. Here, we investigated this relationship by experimentally combining multiple known major histocompatibility complex (MHC) haplotypes of a model organism (chicken, Gallus gallus; 43 artificial genotypes with 2-13 alleles per amplicon). In addition to well-defined 'optimal' primers, we simulated a nonmodel species situation by designing 'cross-species' primers based on sequence data from closely related Galliform species. We applied a novel open-source genotyping pipeline (ACACIA; https://gitlab.com/psc_santos/ACACIA) and compared its performance with that of a previously published pipeline (AmpliSAS). Allele calling accuracy was higher with ACACIA than with AmpliSAS (98.5% versus 97% for the 'optimal' data set and 77.8% versus 75% for the 'cross-species' data set). Systematic dropout of three alleles owing to primer mismatch in the 'cross-species' data set explained the high allele calling repeatability (100% when using ACACIA) despite low accuracy, demonstrating that repeatability can be misleading when evaluating genotyping workflows. Genotype complexity was positively associated with nonchimeric artefacts, chimeric artefacts (nonlinearly, levelling off when more than 4-6 alleles were amplified) and allele amplification bias. Our study illustrates pitfalls that researchers should avoid in order to genotype complex multigene families reliably.
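
The point that repeatability can be high while accuracy is low can be illustrated with a minimal sketch. The metric definitions below (accuracy as the overlap between called and expected alleles, repeatability as the share of replicate pairs with identical calls) and the allele names are assumptions made for illustration; they are not the definitions used in the study or by ACACIA or AmpliSAS.

```python
from itertools import combinations


def accuracy(called, expected):
    """Hypothetical accuracy: shared alleles divided by all alleles seen in either set."""
    called, expected = set(called), set(expected)
    return len(called & expected) / len(called | expected)


def repeatability(replicates):
    """Hypothetical repeatability: fraction of replicate pairs with identical allele calls."""
    pairs = list(combinations(replicates, 2))
    identical = sum(1 for a, b in pairs if set(a) == set(b))
    return identical / len(pairs)


# Invented example: the true genotype carries five alleles, but a primer
# mismatch makes A4 and A5 drop out in every replicate.
expected = {"A1", "A2", "A3", "A4", "A5"}
replicate_1 = {"A1", "A2", "A3"}
replicate_2 = {"A1", "A2", "A3"}

rep = repeatability([replicate_1, replicate_2])
acc = accuracy(replicate_1, expected)
print(f"repeatability={rep:.2f}, accuracy={acc:.2f}")
# prints: repeatability=1.00, accuracy=0.60
```

Because the dropout is systematic, both replicates agree perfectly with each other while each misses the same alleles, which is why a known reference genotype is needed to detect the error.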

Source
http://dx.doi.org/10.1111/1755-0998.13290


