Towards accurate, contiguous and complete alignment-based polyploid phasing algorithms.

Genomics

Université de Strasbourg, CNRS, GMGM UMR 7156, Strasbourg, France; Institut Universitaire de France (IUF), Paris, France.

Published: May 2022

Phasing, and polyploid phasing in particular, has been a challenging problem, held back by the limited read length of high-throughput short-read sequencing methods, which cannot span the distance between heterozygous sites, and by the labor and high cost of alternative methods such as the physical separation of chromosomes. Recently developed single-molecule long-read sequencing methods provide much longer reads, which overcome this limitation. Here we review the alignment-based methods of polyploid phasing, which rely on four main strategies: population inference methods, which leverage the genetic information of several individuals to phase a sample; objective function minimization methods, which minimize a function such as the Minimum Error Correction (MEC); graph partitioning methods, which represent the read data as a graph and split it into k haplotype subgraphs; and cluster building methods, which iteratively grow clusters of similar reads into a final set of clusters that represent the haplotypes. We discuss the advantages and limitations of these methods and the metrics used to assess their performance, proposing that accuracy and contiguity are the most meaningful metrics. Finally, we propose that the field of alignment-based polyploid phasing would greatly benefit from a well-designed benchmarking dataset with appropriate evaluation metrics. We consider that significant improvements can still be achieved to obtain more accurate and contiguous polyploid phasing results that reflect the complexity of polyploid genome architectures.
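To make the objective function named in the abstract concrete, the following is a minimal sketch of how a Minimum Error Correction (MEC) score can be computed for a candidate set of k haplotypes. It is an illustration under stated assumptions, not the method of any reviewed tool: variants are taken to be biallelic (alleles 0 and 1), each read is represented as a mapping from variant-site index to observed allele, and the names `read_cost` and `mec_score` are hypothetical.

```python
from typing import Dict, List, Sequence

# Assumption: a read is a mapping {variant site index -> observed allele (0 or 1)}.
Read = Dict[int, int]


def read_cost(read: Read, haplotype: Sequence[int]) -> int:
    """Count the covered sites where the read disagrees with the haplotype."""
    return sum(1 for site, allele in read.items() if haplotype[site] != allele)


def mec_score(reads: List[Read], haplotypes: List[Sequence[int]]) -> int:
    """Sum, over all reads, the mismatches to each read's best-matching haplotype.

    This is the number of allele calls that would have to be corrected for
    every read to be consistent with one of the candidate haplotypes.
    """
    return sum(min(read_cost(r, h) for h in haplotypes) for r in reads)


# Toy triploid example (k = 3) over four heterozygous sites.
haplotypes = [
    [0, 0, 1, 1],
    [1, 0, 0, 1],
    [1, 1, 0, 0],
]
reads = [
    {0: 0, 1: 0, 2: 1},  # consistent with the first haplotype
    {1: 0, 2: 0, 3: 1},  # consistent with the second haplotype
    {0: 1, 1: 1, 2: 1},  # one mismatch against the third haplotype
    {2: 0, 3: 0},        # consistent with the third haplotype
]

print(mec_score(reads, haplotypes))
```

An objective function minimization method would search over candidate haplotype sets for the one with the lowest such score; this sketch only evaluates the score for a fixed candidate.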

Source
http://dx.doi.org/10.1016/j.ygeno.2022.110369

Similar Publications

With advances in long-read sequencing and assembly techniques, haplotype-resolved (phased) genome assemblies are becoming more common, also in the field of plant genomics. Computational tools to effectively explore these phased genomes, particularly for polyploid genomes, are currently limited. Here we describe a new strategy adopting a pangenome approach.


is an amphicarpic plant in the Brassicaceae family. Plants develop two fruit types, one above and another below ground. This rare trait is associated with octoploidy in .


SpudDB: A database for accessing potato genomic data.

Genetics

December 2024

Department of Crop & Soil Sciences, University of Georgia, Athens, GA, 30602, USA.

Potato is a key food crop with a complex, polyploid genome. Advancements in sequencing technologies coupled with improvements in genome assembly algorithms have enabled generation of phased, chromosome-scale genome assemblies for cultivated tetraploid potato. The SpudDB database houses potato genome sequence and annotation, with the doubled monoploid DM 1-3 516 R44 (hereafter DM) genome serving as the reference genome and haplotype.


Background: The allo-octoploid Fragaria x ananassa follows disomic inheritance, yet the high sequence similarity among its subgenomes can lead to misalignment of short sequencing reads (150 bp). This misalignment results in an increased number of erroneous variants during variant calling. To accurately associate traits with the appropriate subgenome, it is essential to filter out these erroneous variants.


Background And Aims: Polyploidy is an important evolutionary driver for plants and has been linked with higher species richness and increases in diversification rate. These correlations of ploidy with plant radiations could be the result of polyploid lineages exploiting greater niche space and novel niches due to their enhanced adaptability. Ploidy evolution and how it links with diversification of plants across the Australian continent is not well understood.

