Genomic regions with high segmental duplication content and/or structural variation have led to gaps and misassemblies in the human reference sequence, and are refractory to assembly from whole-genome short-read datasets. Human subtelomere regions are highly enriched in both segmental duplication content and structural variation, and as a consequence are both highly variable from individual to individual and impossible to assemble accurately from such datasets. We recently developed Regional Extension of Assemblies Using Linked-Reads (REXTAL), a pipeline for improved region-specific assembly. In this study, we evaluate REXTAL and a genome-wide assembly approach (Supernova) on 10X Genomics linked-read data sets partitioned and barcoded using the Gel Bead in Emulsion (GEM) microfluidic method. We assess the accuracy and relative performance of the two approaches using the reference-based assessment module of QUAST. We show that REXTAL dramatically outperforms the Supernova whole-genome assembler in subtelomeric segmental duplication regions and produces highly accurate assemblies. Nearly all of the REXTAL "misassemblies" identified using default QUAST parameters simply pinpoint locations of tandem repeat arrays in the reference sequence where the repeat array length differs from that in the cognate REXTAL assembly by more than 1000 bp.
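
The last point can be illustrated with a small sketch. It is not QUAST's implementation, only a simplified model of the idea that a reference-based check flags a breakpoint when two flanking alignments are spaced differently on the reference than on the assembled contig. The names `Alignment`, `gap_discrepancy`, and `exceeds_threshold`, the 1000 bp default, and all coordinates below are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alignment:
    """One aligned block of a contig on the reference (1-based, inclusive).

    Hypothetical structure for illustration; not a QUAST data type.
    """
    ref_start: int
    ref_end: int
    ctg_start: int
    ctg_end: int


def gap_discrepancy(left: Alignment, right: Alignment) -> int:
    """Absolute difference between the reference gap and the contig gap
    separating two consecutive alignments on the same chromosome."""
    ref_gap = right.ref_start - left.ref_end
    ctg_gap = right.ctg_start - left.ctg_end
    return abs(ref_gap - ctg_gap)


def exceeds_threshold(left: Alignment, right: Alignment,
                      threshold_bp: int = 1000) -> bool:
    """True when the spacing differs by more than threshold_bp.

    The 1000 bp default is an assumption chosen to match the figure
    quoted in the abstract.
    """
    return gap_discrepancy(left, right) > threshold_bp


# Flanking alignments around a tandem repeat array that spans about
# 3000 bp in the reference but only about 1500 bp in the assembly.
left_flank = Alignment(ref_start=1, ref_end=10_000,
                       ctg_start=1, ctg_end=10_000)
right_flank_short_array = Alignment(ref_start=13_001, ref_end=20_000,
                                    ctg_start=11_501, ctg_end=18_500)

# Same locus, but the assembled array is only 400 bp shorter.
right_flank_similar_array = Alignment(ref_start=13_001, ref_end=20_000,
                                      ctg_start=12_601, ctg_end=19_600)

flagged = exceeds_threshold(left_flank, right_flank_short_array)
not_flagged = exceeds_threshold(left_flank, right_flank_similar_array)
```

Under this model the first pair of flanks would be reported as a breakpoint even though both flanks align cleanly, which is the sense in which such calls can reflect a repeat-array length difference between an individual and the reference rather than an assembly error.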

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6940546
DOI: http://dx.doi.org/10.1109/TCBB.2019.2913845

Publication Analysis

Top Keywords

segmental duplication: 12
duplication content: 8
reference sequence: 8
regions highly: 8
rextal: 6
analysis subtelomeric: 4
subtelomeric rextal: 4
rextal assemblies: 4
assemblies quast: 4
quast genomic: 4

Similar Publications

Summary: Gene and genome duplications are major evolutionary forces that shape the diversity and complexity of life. However, different duplication modes have distinct impacts on gene function, expression, and regulation. Existing tools for identifying and classifying duplicated genes are either outdated or not user-friendly.

Genome-Wide Identification of - () Gene Family and the Potential Function of Under Salt Stress in .

Int J Mol Sci

January 2025

Zhejiang Province Key Laboratory of Plant Secondary Metabolism and Regulation, College of Life Sciences and Medicine, Zhejiang Sci-Tech University, Hangzhou 310018, China.

is an important medicinal herb known as a "natural antibiotic", which has been used in Southeast Asia for thousands of years. The () gene is an important regulatory factor for plant photoperiod flowering and stress response. However, there is currently no detailed research on the genes of .

Background: Tausch's goatgrass, a winter annual grass weed native to Eastern Europe and Western Asia, has become a widespread invasive species in the wheat-growing regions of China due to its high environmental adaptability. This study aims to explore the molecular mechanisms underlying the stress resistance of Tausch's goatgrass, focusing on the gene family.

Methods: A genome-wide analysis was conducted to identify and characterize the gene family in Tausch's goatgrass.

Background: Magnesium (Mg) is essential for plant growth and development and plays critical roles in physiological and biochemical processes. Mg deficiency impairs plant growth by limiting shoot and root development, disturbing the structure and membranes of the grana, reducing photosynthetic efficiency, and lowering net CO2 assimilation. The MGT (magnesium transporter) family is responsible for the absorption and transport of magnesium in plants.

An evolutionary dynamics analysis of the plant DEK gene family reveals the role of BnaA02g08940D in drought tolerance.

Int J Biol Macromol

January 2025

School of Life Sciences, Jiangsu University, Zhenjiang 212013, Jiangsu, China.

DEK is a chromatin protein that interacts with DNA to influence chromatin formation, thereby affecting plant growth, development, and stress response. This study investigates the molecular evolution of the DEK family in plants, with a particular focus on the Brassica species. A total of 127 DEK genes were identified in 34 plants and classified into seven groups based on the phylogenetic analysis.