High levels of heterozygosity present a unique genome assembly challenge and can adversely impact downstream analyses, yet are common in sequencing datasets obtained from non-model organisms. Here we show that by re-assembling a heterozygous dataset with varied parameters and different assembly algorithms, we are able to generate assemblies whose protein annotations are statistically enriched for specific gene ontology categories. While total assembly length was not significantly affected by the assembly methodologies tested, the assemblies generated varied widely in fragmentation level, and we show that local assembly collapse or expansion underlies the enrichment or depletion of specific protein functional groups. We show that these statistically significant deviations in gene ontology groups can occur in seemingly high-quality assemblies, and result from difficult-to-detect local sequence expansions or contractions. Given the unpredictable interplay between assembly algorithm, parameters, and the heterozygosity of the biological sequence data, we highlight the need for better measures of assembly quality than the N50 value, including methods for assessing local expansion and collapse.
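For readers unfamiliar with the N50 value mentioned above, the following is a minimal sketch of how it is commonly computed from a list of contig lengths. The function name `n50` and the toy contig lengths are illustrative assumptions, not taken from the paper. The example shows two hypothetical assemblies with the same total length but very different fragmentation, and it also suggests why N50 alone cannot reveal local expansion or collapse: it depends only on contig lengths, not on sequence content.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly: the length of the contig at which
    the running total, taken from longest to shortest, first reaches
    half of the total assembly length."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("contig_lengths must contain at least one contig")


# Hypothetical toy assemblies (lengths in arbitrary units), both totalling 100.
contiguous = [50, 30, 20]
fragmented = [10] * 10

print(sum(contiguous), n50(contiguous))
print(sum(fragmented), n50(fragmented))
```

Because the metric summarizes only the length distribution, two assemblies can share a total length and even an N50 while differing in which regions were collapsed or expanded.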


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7423139
DOI: http://dx.doi.org/10.1371/journal.pcbi.1008104


