Automated annotation of human centromeres with HORmon.

Genome Res

Department of Computer Science and Engineering, University of California, San Diego, California 92093, USA.

Published: June 2022

Recent advances in long-read sequencing have opened the possibility of addressing long-standing questions about the architecture and evolution of human centromeres. They have also emphasized the need for centromere annotation (partitioning human centromeres into monomers and higher-order repeats [HORs]). Despite a half-century-long series of semi-manual studies of centromere architecture, a rigorous centromere annotation algorithm is still lacking. Moreover, automated centromere annotation is a prerequisite for studies of genetic diseases associated with centromeres and for evolutionary studies of centromeres across multiple species. Although the monomer decomposition (transforming a centromere into a monocentromere written in the monomer alphabet) and the HOR decomposition (representing a monocentromere in the alphabet of HORs) are currently viewed as two separate problems, we show that they should be integrated into a single framework in which HOR inference affects monomer inference and vice versa. We thus developed the HORmon algorithm, which integrates monomer and HOR inference and automatically generates human monomers and HORs that are largely consistent with previous semi-manual inference.
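
To make the two decompositions named in the abstract concrete, here is a minimal toy sketch. It is not the HORmon algorithm: it assumes the monomers and the HOR are already known, that all monomers have the same length, and it runs the two steps separately rather than letting them inform each other. The function names, the 4-bp toy monomers, and the "HOR" label are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def monomer_decomposition(seq: str, monomers: Dict[str, str]) -> List[str]:
    """Toy monomer decomposition: cut seq into fixed-length blocks and label
    each block with the closest monomer (assumes equal-length monomers)."""
    k = len(next(iter(monomers.values())))
    blocks = [seq[i:i + k] for i in range(0, len(seq) - k + 1, k)]
    return [
        min(monomers, key=lambda name: hamming(block, monomers[name]))
        for block in blocks
    ]


def hor_decomposition(mono: List[str], hor: List[str]) -> List[Tuple[str, int]]:
    """Toy HOR decomposition: collapse consecutive full copies of a known HOR
    into ("HOR", count); monomers outside a full copy are kept as (name, 1)."""
    out: List[Tuple[str, int]] = []
    i, n = 0, len(hor)
    while i < len(mono):
        if mono[i:i + n] == hor:
            if out and out[-1][0] == "HOR":
                out[-1] = ("HOR", out[-1][1] + 1)
            else:
                out.append(("HOR", 1))
            i += n
        else:
            out.append((mono[i], 1))
            i += 1
    return out


# Hypothetical toy data: real alpha-satellite monomers are far longer.
toy_monomers = {"A": "ACGT", "B": "GGCC", "C": "TTAA"}
toy_centromere = "ACGT" "GGCC" "TTAA" "ACGA" "GGCC" "TTAA" "ACGT"

monocentromere = monomer_decomposition(toy_centromere, toy_monomers)
print(monocentromere)
print(hor_decomposition(monocentromere, ["A", "B", "C"]))
```

In this sketch the fourth block (ACGA) differs from monomer A by one base and is still labeled A, which shows how a monocentromere abstracts away small sequence divergence before HORs are sought.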

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9248890
DOI: http://dx.doi.org/10.1101/gr.276362.121
