Publications by authors named "Timofey Prodanov"

Article Synopsis
  • It achieves a high level of completeness, closing 92% of previous assembly gaps and fully assembling complex regions, including 1,852 complex structural variants and 1,246 human centromeres.
  • The findings lead to significant improvements in genotyping accuracy and enable the detection of over 26,000 structural variants per sample, enhancing the potential for future disease association research.
Article Synopsis
  • MUC5AC and MUC5B are secreted mucin proteins that help protect the airways by trapping pathogens and supporting mucus clearance.
  • Researchers compared these genes using DNA from humans and nonhuman primates and found that MUC5B is largely the same across humans, while MUC5AC shows many variations.
  • The study also showed that people from East Asia carry distinct versions of MUC5AC that may have been beneficial for survival, while another version is more common in Europeans.

Structural variants (SVs) contribute significantly to human genetic diversity and disease. Previously, SVs have remained incompletely resolved by population genomics, with short-read sequencing facing limitations in capturing the whole spectrum of SVs at nucleotide resolution. Here we leveraged nanopore sequencing to construct an intermediate coverage resource of 1,019 long-read genomes sampled within 26 human populations from the 1000 Genomes Project.


The secreted mucins MUC5AC and MUC5B play critical defensive roles in airway pathogen entrapment and mucociliary clearance; their genes encode large glycoproteins with variable number tandem repeats (VNTRs). These polymorphic and degenerate protein-coding VNTRs make the loci difficult to investigate with short reads. We characterize the structural diversity of MUC5AC and MUC5B by long-read sequencing and assembly of 206 human and 20 nonhuman primate (NHP) haplotypes.


Motivation: Low-copy repeats (LCRs) or segmental duplications are long segments of duplicated DNA that cover > 5% of the human genome. Existing tools for variant calling using short reads exhibit low accuracy in LCRs due to ambiguity in read mapping and extensive copy number variation. Variants in more than 150 genes overlapping LCRs are associated with risk for human diseases.


The human genome contains hundreds of low-copy repeats (LCRs) that are challenging to analyze using short-read sequencing technologies due to extensive copy number variation and ambiguity in read mapping. Copy number and sequence variants in more than 150 duplicated genes that overlap LCRs have been implicated in monogenic and complex human diseases. We describe a computational tool, Parascopy, for estimating the aggregate and paralog-specific copy number of duplicated genes using whole-genome sequencing (WGS).


The ability to characterize repetitive regions of the human genome is limited by the read lengths of short-read sequencing technologies. Although long-read sequencing technologies such as Pacific Biosciences (PacBio) and Oxford Nanopore Technologies can potentially overcome this limitation, long segmental duplications with high sequence identity pose challenges for long-read mapping. We describe a probabilistic method, DuploMap, designed to improve the accuracy of long-read mapping in segmental duplications.


Background: Cystic fibrosis (CF) is one of the most common life-threatening genetic disorders. Around 2000 variants in the CFTR gene have been identified, a proportion of which are known to be pathogenic, and 300 disease-causing mutations have been characterized in detail in the CFTR2 database. This diversity complicates analysis of the gene with conventional methods.

Methods: We conducted next-generation sequencing (NGS) in a cohort of 89 adult patients negative for p.
