CpG Island Definition and Methylation Mapping of the T2T-YAO Genome.

Genomics Proteomics Bioinformatics

College of Computer Science, Sichuan University, Chengdu 610065, China.

Published: July 2024

Precisely defining and mapping all cytosine (C) positions and their clusters, known as CpG islands (CGIs), together with their methylation status, is pivotal for genome-wide epigenetic studies, especially now that population-centric reference genomes are ready for timely application. Here, we first align the two high-quality reference genomes, T2T-YAO and T2T-CHM13, from different ethnic backgrounds in a base-by-base fashion and compute their genome-wide density-defined and position-defined CGIs. Second, by mapping representative genome-wide methylation data from selected organs onto the two genomes, we find about 4.7%-5.8% sequence divergence across various categories, depending on quality cutoffs. Genes among the divergent sequences are mostly associated with neurological functions. Moreover, CGIs associated with the divergent sequences differ significantly between the two genomes with respect to CpG density and observed CpG/expected CpG (O/E) ratio. Finally, we find that the T2T-YAO genome not only has greater CpG coverage than the T2T-CHM13 genome when whole-genome bisulfite sequencing (WGBS) data from the European and American populations are mapped to each reference, but also shows more hyper-methylated CpG sites than the T2T-CHM13 genome. Our study suggests that future genome-wide epigenetic studies of the Chinese populations rely on both acquisition of high-quality methylation data and subsequent precision CGI mapping based on the Chinese T2T reference.
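The abstract compares CGIs by CpG density and O/E ratio without giving the formulas. As a rough illustration, the sketch below computes both for a single sequence using the widely cited Gardiner-Garden and Frommer form of the ratio, O/E = (CpG count × length) / (C count × G count). The function names and the default thresholds (length at least 200 bp, GC content at least 0.5, O/E at least 0.6) are assumptions for illustration; the abstract does not state which criteria underlie the study's density-defined and position-defined CGIs.

```python
def cpg_stats(seq: str) -> dict:
    """Return basic CpG statistics for one DNA sequence (hypothetical helper)."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact

    # Observed/expected CpG ratio in the Gardiner-Garden and Frommer form.
    oe_ratio = (cpg * n) / (c * g) if c and g else 0.0

    return {
        "length": n,
        "cpg": cpg,
        "cpg_density": cpg / n if n else 0.0,   # CpG dinucleotides per base
        "gc_content": (c + g) / n if n else 0.0,
        "oe_ratio": oe_ratio,
    }


def looks_like_cgi(seq: str, min_len: int = 200,
                   min_gc: float = 0.5, min_oe: float = 0.6) -> bool:
    """Apply classic CGI thresholds; the defaults are assumed, not taken from the paper."""
    s = cpg_stats(seq)
    return (s["length"] >= min_len
            and s["gc_content"] >= min_gc
            and s["oe_ratio"] >= min_oe)


if __name__ == "__main__":
    example = "CG" * 120  # toy 240 bp sequence made only of CpG dinucleotides
    print(cpg_stats(example))
    print(looks_like_cgi(example))
```

Running the same function over matched regions of two assemblies would give per-region density and O/E values that can then be compared, which is the kind of comparison the abstract describes in general terms.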

Source
http://dx.doi.org/10.1093/gpbjnl/qzae009

Publication Analysis

Top Keywords

t2t-yao genome: 8
genome-wide epigenetic: 8
epigenetic studies: 8
reference genomes: 8
methylation data: 8
divergent sequences: 8
t2t-chm13 genome: 8
cpg: 6
cpg island: 4
island definition: 4

Similar Publications

Article Synopsis
  • Whole-exome sequencing (WES) is essential for cancer diagnosis, but the current reference genome, GRCh38, is being challenged by new telomere-to-telomere genomes, T2T-CHM13 and T2T-YAO.
  • A study of 19 tumor samples from Chinese patients found that T2T-YAO showed about 1% more diversity in exon regions compared to GRCh38, yet it provided 7.41% more mapped reads, indicating better performance.
  • The results suggest using population-specific reference genomes like T2T-YAO is crucial for accurate variant analysis, as it reduces false positive variant calls and improves the identification of pathogenic variants in different ethnic groups.

T2T-YAO Reference Genome of Han Chinese - New Step in Advancing Precision Medicine in China.

Genomics Proteomics Bioinformatics

December 2023

McKusick-Zhang Center for Genetic Medicine, State Key Laboratory of Medical Molecular Biology, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences & Peking Union Medical College (CAMS&PUMC), Beijing 100005, China.

T2T-YAO, T2T-SHUN, and more.

Genomics Proteomics Bioinformatics

December 2023

CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China.

T2T-YAO: A Telomere-to-telomere Assembled Diploid Reference Genome for Han Chinese.

Genomics Proteomics Bioinformatics

December 2023

Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China; Institute of Chest and Lung Diseases, Shanxi Medical University, Taiyuan 030001, China; Beijing Key Laboratory of Genome and Precision Medicine Technologies, Beijing 100101, China.

Since its initial release in 2001, the human reference genome has undergone continuous improvement in quality, and the recently released telomere-to-telomere (T2T) version - T2T-CHM13 - reaches its highest level of continuity and accuracy after 20 years of effort by working on a simplified, nearly homozygous genome of a hydatidiform mole cell line. Here, to provide an authentic complete diploid human genome reference for the Han Chinese, the largest population in the world, we assembled the genome of a male Han Chinese individual, T2T-YAO, which includes T2T assemblies of all the 22 + X + M and 22 + Y chromosomes in both haploids. The quality of T2T-YAO is much better than those of all currently available diploid assemblies, and its haploid version, T2T-YAO-hp, generated by selecting the better assembly for each autosome, reaches the top quality of fewer than one error per 29.