Although repetitive DNA forms much of the human genome, its study is challenging due to limitations in the assembly and alignment of repetitive short reads. We have deployed k-Seek, software that detects tandem repeats embedded in single reads, on 2,504 human genomes from the 1000 Genomes Project to quantify the variation and abundance of simple satellites (repeat units <20 bp). We find that the ancestral monomer of Human Satellite 3 makes up the largest portion of simple satellite content in humans (mean of ∼8 Mb). We discovered ∼50,000 rare tandem repeats that are not detected in the T2T-CHM13v2.0 assembly, including undescribed variants of telomeric and pericentromeric repeats. We find broad homogeneity of the most abundant repeats across populations, except for AG-rich repeats, which are more abundant in African individuals. We also find cliques of highly similar AG- and AT-rich satellites that are interspersed and form higher-order structures that covary in copy number across individuals, likely through concerted amplification via unequal exchange. Finally, we use pericentromeric polymorphisms to estimate centromeric genetic relatedness between individuals and find a strong predictive relationship between centromeric lineages and pericentromeric simple satellite abundances. In particular, the abundances of the ancestral monomers of Human Satellite 2 and Human Satellite 3 correlate with clusters of centromeric ancestry on chromosome 16 and chromosome 9, with some clusters structured by population. These results provide new descriptions of the population dynamics that underlie the evolution of simple satellites in humans.
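To make "tandem repeats embedded in single reads" concrete, the following is a minimal Python sketch of the general idea. It is not k-Seek's actual implementation. The function names, the `min_copies` threshold, and the choice to report each unit as the lexicographically smallest rotation of itself or its reverse complement are illustrative assumptions; only the unit-length limit (<20 bp) comes from the abstract.

```python
import re

# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def canonical_unit(unit):
    """Collapse equivalent repeat units to a single representative.

    AATGG, ATGGA, and CCATT all describe the same satellite (rotations
    and strand), so this returns the lexicographically smallest rotation
    of the unit or of its reverse complement. This convention is an
    assumption made for the sketch.
    """
    candidates = []
    for s in (unit, reverse_complement(unit)):
        candidates.extend(s[i:] + s[:i] for i in range(len(s)))
    return min(candidates)


def find_tandem_repeats(read, max_unit=19, min_copies=3):
    """Count tandem copies of short repeat units within one read.

    max_unit=19 mirrors the "<20 bp" definition of simple satellites;
    min_copies=3 is a hypothetical threshold chosen for illustration.
    Returns a dict mapping canonical unit -> total tandem copies found.
    """
    # The lazy group takes the shortest unit at each start position that
    # is followed by at least (min_copies - 1) further identical copies.
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_copies - 1)
    )
    counts = {}
    for match in pattern.finditer(read):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        key = canonical_unit(unit)
        counts[key] = counts.get(key, 0) + copies
    return counts


if __name__ == "__main__":
    # Four tandem copies of the Human Satellite 3-like unit GGAAT.
    read = "TT" + "GGAAT" * 4 + "CC"
    print(find_tandem_repeats(read))
```

Summing such per-read counts over all reads of a genome, and normalizing by sequencing depth, would give a per-individual abundance estimate of the kind compared across populations here. The normalization step is not shown and would depend on the dataset.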
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11305138 | PMC |
http://dx.doi.org/10.1093/gbe/evae153 | DOI Listing |