An original bioinformatics technique is developed to identify the protein-coding genes in rodents, lagomorphs and nonhuman primates that are pseudogenized in humans. The method is based on per-gene verification of local synteny, similarity of exon-intron structures and orthology in a set of genomes. It is applicable to any genome set, even one with more than 100 genomes, and is efficiently implemented in fast computer software. Only 50 evolutionarily recent human pseudogenes were predicted. Their functional homologs in model species are often associated with the immune system or digestion and are mainly expressed in the testes. According to current evidence, knockout of most of these genes leads to an abnormal phenotype. Some genes were pseudogenized or lost independently in humans and nonhuman hominoids.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7555810
DOI: http://dx.doi.org/10.3390/life10090192

Similar Publications

Comprehensive analysis of the multi-rings mitochondrial genome of Populus tomentosa.

BMC Genomics

January 2025

State Key Laboratory of Tree Genetics and Breeding, National Engineering Research Center of Tree Breeding and Ecological Restoration, Beijing Advanced Innovation Center for Tree Breeding by Molecular Design, College of Biological Sciences and Technology, Beijing Forestry University, Beijing, 100083, China.

Background: Populus tomentosa, known as Chinese white poplar, is indigenous and distributed across large areas of China, where it plays multiple important roles in forestry, agriculture, conservation, and urban horticulture. However, limited accessibility to the mitochondrial (mt) genome of P. tomentosa impedes phylogenetic and population genetic analyses and restricts functional gene research in the Salicaceae family.

Evaluating the accuracy of protein-coding sequences in genome annotations is a challenging problem for which there is no broadly applicable solution. In this manuscript, we introduce PSAURON (Protein Sequence Assessment Using a Reference ORF Network), a novel software tool developed to help assess the quality of protein-coding gene annotations. Utilizing a machine learning model trained on a diverse dataset from over 1000 plant and animal genomes, PSAURON assigns a score to coding DNA or protein sequence that reflects the likelihood that the sequence is a genuine protein-coding region.

Under changing climatic conditions, plant exposure to high-intensity UV-B can be a potential threat to plant health and all plant-derived human requirements, including food. It's crucial to understand how plants respond to high UV-B radiation so that proper measures can be taken to enhance tolerance towards high UV-B stress. We found that BBX22, a B-box protein-coding gene, is strongly induced within one hour of exposure to high-intensity UV-B.

Cerasus is a subgenus of Prunus in the family Rosaceae that is popular owing to its ornamental, edible, and medicinal properties. Understanding the evolution of the Cerasus subgenus and identifying selective trait loci in edible cherries are crucial for the improvement of cherry cultivars to meet producer and consumer demands. In this study, we performed a de novo assembly of a chromosome-scale genome for the sweet cherry (Prunus avium L.

Segmental duplications (SDs) contribute significantly to human disease, evolution and diversity but have been difficult to resolve at the sequence level. We present a population genetics survey of SDs by analyzing 170 human genome assemblies (from 85 samples representing 38 Africans and 47 non-Africans) in which the majority of autosomal SDs are fully resolved using long-read sequence assembly. Excluding the acrocentric short arms and sex chromosomes, we identify 173.
