Several recent studies have presented evidence that the human gene catalogue should be expanded to include thousands of short open reading frames (ORFs) appearing upstream or downstream of existing protein-coding genes, each of which would comprise an additional bicistronic transcript in humans. Here we explore an alternative hypothesis that would explain the translational and evolutionary evidence for these upstream ORFs without the need to create novel genes or bicistronic transcripts. We examined 2,199 upstream ORFs that have been proposed as high-quality candidates for novel genes, to determine whether they could instead represent protein-coding exons that can be added to existing genes. We checked for the conservation of these ORFs in four recently sequenced, high-quality human genomes, and found a large majority (87.8%) to be conserved in all four, as expected. We then looked for splicing evidence that would connect each upstream ORF to the downstream protein-coding gene at the same locus, thus creating a novel splicing variant using the upstream ORF as its first exon. These protein-coding exon candidates were further evaluated using structure predictions for the protein sequences that included the proposed new exons. We determined that 582 of the 2,199 upstream ORFs have strong evidence that they can form protein-coding exons that are part of an existing gene, and that the resulting protein is predicted to have similar or better structural quality than the currently annotated isoform.
Download full-text PDF |
Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10983949 | PMC |
http://dx.doi.org/10.1101/2024.03.22.586333 | DOI Listing |
Gene
December 2024
Guangdong Provincial Key Laboratory of Applied Botany, South China Botanical Garden, The Chinese Academy of Sciences, Guangzhou 510650, China.
Santalum album is an economically important plant in the craft, spices and medicine industries. The main chemical constituents found in sandalwood essential oils are sesquiterpenes. 3-Hydroxy-3-methylglutaryl monoacyl-coenzyme A reductase (HMGR) is one of the rate-limiting enzymes required for the synthesis of sandal sesquiterpenes, but there are no studies on the HMGR gene in S.
Mol Biol (Mosk)
December 2024
Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, 119991 Russia.
Eukaryotic translation release factor eRF1 is an important cellular protein that plays a key role in translation termination, nonsense-mediated mRNA decay (NMD), and readthrough of stop codons. The amount of eRF1 in the cell influences all these processes. The mechanism of regulation of eRF1 translation through an autoregulatory NMD-dependent expression circuit has been described for plants and fungi, but the mechanisms of regulation of human eRF1 translation have not yet been studied.
Large-scale experimental analyses find ever more abundant evidence of translation from start codons upstream of the canonical start site. This translation either generates entirely new proteins (from novel upstream open reading frames) or produces isoforms with extended N-terminals when the novel start codon is in frame. Most extended N-terminals are likely to just add a disordered region to the canonical protein isoform, but some may also block the recognition of the signal peptide, causing the isoform to accumulate in the incorrect cellular compartment. This analysis finds evidence that upstream translations that would interfere with signal peptides are detected in expected quantities in ribosome profiling experiments, but that the equivalent N-terminally extended protein isoforms are significantly reduced in multiple proteomics experiments.
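The distinction drawn above between a novel upstream ORF and an N-terminally extended isoform comes down to reading frame. The following is a minimal sketch of that check, not the method used in the study: the function name `classify_upstream_start`, the 0-based coordinates, and the two labels are hypothetical, and the simplification that any out-of-frame or stop-interrupted start counts as a separate upstream ORF is an assumption made for illustration.

```python
# Illustrative sketch only: names, coordinates, and labels are assumptions,
# not taken from the abstract above.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code, DNA alphabet


def classify_upstream_start(mrna: str, upstream_start: int, canonical_start: int) -> str:
    """Classify an upstream start codon relative to the canonical start.

    Both positions are 0-based indices of the first base of the start codon.
    Returns "n_terminal_extension" if the upstream start is in frame with the
    canonical start and no stop codon lies between them; otherwise returns
    "upstream_orf".
    """
    mrna = mrna.upper().replace("U", "T")  # accept RNA or DNA input
    if not 0 <= upstream_start < canonical_start <= len(mrna) - 3:
        raise ValueError("expected 0 <= upstream_start < canonical_start within the sequence")

    # Out of frame: translation cannot read through into the canonical protein.
    if (canonical_start - upstream_start) % 3 != 0:
        return "upstream_orf"

    # In frame: walk codon by codon and look for an intervening stop.
    for i in range(upstream_start, canonical_start, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            return "upstream_orf"
    return "n_terminal_extension"


if __name__ == "__main__":
    # Upstream ATG at 0, canonical ATG at 6, in frame, no stop between them.
    print(classify_upstream_start("ATGGCCATGAAATAA", 0, 6))
```

A real analysis would also need transcript models, non-canonical start codons, and experimental support such as ribosome profiling or proteomics data; the sketch covers only the frame logic.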
bioRxiv
November 2024
Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), 28029 Madrid, Spain.
The human genome has been the subject of intense scrutiny by experimental and manual curation projects for more than two decades. Novel coding genes have been proposed from large-scale RNASeq, ribosome profiling and proteomics experiments. Here we carry out an in-depth analysis of an entire proteomics database.
Cells
November 2024
Intramural Research Program, National Institute on Aging, National Institutes of Health, 251 Bayview Blvd, Baltimore, MD 21224, USA.
Insulin resistance, stem cell dysfunction, and muscle fiber dystrophy are all age-related events in skeletal muscle (SKM). However, age-related changes in insulin isoforms and insulin receptors in myogenic progenitor satellite cells have not been studied. Since SKM is an extra-pancreatic tissue that does not express mature insulin, we investigated the levels of insulin receptors (INSRs) and a novel human insulin upstream open reading frame (INSU) at the mRNA, protein, and anatomical levels in biopsied SKM samples from Baltimore Longitudinal Study of Aging (BLSA) participants aged 27-89 years.