Recent evidence from proteomics and deep massively parallel sequencing studies have revealed that eukaryotic genomes contain substantial numbers of as-yet-uncharacterized open reading frames (ORFs). We define these uncharacterized ORFs as novel ORFs (nORFs). nORFs in humans are mostly under 100 codons and are found in diverse regions of the genome, including in long noncoding RNAs, pseudogenes, 3' UTRs, 5' UTRs, and alternative reading frames of canonical protein coding exons. There is therefore a pressing need to evaluate the potential functional importance of these unannotated transcripts and proteins in biological pathways and human disease on a larger scale, rather than one at a time. In this study, we outline the creation of a valuable nORFs data set with experimental evidence of translation for the community, use measures of heritability and selection that reveal signals for functional importance, and show the potential implications for functional interpretation of genetic variants in nORFs. Our results indicate that some variants that were previously classified as being benign or of uncertain significance may have to be reinterpreted.

Download full-text PDF

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7849405PMC
http://dx.doi.org/10.1101/gr.263202.120DOI Listing

Publication Analysis

Top Keywords

reading frames
12
open reading
8
platform curated
4
curated products
4
products novel
4
novel open
4
frames prompts
4
prompts reinterpretation
4
reinterpretation disease
4
disease variants
4

Similar Publications

Want AI Summaries of new PubMed Abstracts delivered to your In-box?

Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!