AI Article Synopsis

  • The GENCODE Consortium is focused on identifying all gene features in the human genome by using computational methods, manual annotation, and experimental confirmation.
  • The latest GENCODE 7 release includes 20,687 protein-coding and 9,640 long noncoding RNA loci, with many annotations for alternative splicing and a large number of new long noncoding RNA models.
  • Analysis of the data shows that 35% of transcriptional start sites are supported by CAGE clusters and 62% of protein-coding genes have annotated polyA sites; new RNA-seq data have identified 3,689 potential new loci, many of which may be long noncoding RNAs.

Article Abstract

The GENCODE Consortium aims to identify all gene features in the human genome using a combination of computational analysis, manual annotation, and experimental validation. Since the first public release of this annotation data set, few new protein-coding loci have been added, yet the number of alternative splicing transcripts annotated has steadily increased. The GENCODE 7 release contains 20,687 protein-coding and 9640 long noncoding RNA loci and has 33,977 coding transcripts not represented in UCSC genes and RefSeq. It also has the most comprehensive annotation of long noncoding RNA (lncRNA) loci publicly available with the predominant transcript form consisting of two exons. We have examined the completeness of the transcript annotation and found that 35% of transcriptional start sites are supported by CAGE clusters and 62% of protein-coding genes have annotated polyA sites. Over one-third of GENCODE protein-coding genes are supported by peptide hits derived from mass spectrometry spectra submitted to Peptide Atlas. New models derived from the Illumina Body Map 2.0 RNA-seq data identify 3689 new loci not currently in GENCODE, of which 3127 consist of two exon models indicating that they are possibly unannotated long noncoding loci. GENCODE 7 is publicly available from gencodegenes.org and via the Ensembl and UCSC Genome Browsers.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3431492
DOI: http://dx.doi.org/10.1101/gr.135350.111

Publication Analysis

Top Keywords (keyword: occurrences)

long noncoding: 12
human genome: 8
noncoding rna: 8
protein-coding genes: 8
gencode: 6
annotation: 5
loci: 5
gencode reference: 4
reference human: 4
genome annotation: 4

Similar Publications

Prognostic value and immune landscapes of disulfidptosis‑related lncRNAs in bladder cancer.

Mol Clin Oncol

February 2025

Department of Urology Surgery, The Second Affiliated Hospital, Jiangxi Medical College, Nanchang University, Nanchang, Jiangxi 330006, P.R. China.

Disulfidptosis, which was recently identified, has shown promise as a potential cancer treatment. Nonetheless, the precise role of long non-coding RNAs (lncRNAs) in this phenomenon is currently unclear. To elucidate their significance in bladder cancer (BLCA), a signature of disulfidptosis-related lncRNAs (DRlncRNAs) was developed and their potential prognostic significance was explored.

Inefficient control of elevated blood sugar levels can lead to certain health complications such as diabetic nephropathy (DN) and cardiovascular disease (CVD). The identification of effective biomarkers for monitoring diabetes was performed in the present study. The present study aimed to investigate the implications of long non-coding RNA megacluster (lnc-MGC), microRNA (miR)-132 and miR-133a, and their correlation with lactate dehydrogenase (LDH) activity and glycated hemoglobin (HbA1C) levels to identify biomarkers for the early diagnosis of diabetes mellitus, induced DN and CVD.

Biting midges ( spp.) are important vectors of several insect borne arboviruses but are underrepresented in terms of availability of high-resolution genomic resources. We assembled and annotated complete mitochondrial genomes for two species, namely and which are proven vectors for Bluetongue Virus (BTV).

Purpose Of The Review: This review aims to explore the pivotal role of long non-coding RNAs (lncRNAs) as epigenetic regulators in the pathogenesis of multiple myeloma (MM). Additionally, we have portrayed the dual role of lncRNAs in the epigenetic landscape of MM pathobiology.

Recent Findings: In MM, lncRNAs are pivotal for proliferation, progression, and drug resistance by acting as miRNA sponges, regulating mRNA activity through microRNA recognition elements (MREs).

Background: Recent studies have highlighted the potential role of several long non-coding RNAs (lncRNAs) in the pathogenesis of Behçet's disease (BD). This study investigated the expression profiles of lncRNA NEAT1 and lncRNA HOTAIR, and their target cytokine genes, IL-6 and TNF-α, in active and inactive BD patients.

Methods: This cross-sectional study was conducted on peripheral blood mononuclear cells (PBMCs) obtained from 25 BD patients and 25 age-sex-matched healthy controls (HCs).
