Human protein-coding genes and gene feature statistics in 2019.

BMC Res Notes

Unit of Histology, Embryology and Applied Biology, Department of Experimental, Diagnostic and Specialty Medicine (DIMES), University of Bologna, Bologna, BO, Italy.

Published: June 2019

Objective: A well-known limitation of genome browsers is that their large amount of genome and gene data is not organized as a searchable database, which hampers full management of numerical data and free calculations. Because the data deposited in genomic repositories keep increasing, periodic revision and analysis of their content is recommended. Using GeneBase, a software tool with a graphical interface that imports and processes National Center for Biotechnology Information (NCBI) Gene database entries, we provide tabulated spreadsheets, updated to 2019, of the human nuclear protein-coding gene data set, ready to be used for any type of analysis of genes, transcripts and gene organization.

Results: Comparison with previous reports reveals substantial changes in the number of known nuclear protein-coding genes (now 19,116), in the protein-coding non-redundant transcriptome space [now 59,281,518 base pairs (bp), a 10.1% increase] and in the number of exons (now 562,164, a 36.2% increase), due to a considerable increase in the RNA isoforms recorded. Other parameters, such as the mean and extreme lengths of genes, exons and introns, appear to have reached a stability that is unlikely to be substantially modified by human genome data updates, at least for protein-coding genes. Finally, we confirm that there are no human introns shorter than 30 bp.
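
As an illustration of how an intron-length statistic like the 30 bp lower bound could be checked from exon coordinates, here is a minimal Python sketch. It is not GeneBase's implementation: the function names, the 1-based inclusive coordinate convention and the toy exon coordinates are all assumptions made for this example, and no real gene data are used.

```python
from typing import List, Tuple

# Assumed convention: (start, end) with 1-based, inclusive coordinates.
Exon = Tuple[int, int]


def intron_lengths(exons: List[Exon]) -> List[int]:
    """Return the lengths of the gaps between consecutive exons of one transcript."""
    ordered = sorted(exons)
    lengths = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        gap = next_start - prev_end - 1
        if gap > 0:  # adjacent or overlapping exons leave no intron
            lengths.append(gap)
    return lengths


def introns_shorter_than(exons: List[Exon], threshold: int = 30) -> List[int]:
    """Return the intron lengths that fall below the threshold (in bp)."""
    return [n for n in intron_lengths(exons) if n < threshold]


# Hypothetical transcript with three exons (made-up coordinates).
toy_exons = [(100, 199), (300, 349), (1000, 1200)]
toy_introns = intron_lengths(toy_exons)
too_short = introns_shorter_than(toy_exons, threshold=30)
```

Run over every transcript in a tabulated export, an empty `too_short` list for all of them would be consistent with the abstract's statement about the minimum intron length.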

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6549324
DOI: http://dx.doi.org/10.1186/s13104-019-4343-8

Publication Analysis

Top Keywords

protein-coding genes: 12
gene data: 8
nuclear protein-coding: 8
gene: 6
data: 5
human: 4
human protein-coding: 4
genes: 4
genes gene: 4
gene feature: 4

Similar Publications

Complete genome sequence of Pseudarthrobacter sp. NIBRBAC000502770 from coal mine of Hongcheon on Republic of Korea.

BMC Genom Data

January 2025

Department of Applied Biosciences, College of Agriculture and Life Sciences, Kyungpook National University, Daegu, 41566, Republic of Korea.

Objectives: The data were collected to obtain the complete genome sequence of Pseudarthrobacter sp. NIBRBAC000502770, isolated from the rhizosphere of Sasamorpha in a heavy metal-contaminated coal mine in Hongcheon, Republic of Korea. The objective was to explore the strain's genetic potential for plant growth promotion and heavy metal resistance, particularly arsenate and copper.

Genomic insights into a multidrug-resistant Pandoraea apista clinical isolate carrying bla from China.

J Glob Antimicrob Resist

January 2025

Clinical Laboratory Department, Lishui People's Hospital, the Sixth Affiliated Hospital of Wenzhou Medical University, Lishui, China.

Objectives: Pandoraea apista is notable for its multidrug resistance and is frequently identified in patients with cystic fibrosis or other chronic lung diseases, where it contributes to persistent lung infections. In this study, we describe a strain of P. apista harboring the bla, isolated from the bronchoalveolar lavage (BAL) fluid of an inpatient in China.

Most m5C modifications in mammalian mRNAs are nonadaptive.

Mol Biol Evol

January 2025

Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai 200240, China.

5-Methylcytosine (m5C) on mRNA molecules is a prevalent internal posttranscriptional modification in eukaryotes. Although m5C modification has been reported to regulate some biological processes, it is unknown whether most mRNA m5C modifications are functional. To address this question, we analyzed the genome-wide evolutionary characteristics of m5C modifications in protein-coding genes of humans and mice.

Graph Neural Networks-Based Prediction of Drug Gene Interactions of RTK-VEGF4 Receptor Family in Periodontal Regeneration.

J Clin Exp Dent

December 2024

DDS. Titular Professor. Universidad de Antioquia U de A, Medellín, Colombia. Biomedical Stomatology Research Group, Universidad de Antioquia U de A, Medellín, Colombia.

Background: The RTK-VEGF4 receptor family, which includes VEGFR-1, VEGFR-2, and VEGFR-3, plays a crucial role in tissue regeneration by promoting angiogenesis, the formation of new blood vessels, and recruiting stem cells and immune cells. Machine learning, particularly graph neural networks (GNNs), has shown high accuracy in predicting these interactions. This study aims to predict drug-gene interactions of the RTK-VEGF4 receptor family in periodontal regeneration using graph neural networks.

We present a genome assembly from an individual male (Poplar Grey moth; Arthropoda; Insecta; Lepidoptera; Noctuidae). The genome sequence has a total length of 424.20 megabases.
