Pathogenic variation types in human genes relate to diseases through Pfam and InterPro mapping.

Front Mol Biosci

Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy.

Published: September 2022

Grouping residue variations in a protein according to their physicochemical properties reduces the dimensionality of all possible substitutions in a variant with respect to the wild type. Using a large dataset of proteins with disease-related and benign variations, derived by merging Humsavar and ClinVar data, we investigate to what extent our physicochemical grouping procedure can help determine whether patterns of variation types are related to specific groups of diseases and whether they occur in Pfam and/or InterPro gene domains. We download 75,145 germline disease-related and benign variations of 3,605 genes, group them according to physicochemical categories, and map them onto Pfam and InterPro gene domains. Statistically validated analysis indicates that each cluster of genes associated with Mondo anatomical system categorizations is characterized by a specific variation pattern. Patterns identify specific Pfam and InterPro domain-Mondo category associations. Our data suggest that the association of variation patterns with Mondo categories is unique and may help in associating gene variants with genetic diseases. This work corroborates, in a much larger dataset, previous observations from our group.
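
The grouping step described in the abstract can be illustrated with a minimal Python sketch. The four-class residue grouping used here (apolar, aromatic, polar, charged) is an assumption made for illustration, since the abstract does not list the actual categories. The function names and the example substitutions are hypothetical. With four classes, the 380 possible single-residue substitutions collapse into at most 16 ordered variation types.

```python
from collections import Counter

# Illustrative four-class grouping of the 20 standard residues.
# ASSUMPTION: the abstract does not state which classes the authors use.
GROUPS = {
    "apolar": "GAVPLIM",
    "aromatic": "FWY",
    "polar": "STCNQH",
    "charged": "DEKR",
}
RESIDUE_TO_GROUP = {aa: group for group, aas in GROUPS.items() for aa in aas}


def variation_type(wild_type: str, variant: str) -> tuple[str, str]:
    """Map a single-residue substitution to an ordered (group, group) pair."""
    return RESIDUE_TO_GROUP[wild_type], RESIDUE_TO_GROUP[variant]


def variation_pattern(substitutions):
    """Return the relative frequency of each variation type.

    `substitutions` is an iterable of (wild_type, variant) one-letter pairs.
    """
    counts = Counter(variation_type(wt, var) for wt, var in substitutions)
    total = sum(counts.values())
    return {vtype: n / total for vtype, n in counts.items()}


# Hypothetical substitutions, not taken from the paper's dataset.
example = [("R", "W"), ("G", "D"), ("R", "H"), ("L", "P")]
pattern = variation_pattern(example)
```

Computing such a pattern separately for each set of genes (for instance, per disease category or per domain) would yield frequency vectors that can be compared across sets. The statistical validation mentioned in the abstract is not reproduced here.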

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9523224
DOI: http://dx.doi.org/10.3389/fmolb.2022.966927

Similar Publications

The 2025 Nucleic Acids Research database issue and the online molecular biology database collection.

Nucleic Acids Res

January 2025

IQVIA Ltd., The Point, 37 North Wharf Road, London W2 1AF, UK.

The 2025 Nucleic Acids Research database issue contains 185 papers spanning biology and related areas. Seventy-three new databases are covered, while resources previously described in the issue account for 101 update articles. Databases most recently published elsewhere account for a further 11 papers.

The Pfam protein families database: embracing AI/ML.

Nucleic Acids Res

January 2025

European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK.

The Pfam protein families database is a comprehensive collection of protein domains and families used for genome annotation and protein structure and function analysis (https://www.ebi.ac.

Genome guided, organ-specific transcriptome assembly of the European flounder (P. flesus) from the Baltic Sea.

Sci Data

October 2024

Department of Genetics and Marine Biotechnology, Institute of Oceanology, Polish Academy of Sciences, Powstańców Warszawy 55 Str., 81-712, Sopot, Poland.

Article Synopsis
  • The study focuses on the European flounder, highlighting the lack of comprehensive transcriptome data for this economically important species.
  • Researchers generated RNA-Seq data from ten organs of female flounders using Next Generation Sequencing, resulting in 500 million sequencing reads and 61,000 reliable contigs.
  • The assembled transcriptome, which shows high completeness according to BUSCO statistics, can be used in research areas such as biology, aquaculture, and toxicology.

Humans benefit from a vast community of microorganisms in their gastrointestinal tract, known as the gut microbiota, numbering in the tens of trillions. An imbalance in the gut microbiota, known as dysbiosis, can lead to changes in the metabolite profile, elevating the levels of toxins such as Bacteroides fragilis toxin (BFT), colibactin, and cytolethal distending toxin. These toxins are implicated in the process of oncogenesis.

The dcGO Domain-Centric Ontology Database in 2023: New Website and Extended Annotations for Protein Structural Domains.

J Mol Biol

July 2023

Shanghai Institute of Hematology, State Key Laboratory of Medical Genomics, National Research Center for Translational Medicine at Shanghai, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200025, China.

Protein structural domains have been less studied than full-length proteins in terms of ontology annotations. The dcGO database has filled this gap by providing mappings from protein domains to ontologies. The dcGO update in 2023 extends annotations for protein domains of multiple definitions (SCOP, Pfam, and InterPro) with commonly used ontologies that are categorised into functions, phenotypes, diseases, drugs, pathways, regulators, and hallmarks.
