Protein-protein interactions play essential roles in various biological processes. Identifying protein interaction sites helps researchers understand life activities and is therefore useful for drug design. However, the number of experimentally determined protein interaction sites is far smaller than the number of protein sites in protein-protein interactions or protein complexes. As a result, the negative and positive samples are usually imbalanced, which is common but biases the results when protein interaction sites are predicted by computational approaches. In this work, we present three imbalanced-data processing strategies to reconstruct the original dataset, and then extract protein features from the evolutionary conservation of amino acids to build a predictor for identifying protein interaction sites. On a dataset with 10,430 surface residues but only 2,299 interface residues, the imbalanced-data processing strategies clearly reduce the prediction bias and therefore improve the prediction performance for protein interaction sites. The experimental results show that our prediction models achieve better prediction performance, for example a prediction accuracy of 0.758 or a high F-measure of 0.737, which demonstrates the effectiveness of our method.
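
The abstract does not name its three processing strategies, so the following is only a minimal sketch of two generic rebalancing techniques (random under-sampling and random over-sampling) together with the F-measure used as an evaluation metric. The function names are hypothetical, and the example assumes that the 2,299 interface residues are the positive class within the 10,430 surface residues; neither is confirmed by the source.

```python
import random


def random_undersample(labels, seed=0):
    """Keep every minority-class index plus an equally sized random
    subset of majority-class indices (generic technique, not the
    authors' confirmed method)."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    return sorted(minority + rng.sample(majority, len(minority)))


def random_oversample(labels, seed=0):
    """Keep every index and duplicate random minority-class indices
    until both classes have the same size."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    extra = rng.choices(minority, k=len(majority) - len(minority))
    return sorted(majority + minority + extra)


def f_measure(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Assumed class counts: 2,299 positives among 10,430 residues.
    labels = [1] * 2299 + [0] * (10430 - 2299)

    under = random_undersample(labels)
    over = random_oversample(labels)
    print(len(under), sum(labels[i] for i in under))
    print(len(over), sum(labels[i] for i in over))

    # A predictor that always outputs the majority class finds no
    # interface residues, so its F-measure is zero.
    print(f_measure(labels, [0] * len(labels)))
```

Under-sampling discards majority-class residues, while over-sampling repeats minority-class residues; both give a classifier a balanced training set, and the F-measure exposes a predictor that simply favours the majority class.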

DOI: http://dx.doi.org/10.1109/TCBB.2019.2953908

