Protein hotspot residues are key sites that mediate protein-protein interactions. Accurately identifying these residues is essential for understanding how proteins carry out their functions and for designing drug targets. Most current research uses machine learning to predict hotspots among known interface residues, with features of the amino acid residues manually extracted from sequence, structure, evolutionary, energy, and other information to train and test the models. This feature extraction is cumbersome, time-consuming, and laborious. This paper proposes a novel approach, called Embed-1dCNN, that combines a pre-trained protein sequence embedding model with a one-dimensional convolutional neural network to predict protein hotspot residues. To obtain a larger data sample, this work integrates and extracts data from the ASEdb, BID, SKEMPI, and dbMPIKT datasets to generate a new dataset, and applies the SMOTE algorithm to expand the positive samples and form the training set. The experimental results show that the method achieves an F1 score of 0.82 on the test set. Compared with other hotspot prediction methods, our model achieved better prediction performance.
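Two steps named in the abstract, SMOTE oversampling of the positive class and evaluation by F1 score, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names `smote_oversample` and `f1_score`, the neighbour count `k`, and the toy two-dimensional feature vectors are all assumptions made for illustration, and a real pipeline would operate on the residue embeddings rather than on hand-written points.

```python
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples, SMOTE-style.

    Each synthetic point is a random interpolation between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    (Hypothetical helper; k and seed are illustrative defaults.)
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy "hotspot" feature vectors (assumed data, not from the paper).
    hotspots = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    extra = smote_oversample(hotspots, n_new=3, k=2, seed=1)
    print(len(hotspots) + len(extra), "positive samples after oversampling")
    print("F1:", f1_score([1, 1, 0, 0], [1, 0, 0, 1]))
```

Because SMOTE synthesizes points from existing positives, it is normally applied to the training split only, which matches the abstract's statement that the expanded positives form the training set.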
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10506709 | PMC |
| http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0290899 | PLOS |