DNA-binding proteins play a prominent role in life activities such as DNA replication, recombination, and gene expression and regulation. As the number of known DNA-binding proteins continues to grow, an efficient and accurate identification tool is needed. Traditional experimental techniques are time-consuming and expensive, and computational methods based on structural information suffer from an insufficient number of samples. We therefore propose a machine learning algorithm based on sequence information to identify DNA-binding proteins, named multi-view Least Squares Support Vector Machine via Hilbert-Schmidt Independence Criterion (multi-view LSSVM via HSIC). The method takes six feature sets as multi-view input and trains a submodel on each single view with the LSSVM algorithm. HSIC is integrated into LSSVM as a regularization term to reduce the dependence between views and to exploit the complementary information of multiple views. The submodels are then trained and coordinated, and finally combined by weighting to obtain the final prediction model. On the training set PDB1075, the prediction results of our model were better than those of most existing methods. In independent tests on the PDB186 and PDB2272 datasets, the prediction accuracy was 85.5% and 79.36%, respectively. These results exceeded those of current state-of-the-art methods, showing that multi-view LSSVM via HSIC can be used as an efficient predictor.
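The abstract does not state the formulation, so the following is only a minimal sketch of the building blocks it names, under stated assumptions: the standard LSSVM classifier linear system, the biased empirical HSIC estimator tr(KHLH)/(n-1)^2, and a plain weighted sum of per-view decision values. The toy data, the RBF kernel, the hyperparameters, the weights, and all function names are hypothetical. The paper's joint optimization with HSIC as a regularization term is not reproduced here.

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    """RBF kernel matrix between rows of A and rows of B (assumed kernel choice)."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def lssvm_fit(K, y, reg=10.0):
    """Solve the standard LSSVM classifier system for (alpha, b).

    [ 0   y^T          ] [b    ]   [0]
    [ y   Omega + I/reg] [alpha] = [1],  Omega_ij = y_i y_j K_ij
    """
    n = len(y)
    M = np.zeros((n + 1, n + 1))
    M[0, 1:] = y
    M[1:, 0] = y
    M[1:, 1:] = np.outer(y, y) * K + np.eye(n) / reg
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(M, rhs)
    return sol[1:], sol[0]

def lssvm_decision(K_eval_train, y_train, alpha, b):
    """Decision values f(x) = sum_i alpha_i y_i K(x, x_i) + b."""
    return K_eval_train @ (alpha * y_train) + b

def hsic(K, L):
    """Biased empirical HSIC between two kernel matrices on the same samples."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

# Hypothetical toy data: two "views" (feature sets) of the same six samples.
y = np.array([-1.0, -1.0, -1.0, 1.0, 1.0, 1.0])
views = [
    np.array([[-2.0, -2.0], [-2.0, -1.5], [-1.5, -2.0],
              [2.0, 2.0], [2.0, 1.5], [1.5, 2.0]]),
    np.array([[-3.0], [-3.5], [-2.5], [3.0], [3.5], [2.5]]),
]

# One LSSVM submodel per view, then a weighted sum of decision values.
kernels = [rbf_kernel(X, X) for X in views]
models = [lssvm_fit(K, y) for K in kernels]
weights = [0.5, 0.5]  # assumed equal weights, for illustration only
combined = sum(
    w * lssvm_decision(K, y, alpha, b)
    for w, K, (alpha, b) in zip(weights, kernels, models)
)
predictions = np.sign(combined)

# Dependence between the two views, measured on their kernel matrices.
view_dependence = hsic(kernels[0], kernels[1])
```

In this sketch HSIC is only evaluated after training. A method that uses it as a regularization term would instead add such a dependence measure to the training objective so that the submodels are pushed toward complementary information.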

Source: http://dx.doi.org/10.1016/j.ymeth.2022.08.015
