Functional Site Discovery From Incomplete Training Data: A Case Study With Nucleic Acid-Binding Proteins.

Front Genet

SJTU-Yale Joint Center for Biostatistics and Data Science, Department of Bioinformatics and Biostatistics, College of Life Science and Biotechnology, Shanghai Jiao Tong University, Shanghai, China.

Published: August 2019

Function annotation efforts provide a foundation for our understanding of cellular processes and the functioning of the living cell. This motivates high-throughput computational methods to characterize new protein members of a particular function. Research work has focused on discriminative machine-learning methods, which promise to make efficient predictions of protein function. Furthermore, available function annotation exists predominantly for individual proteins rather than for residues, of which only a subset is necessary for the conveyance of a particular function. This limits discriminative approaches to predicting functions for which there is sufficient residue-level annotation, e.g., identification of DNA-binding proteins, or for which an excellent global representation can be divined. Complete understanding of the various functions of proteins requires discovery and functional annotation at the residue level. Herein, we cast this problem into the setting of multiple-instance learning, which only requires knowledge of the protein's function yet identifies functionally relevant residues and need not rely on homology. We developed a new multiple-instance learning algorithm derived from AdaBoost and benchmarked it on two well-studied protein function prediction tasks: annotating proteins that bind DNA and RNA. This algorithm outperforms certain previous approaches in annotating protein function while identifying functionally relevant residues involved in binding both DNA and RNA, and on one protein-DNA benchmark, it achieves near-perfect classification.
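The multiple-instance framing in the abstract can be illustrated with a minimal sketch. It is not the authors' AdaBoost-derived algorithm: it only assumes the standard multiple-instance rule that a bag (here, a protein) is positive if at least one of its instances (residues) is positive, so the bag score is the maximum residue score. The names `score_bag` and `toy_residue_score`, and the two-number residue features, are hypothetical and chosen purely for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Residue = Sequence[float]  # hypothetical per-residue feature vector
Bag = List[Residue]        # one protein is treated as one bag of residues


def score_bag(bag: Bag, residue_score: Callable[[Residue], float]) -> Tuple[float, int]:
    """Return (bag score, index of the top-scoring residue).

    Assumes the standard multiple-instance rule: a bag is positive if at
    least one instance is positive, so the bag score is the maximum
    instance score. The arg-max residue is the candidate functional site.
    """
    if not bag:
        raise ValueError("a bag must contain at least one residue")
    scores = [residue_score(residue) for residue in bag]
    best = max(range(len(scores)), key=scores.__getitem__)
    return scores[best], best


def toy_residue_score(residue: Residue) -> float:
    # Hypothetical stand-in for a learned residue-level scorer:
    # here it is simply the sum of the feature values.
    return sum(residue)


# Only protein-level labels would be needed for training; residue labels are not.
binder = [[0.1, 0.0], [0.9, 0.8], [0.2, 0.1]]  # toy "binding" protein
non_binder = [[0.1, 0.1], [0.0, 0.2]]          # toy "non-binding" protein

binder_score, binder_site = score_bag(binder, toy_residue_score)
non_binder_score, _ = score_bag(non_binder, toy_residue_score)
```

In this toy setup the second residue of `binder` drives the protein-level score, which is how a bag-level classifier can also point at candidate functional residues without any residue-level annotation.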

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6729729
DOI: http://dx.doi.org/10.3389/fgene.2019.00729

Similar Publications

SARS-CoV-2 membrane protein induces neurodegeneration via affecting Golgi-mitochondria interaction.

Transl Neurodegener

December 2024

Department of Neurosciences, Hengyang Medical School, University of South China, Hengyang, 421009, China.

Background: Neurological complications are a significant concern of Coronavirus Disease 2019 (COVID-19). However, the pathogenic mechanism of neurological symptoms associated with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection is poorly understood.

Methods: We used Drosophila as a model to systematically analyze SARS-CoV-2 genes encoding structural and accessory proteins and identified the membrane protein (M) that disrupted mitochondrial functions in vivo.

Background: Angelman syndrome (AS), a severe neurodevelopmental disorder resulting from the loss of the maternal UBE3A gene, is marked by changes in the brain's white matter (WM). The extent of WM abnormalities seems to correlate with the severity of clinical symptoms, but these deficits are still poorly characterized or understood. This study provides the first large-scale measurement of WM volume reduction in children with AS.

Central nervous system (CNS) resident memory CD8+ T cells (TRM) that express IFN-γ contribute to neurodegenerative processes, including synapse loss, leading to memory impairment. Here, we show that CCR2 signaling in CD8+ TRM cells that persist within the hippocampus after recovery from CNS infection with West Nile virus (WNV) significantly prevents the development of memory impairments. Using CCR2-deficient mice, we determined that CCR2 expression is not essential for CNS T cell recruitment or virologic control during acute WNV infection.

Background: Universal lipid screening in childhood for early detection and treatment of familial hypercholesterolemia is under discussion, but will also detect children with multifactorial dyslipidemia. Results from population-based studies can support the design of public health strategies. As few previous studies considered pubertal changes in serum lipid levels, we examined tracking of serum lipids from prepuberty to young adulthood in a population-based cohort.

Competing endogenous RNAs network dysregulation in oral cancer: a multifaceted perspective on crosstalk and competition.

Cancer Cell Int

December 2024

Laboratory of Stem Cell Regulation with Chinese Medicine and Its Application, School of Pharmacy, Hunan University of Chinese Medicine, Changsha, Hunan, 410208, China.

Oral cancer progresses from asymptomatic to advanced stages, often involving cervical lymph node metastasis, resistance to chemotherapy, and an unfavorable prognosis. Clarifying its potential mechanisms is vital for developing effective therapeutic strategies. Recent research suggests a substantial involvement of non-coding RNA (ncRNA) in the initiation and advancement of oral cancer.
