The selection of protein interaction documents is an important application for biology research and has a direct impact on the quality of downstream BioNLP applications, e.g., information extraction and retrieval, summarization, and question answering. The BioCreative II.5 Challenge Article Categorization Task (ACT) is a binary text classification task: determine whether a given structured full-text article contains protein interaction information. This may be the first attempt by the wider community at classifying full-text protein interaction documents. In this paper, we compare and evaluate the effectiveness of different section types in full-text articles for text classification. Moreover, in practice, the small number of true-positive samples leads to unstable performance and an unreliable classifier when one is trained on them. Previous research on learning with skewed class distributions has altered the class distribution using oversampling and downsampling. We also investigate skewed protein interaction classification and analyze the effect of several related choices, including external sources, oversampling of training sets, and classifiers. We report on these factors to show that 1) a full-text biomedical article contains a wealth of scientific information important to users that may not be completely represented by abstracts and/or keywords, and that this information improves classification accuracy, and 2) reinforcing true-positive samples significantly increases the accuracy and stability of classification.
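The oversampling mentioned in the abstract can be illustrated with a minimal sketch. It assumes plain random oversampling, in which minority-class samples are duplicated until every class matches the size of the largest one; the abstract does not say which oversampling scheme the authors used, and the function name `oversample_minority` and the toy document IDs are hypothetical.

```python
import random
from collections import Counter


def oversample_minority(samples, labels, seed=0):
    """Randomly duplicate samples of smaller classes until all classes
    have as many samples as the largest class.

    Assumption: simple random oversampling, not the paper's exact method.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())

    out_samples, out_labels = list(samples), list(labels)
    for cls, n in counts.items():
        # All original samples belonging to this class.
        pool = [s for s, y in zip(samples, labels) if y == cls]
        # Draw with replacement to fill the gap to the largest class.
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_samples.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_samples, out_labels


if __name__ == "__main__":
    # Toy skewed set: 1 = contains protein interaction information, 0 = does not.
    docs = ["doc-a", "doc-b", "doc-c", "doc-d", "doc-e", "doc-f"]
    labels = [1, 0, 0, 0, 0, 0]
    balanced_docs, balanced_labels = oversample_minority(docs, labels)
    print(Counter(balanced_labels))
```

Oversampling should be applied to the training split only, so that duplicated positives do not leak into the evaluation data.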
DOI: http://dx.doi.org/10.1109/TCBB.2010.49