The increase in publicly available bioactivity data in recent years has fueled research in chemogenomics, data mining, and modeling approaches. As a direct result, a multitude of different methods have been reported and evaluated over the past few years, such as target fishing, nearest neighbor similarity-based methods, and Quantitative Structure Activity Relationship (QSAR)-based protocols. However, such studies are typically conducted on different datasets, using different validation strategies and different metrics. In this study, different methods were compared using a single standardized dataset obtained from ChEMBL, which is made available to the public, and using standardized metrics (BEDROC and Matthews Correlation Coefficient). Specifically, the performance of Naïve Bayes, Random Forests, Support Vector Machines, Logistic Regression, and Deep Neural Networks was assessed using QSAR and proteochemometric (PCM) methods. All methods were validated using both a random split validation and a temporal validation, with the latter being a more realistic benchmark of expected prospective execution. Deep Neural Networks are the top performing classifiers, highlighting the added value of Deep Neural Networks over other more conventional methods. Moreover, the best method ('DNN_PCM') performed significantly better, scoring almost one standard deviation above the mean performance. Furthermore, multi-task and PCM implementations were shown to improve performance over single-task Deep Neural Networks. Conversely, target prediction performed almost two standard deviations below the mean performance. Random Forests, Support Vector Machines, and Logistic Regression performed around the mean. Finally, using an ensemble of DNNs, alongside additional tuning, enhanced the relative performance by another 27% (compared with unoptimized 'DNN_PCM').
Here, a standardized set to test and evaluate different machine learning algorithms in the context of multi-task learning is offered by providing the data and the protocols.
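The two validation strategies and the Matthews Correlation Coefficient named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' protocol: the record layout (a `year` field per data point), the cutoff year, and the test fraction are all hypothetical choices made for illustration only.

```python
import math
import random


def matthews_corrcoef(tp, fp, tn, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def random_split(records, test_fraction=0.3, seed=0):
    """Shuffle the records and hold out a fraction as the test set.

    The 0.3 default is an assumption, not a value taken from the study.
    """
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def temporal_split(records, cutoff_year):
    """Train on records published before the cutoff, test on the rest.

    The model is then evaluated only on data that is newer than
    anything it was trained on, which mimics prospective use.
    """
    train = [r for r in records if r["year"] < cutoff_year]
    test = [r for r in records if r["year"] >= cutoff_year]
    return train, test


# Hypothetical toy records: one entry per measured compound-target pair.
records = [{"id": i, "year": 2005 + i % 10, "active": i % 3 == 0}
           for i in range(100)]

train_r, test_r = random_split(records)
train_t, test_t = temporal_split(records, cutoff_year=2013)
```

With a random split, compounds from the same publication can land in both the training and the test set, which tends to inflate scores; the temporal split avoids this by construction, which is why the abstract describes it as the more realistic benchmark.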
Link | Source
---|---
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5555960 | PMC
http://dx.doi.org/10.1186/s13321-017-0232-0 | DOI Listing
J Hazard Mater
December 2024
Discipline of Chemistry, The University of Newcastle, University Drive, Newcastle, New South Wales 2308, Australia; School of Chemistry, Monash University, Wellington Road, Melbourne, Victoria 3800, Australia. Electronic address:
Microplastics are ubiquitous and appear to be harmful; however, the full extent of the harm they inflict has not been elucidated. Analysing environmental sample data is challenging, as the complexity of real data makes both automated and manual analysis either unreliable or time-consuming. To address these challenges, we explored a dense feed-forward neural network (DNN) for classifying Fourier transform infrared (FTIR) spectroscopic data.
Med Image Anal
January 2025
Department of Electrical and Computer Engineering, College of Information and Communication Engineering, Sungkyunkwan University, Suwon, 440-746, South Korea. Electronic address:
This study introduces HCC-Net, a novel wavelet-based approach for the accurate diagnosis of hepatocellular carcinoma (HCC) from abdominal ultrasound (US) images using artificial neural networks. The HCC-Net integrates the discrete wavelet transform (DWT) to decompose US images into four sub-band images, a lesion detector for hierarchical lesion localization, and a pattern-augmented classifier for generating pattern-enhanced lesion images and subsequent classification. The lesion detection uses a hierarchical coarse-to-fine approach to minimize missed lesions.
Neural Netw
January 2025
Tsinghua University, Beijing, China. Electronic address:
Artificial neural networks (ANNs) can help camera-based remote photoplethysmography (rPPG) in measuring cardiac activity and physiological signals from facial videos, such as pulse wave, heart rate and respiration rate with better accuracy. However, most existing ANN-based methods require substantial computing resources, which poses challenges for effective deployment on mobile devices. Spiking neural networks (SNNs), on the other hand, hold immense potential for energy-efficient deep learning owing to their binary and event-driven architecture.
Neural Netw
January 2025
College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin, 150001, China. Electronic address:
Graph neural networks (GNNs) have shown great promise in modeling graph-structured data, but the over-smoothing problem restricts their effectiveness in deep layers. Two key weaknesses of existing research on deep GNN models are: (1) ignoring the beneficial aspects of intra-class smoothing while focusing solely on reducing inter-class smoothing, and (2) inefficient computation of residual weights that neglect the influence of neighboring nodes' distributions. To address these weaknesses, we propose a novel Smoothing Deceleration (SD) strategy to reduce the smoothing speed rate of nodes as information propagates between layers, thereby mitigating over-smoothing.
Int J Med Inform
January 2025
University of Coimbra, Faculty of Medicine, Coimbra, Portugal; Department of Gastroenterology, Centro Hospitalar e Universitário de Coimbra, Coimbra, Portugal. Electronic address:
Background: The wireless capsule endoscope (CE) is a valuable diagnostic tool in gastroenterology, offering a safe and minimally invasive visualization of the gastrointestinal tract. One of the few drawbacks identified by the gastroenterology community is the time-consuming task of analyzing CE videos.
Objectives: This article investigates the feasibility of a computer-aided diagnostic method to speed up CE video analysis.