Shallow Representation Learning via Kernel PCA Improves QSAR Modelability.

J Chem Inf Model

Department of Bioengineering, Stanford University, Shriram Center, Room 213, 443 Via Ortega MC 4245, Stanford, California 94305, United States.

Published: August 2017

Linear models offer a robust, flexible, and computationally efficient set of tools for modeling quantitative structure-activity relationships (QSARs) but have been eclipsed in performance by nonlinear methods. Support vector machines (SVMs) and neural networks are currently among the most popular and accurate QSAR methods because they learn new representations of the data that greatly improve modelability. In this work, we use shallow representation learning to improve the accuracy of L1 regularized logistic regression (LASSO) and meet the performance of Tanimoto SVM. We embedded chemical fingerprints in Euclidean space using Tanimoto (a.k.a. Jaccard) similarity kernel principal component analysis (KPCA) and compared the effects on LASSO and SVM model performance for predicting the binding activities of chemical compounds against 102 virtual screening targets. We observed similar performance and patterns of improvement for LASSO and SVM. We also empirically measured model training and cross-validation times to show that KPCA used in concert with LASSO classification is significantly faster than linear SVM over a wide range of training set sizes. Our work shows that powerful linear QSAR methods can match nonlinear methods and demonstrates a modular approach to nonlinear classification that greatly enhances QSAR model prototyping facility, flexibility, and transferability.
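
The embedding step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `tanimoto_kernel` and `kpca_embed`, the toy fingerprints, and the convention that two all-zero fingerprints have similarity 1 are all assumptions made for the example. It computes a Tanimoto similarity matrix over binary fingerprints, centers it, and takes the leading eigenvectors to obtain Euclidean coordinates that a linear classifier could then use.

```python
import numpy as np

def tanimoto_kernel(A, B):
    """Tanimoto (Jaccard) similarity between rows of binary matrices A and B."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    inter = A @ B.T                         # shared on-bits for each pair
    union = A.sum(axis=1)[:, None] + B.sum(axis=1)[None, :] - inter
    safe_union = np.where(union > 0, union, 1.0)
    # Assumption: two all-zero fingerprints are treated as identical (similarity 1).
    return np.where(union > 0, inter / safe_union, 1.0)

def kpca_embed(K, n_components):
    """Embed training points from a kernel matrix K via kernel PCA."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n     # centering matrix
    Kc = H @ K @ H                          # kernel centered in feature space
    eigvals, eigvecs = np.linalg.eigh(Kc)   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    lam = np.clip(eigvals[order], 0.0, None)  # guard against tiny negative values
    # Rows are compounds; columns are principal-component coordinates.
    return eigvecs[:, order] * np.sqrt(lam)

# Hypothetical toy fingerprints: two pairs of structurally similar compounds.
fps = np.array([
    [1, 1, 0, 0, 1, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 1],
    [0, 0, 1, 1, 0, 0],
])

K = tanimoto_kernel(fps, fps)
Z = kpca_embed(K, n_components=4)
print(np.round(K, 3))
print(Z.shape)
```

The rows of `Z` are ordinary real-valued feature vectors, so any L1-regularized logistic regression implementation can be trained on them directly; this is the modularity the abstract refers to. The sketch covers only the training set. Projecting new compounds would additionally require centering their kernel values against the training kernel, which is omitted here.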

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5942586
DOI: http://dx.doi.org/10.1021/acs.jcim.6b00694
