To date, no systematic study has assessed the effect of random experimental errors on the predictive power of QSAR models. To address this gap, we benchmarked the noise sensitivity of 12 learning algorithms on 12 data sets (15,840 models in total), namely: Support Vector Machines (SVM) with radial and polynomial (Poly) kernels, Gaussian Processes (GP) with radial and polynomial kernels, Relevance Vector Machines (radial kernel), Random Forest (RF), Gradient Boosting Machines (GBM), Bagged Regression Trees, Partial Least Squares, and k-Nearest Neighbors. Model performance on the test set was used as a proxy for the relative noise sensitivity of these algorithms as a function of the level of simulated noise added to the bioactivities in the training set. The noise was simulated by sampling from Gaussian distributions with increasingly large variances, ranging from zero to the range of pIC50 values spanned by a given data set. General trends were identified with a full-factorial experimental design, which was analyzed with a normal linear model. Overall, GBM displayed low noise tolerance, although its performance was comparable to that of RF, SVM Radial, SVM Poly, GP Poly, and GP Radial at low noise levels. Of practical relevance, we show that the bag fraction parameter has a marked influence on the noise sensitivity of GBM, suggesting that low values (e.g., 0.1-0.2) should be used for this parameter when modeling noisy data. The remaining 11 algorithms display comparable noise tolerance, with a smooth, linear degradation of model performance as the noise level increases. However, SVM Poly and GP Poly display significant noise sensitivity at high noise levels in some cases. Overall, these results provide a practical guide for making informed decisions about which algorithm and parameter values to use given the noise level present in the data.
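The noise-simulation protocol described in the abstract can be illustrated with a minimal sketch. It is not the authors' code. It assumes a toy one-dimensional descriptor, synthetic pIC50-like responses, and a small hand-written k-nearest-neighbors regressor standing in for the benchmarked algorithms. All function names (`add_gaussian_noise`, `knn_predict`, `noise_sweep`) and the number of noise levels are hypothetical choices made for illustration. Only the idea of adding zero-mean Gaussian noise to the training responses, with variance growing from zero to the range of the responses, and scoring on an untouched test set, is taken from the text.

```python
import math
import random


def add_gaussian_noise(values, sd, rng):
    """Return a copy of `values` with zero-mean Gaussian noise (std dev `sd`) added."""
    return [v + rng.gauss(0.0, sd) for v in values]


def knn_predict(train_x, train_y, query, k=3):
    """Average the responses of the k training points nearest to `query` (1-D descriptor)."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - query))[:k]
    return sum(train_y[i] for i in nearest) / k


def rmse(predicted, observed):
    """Root-mean-square error between two equally long sequences."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed))


def noise_sweep(train_x, train_y, test_x, test_y, n_levels=5, seed=0):
    """Train on increasingly noisy responses and report test-set RMSE per noise level.

    The noise variance runs from 0 up to the range of the training responses.
    Noise is added to the training set only; the test set stays clean.
    """
    rng = random.Random(seed)
    value_range = max(train_y) - min(train_y)
    results = []
    for level in range(n_levels):
        variance = value_range * level / (n_levels - 1)
        noisy_y = add_gaussian_noise(train_y, math.sqrt(variance), rng)
        predictions = [knn_predict(train_x, noisy_y, x) for x in test_x]
        results.append((variance, rmse(predictions, test_y)))
    return results


# Synthetic stand-in data (assumption): responses depend linearly on one descriptor.
train_x = [i / 49 for i in range(50)]
train_y = [5.0 + 3.0 * x for x in train_x]
test_x = [(i + 0.5) / 49 for i in range(49)]
test_y = [5.0 + 3.0 * x for x in test_x]

results = noise_sweep(train_x, train_y, test_x, test_y)
for variance, error in results:
    print(f"noise variance {variance:.2f} -> test RMSE {error:.3f}")
```

Repeating such a sweep for each algorithm, data set, and parameter setting would yield the kind of full-factorial table of test-set errors that the abstract describes analyzing with a linear model.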
DOI: http://dx.doi.org/10.1021/acs.jcim.5b00101
Sci Rep
January 2025
Institute for System Dynamics, University of Stuttgart, Waldburgstr. 19, 70563, Stuttgart, Germany.
Including sensor information in medical interventions aims to help surgeons decide on subsequent steps by characterizing tissue intraoperatively. In bladder cancer, an important issue is tumor recurrence caused by failure to remove the entire tumor. Impedance measurements can help to classify bladder tissue and give surgeons an indication of how much tissue to remove.
Int J Pharm
January 2025
Department of Chemical and Biochemical Engineering, Rutgers University, Piscataway, NJ, 08854, USA; Center for Structured Organic Particulate Systems (C-SOPS), Cranbury, NJ, 08512, USA.
This study used Raman and near-infrared (NIR) spectroscopy to monitor small real-time changes in powder blends and tablets in low-dose pharmaceutical formulations. The research aims to enhance process analytical technology (PAT) in pharmaceutical manufacturing, ensuring high-quality and uniform products, with applications to the production of drugs with narrow therapeutic indices (NTI). The study uses Raman and NIR spatially resolved spectroscopy (SRS) techniques to monitor the active pharmaceutical ingredient (API) concentration of a moderately cohesive material during manufacturing.
Anal Biochem
January 2025
Key Laboratory of Green and Precise Synthetic Chemistry and Applications, Ministry of Education, Anhui Provincial Key Laboratory of Synthetic Chemistry and Applications, College of Chemistry and Materials Science, Huaibei Normal University, Huaibei, Anhui 235000, PR China.
Luminol-loaded mesoporous carbon nanospheres (MCs@LU) were utilized to develop a highly sensitive electrochemiluminescence (ECL) sensor for the detection of L-cysteine (L-Cys). L-Cys acted as the coreactant of luminol, and the pore confinement effect of mesoporous carbons (MCs) resulted in a robust ECL signal. Upon optimization, a linear correlation between the ECL intensity and L-Cys concentration was observed over the range of 5.
Phys Med Biol
January 2025
National Institute of Radiological Sciences, 4-9-1 Anagawa, Inage-ku, Chiba, 263-8555, JAPAN.
PET has become an important clinical modality but is limited to imaging positron emitters. Recently, PET imaging with ⁸⁹Zr, which has a half-life of 3 days, has attracted much attention in immuno-PET to visualize immune cells and cancer cells by targeting specific antibodies on the cell surface. However, ⁸⁹Zr emits a single gamma ray at 909 keV four times more frequently than positrons, causing image quality degradation in conventional PET.
Phys Rev Lett
December 2024
Vienna Center for Quantum Science and Technology, Atominstitut, TU Wien, 1020 Vienna, Austria.