In high-dimensional quantitative structure-activity relationship (QSAR) modelling, penalization methods are a popular way to perform molecular descriptor selection and QSAR model estimation simultaneously. In this study, a penalized linear regression model with the L1/2-norm is proposed. The local linear approximation algorithm is used to handle the non-convexity of the L1/2 penalty. The proposed method is tested on several benchmark data sets. Compared with other commonly used penalized methods, the proposed method not only achieves the best predictive ability but also yields an easily interpretable QSAR model. In addition, the applicability domain and Y-randomization test results indicate that the resulting QSAR model is efficient and robust. The results suggest that the proposed method is a promising penalized method for computational chemistry research, especially when the number of molecular descriptors exceeds the number of compounds.
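
The idea of fitting an L1/2-penalized linear model through local linear approximation (LLA) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the objective scaling (1/(2n) times the squared error plus lam times the sum of |b_j|^(1/2)), the lasso warm start, the coordinate-descent inner solver, the small constant eps, the absence of an intercept, and every function and variable name are assumptions made for the example. Each LLA step replaces the non-convex penalty with a weighted L1 penalty whose weights are the penalty's derivative at the current estimate, so each step is a convex weighted lasso problem.

```python
import numpy as np


def soft_threshold(z, t):
    """Scalar soft-thresholding operator."""
    return np.sign(z) * max(abs(z) - t, 0.0)


def weighted_lasso_cd(X, y, weights, beta0, n_sweeps=200, tol=1e-8):
    """Coordinate descent for (1/(2n))*||y - X b||^2 + sum_j weights[j]*|b_j|."""
    n, p = X.shape
    beta = beta0.copy()
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y - X @ beta
    for _ in range(n_sweeps):
        max_change = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            old = beta[j]
            # Partial correlation of descriptor j with the residual,
            # with descriptor j's own contribution added back.
            rho = X[:, j] @ resid / n + col_sq[j] * old
            new = soft_threshold(rho, weights[j]) / col_sq[j]
            if new != old:
                resid -= X[:, j] * (new - old)
                beta[j] = new
                max_change = max(max_change, abs(new - old))
        if max_change < tol:
            break
    return beta


def l_half_lla(X, y, lam, n_lla=20, eps=1e-6, tol=1e-6):
    """LLA iterations for the L1/2 penalty lam * sum_j |b_j|^(1/2)."""
    p = X.shape[1]
    # Assumed warm start: an ordinary lasso fit with the same lam.
    beta = weighted_lasso_cd(X, y, np.full(p, lam), np.zeros(p))
    for _ in range(n_lla):
        # Derivative of lam*|b|^(1/2) at the current estimate;
        # eps keeps the weight finite where a coefficient is zero.
        weights = 0.5 * lam / np.sqrt(np.abs(beta) + eps)
        new_beta = weighted_lasso_cd(X, y, weights, beta)
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta


# Synthetic example with more descriptors (100) than compounds (40).
rng = np.random.default_rng(0)
X = rng.standard_normal((40, 100))
true_beta = np.zeros(100)
true_beta[:3] = [3.0, -2.0, 1.5]
y = X @ true_beta + 0.1 * rng.standard_normal(40)

beta_hat = l_half_lla(X, y, lam=0.3)
selected = np.flatnonzero(beta_hat)  # indices of retained descriptors
```

In practice the penalty strength lam would be chosen by cross-validation, and the descriptors would usually be standardized before fitting; both steps are omitted here for brevity.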

Source
http://dx.doi.org/10.1080/1062936X.2016.1228696

Publication Analysis

Top Keywords

proposed method (16)
qsar model (12)
qsar modelling (8)
penalized linear (8)
linear regression (8)
regression model (8)
model l1/2-norm (8)
model (5)
proposed (5)
method (5)

Similar Publications

The identification of neoantigens is crucial for advancing vaccines, diagnostics, and immunotherapies. Despite this importance, a fundamental question remains: how to model the presentation of neoantigens by major histocompatibility complex class I molecules and the recognition of the peptide-MHC-I (pMHC-I) complex by T cell receptors (TCRs). Accurate prediction of pMHC-I binding and TCR recognition remains a significant computational challenge in immunology due to intricate binding motifs and the long-tail distribution of known binding pairs in public databases.

A comprehensive benchmarking for evaluating TCR embeddings in modeling TCR-epitope interactions.

Brief Bioinform

November 2024

Department of Computer Science, City University of Hong Kong, 83 Tat Chee Avenue, Kowloon Tong, Hong Kong, 999077, China.

The complexity of T cell receptor (TCR) sequences, particularly within the complementarity-determining region 3 (CDR3), requires efficient embedding methods for applying machine learning to immunology. While various TCR CDR3 embedding strategies have been proposed, the absence of systematic evaluations has created confusion in the community. Here, we extracted CDR3 embedding models from 19 existing methods and benchmarked these models on four curated datasets by assessing their impact on the performance of TCR downstream tasks, including TCR-epitope binding affinity prediction, epitope-specific TCR identification, TCR clustering, and visualization analysis.

Background: Nasal high flow (NHF) has been proposed to sustain high intensity exercise in people with COPD, but we have a poor understanding of its physiological effects in this clinical setting.

Research Question: What is the effect of NHF during exercise on dynamic respiratory muscle function and activation, cardiorespiratory parameters, endurance capacity, dyspnoea and leg fatigue as compared to a control intervention?

Study Design And Methods: Randomized single-blind crossover trial including COPD patients.

Risk assessment of persistent incidental pulmonary subsolid nodules to guide appropriate surveillance interval and endpoints.

Pulmonology

December 2025

Department of Diagnostic Radiology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China.

Guidelines for the follow-up of pulmonary subsolid nodules (SSNs) vary in terms of frequency and criteria for discontinuation. We aimed to evaluate the growth risk of SSNs and define appropriate follow-up intervals and endpoints. The immediate risk (IR) and cumulative risk (CR) of SSN growth were assessed using the Kaplan-Meier method according to nodule consistency and size.

Background: The implementation of large language models (LLMs), such as BART (Bidirectional and Auto-Regressive Transformers) and GPT-4, has revolutionized the extraction of insights from unstructured text. These advancements have expanded into health care, allowing analysis of social media for public health insights. However, the detection of drug discontinuation events (DDEs) remains underexplored.
