Quantitative structure-activity relationship (QSAR) modeling is an effective computational technique for drug design that relates the chemical structures of compounds to their biological activities. Feature selection is an important step in QSAR-based drug design, used to select the most relevant descriptors. One of the most popular feature selection methods for classification problems is the Fisher score, which favors features that minimize the within-class distance and maximize the between-class distance. In this study, the properties of the Fisher criterion were extended to QSAR models by defining new distance metrics based on the continuous activity values of compounds with known activities. A semi-supervised feature selection method was then proposed that combines the Fisher and Laplacian criteria and exploits compounds with both known and unknown activities to select the relevant descriptors. To demonstrate the efficiency of the proposed method, we applied it and other feature selection methods to three QSAR data sets: serine/threonine-protein kinase PLK3 inhibitors, ROCK inhibitors, and phenol compounds. The QSAR models built on the descriptors selected by the proposed semi-supervised method performed better than the other models, indicating that the method selects relevant descriptors efficiently by using compounds with both known and unknown activities. These results show that using compounds with known and unknown activities together can improve the performance of feature selection methods based on the combined Fisher and Laplacian criteria.
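The abstract does not give the paper's formulas, so the following is only a minimal sketch of the two classical ingredients it names: the class-based Fisher score (computed on labeled compounds) and the standard Laplacian score (computed on all compounds, labeled or not). The paper's extension of the Fisher criterion to continuous activity values and its actual combination rule are not reproduced here. The function `combined_ranking`, its weight `alpha`, and the parameters `n_neighbors` and `t` are hypothetical choices made for illustration.

```python
import numpy as np


def fisher_score(X, y):
    """Classical Fisher score per feature (higher = more discriminative)."""
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for label in np.unique(y):
        Xc = X[y == label]
        # Between-class scatter: class mean vs. overall mean, weighted by class size.
        between += len(Xc) * (Xc.mean(axis=0) - overall_mean) ** 2
        # Within-class scatter: class variance, weighted by class size.
        within += len(Xc) * Xc.var(axis=0)
    return between / (within + 1e-12)


def laplacian_score(X, n_neighbors=5, t=1.0):
    """Standard Laplacian score per feature (lower = better locality preservation).

    Requires n_neighbors < number of rows in X.
    """
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    dist2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)

    # k-nearest-neighbor graph with heat-kernel weights (self excluded).
    masked = dist2.copy()
    np.fill_diagonal(masked, np.inf)
    neighbors = np.argsort(masked, axis=1)[:, :n_neighbors]
    rows = np.repeat(np.arange(n), n_neighbors)
    cols = neighbors.ravel()
    S = np.zeros((n, n))
    S[rows, cols] = np.exp(-dist2[rows, cols] / t)
    S = np.maximum(S, S.T)  # symmetrize the graph

    d = S.sum(axis=1)            # node degrees
    L = np.diag(d) - S           # graph Laplacian
    Xc = X - (d @ X) / d.sum()   # degree-weighted centering of each feature
    numerator = np.sum(Xc * (L @ Xc), axis=0)
    denominator = np.sum(Xc * (d[:, None] * Xc), axis=0)
    return numerator / (denominator + 1e-12)


def combined_ranking(X_labeled, y_labeled, X_all, alpha=0.5):
    """Hypothetical combination (NOT the paper's rule): weighted sum of
    min-max scaled Fisher scores and inverted Laplacian scores."""
    def scale(v):
        return (v - v.min()) / (v.max() - v.min() + 1e-12)

    fisher = scale(fisher_score(X_labeled, y_labeled))
    laplacian = scale(laplacian_score(X_all))
    score = alpha * fisher + (1.0 - alpha) * (1.0 - laplacian)
    return np.argsort(-score)  # feature indices, best first
```

In this sketch, `X_labeled` and `y_labeled` would hold descriptors and class labels for compounds with known activities, while `X_all` would stack descriptors of compounds with known and unknown activities; the returned indices order the descriptors from most to least relevant under the assumed weighting.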
DOI: http://dx.doi.org/10.1007/s10822-017-0094-6
JMIR Med Inform
January 2025
Department of Endocrinology and Metabolism, The First Affiliated Hospital, Jiangxi Medical College, Nanchang University, Nanchang, China.
Background: Many tools have been developed to predict the risk of diabetes in a population without diabetes; however, these tools have shortcomings that include the omission of race, inclusion of variables that are not readily available to patients, and low sensitivity or specificity.
Objective: We aimed to develop and validate an easy, systematic index for predicting diabetes risk in the Asian population.
Methods: We collected the data from the NAGALA (NAfld [nonalcoholic fatty liver disease] in the Gifu Area, Longitudinal Analysis) database.
Parasit Vectors
January 2025
Faculty of Information Technology, Mutah University, Mutah, Jordan.
Background: Amebiasis represents a significant global health concern. This is especially evident in developing countries, where infections are more common. The primary diagnostic method in laboratories involves the microscopy of stool samples.
BMC Pregnancy Childbirth
January 2025
Department of Obstetrics and Gynecology, Division of Maternal-Fetal Medicine, University of Utah Health, 30 N. Mario Capecchi Dr., Level 5 South, Salt Lake City, UT, 84132, USA.
Background: Fetal growth restriction (FGR) is a leading risk factor for stillbirth, yet the diagnosis of FGR confers considerable prognostic uncertainty, as most infants with FGR do not experience any morbidity. Our objective was to use data from a large, deeply phenotyped observational obstetric cohort to develop a probabilistic graphical model (PGM), a type of "explainable artificial intelligence (AI)", as a potential framework to better understand how interrelated variables contribute to perinatal morbidity risk in FGR.
Methods: Using data from 9,558 pregnancies delivered at ≥ 20 weeks with available outcome data, we derived and validated a PGM using randomly selected sub-cohorts of 80% (n = 7,645) and 20% (n = 1,912), respectively, to discriminate cases of FGR resulting in composite perinatal morbidity from those that did not.
BMC Public Health
January 2025
Department of Statistics and Data Science, Jahangirnagar University, Dhaka, 1342, Bangladesh.
Background: Child mortality is a reliable and significant indicator of a nation's health. Although the child mortality rate in Bangladesh is declining over time, it needs to drop further to meet the Sustainable Development Goals (SDGs). Machine learning models are among the best tools for making more accurate and efficient forecasts and gaining in-depth knowledge.
BMC Plant Biol
January 2025
Plant Production Department, College of Food and Agricultural Sciences, King Saud University, P.O. Box. 2460, Riyadh, 11451, Saudi Arabia.
Background: The present research work was done to evaluate the anatomical differences among selected species of the family Bignoniaceae, as limited anatomical data is available for this family in Pakistan. Bignoniaceae is a remarkable family for its various medicinal properties and anatomical characterization is an important feature for the identification and classification of plants.
Methodology: In this study, several anatomical structures were examined, including stomata type and shape, leaf epidermis shape, epidermal cell size, and the presence or absence of trichomes and crystals (e.