In this paper we define the concept of the Machine Learning Morphism (MLM) as a fundamental building block to express operations performed in machine learning such as data preprocessing, feature extraction, and model training. Inspired by statistical learning, MLMs are morphisms whose parameters are chosen by minimizing a risk function. We explore operations such as composition of MLMs and the conditions under which sets of MLMs form a vector space. These operations are used to build a machine learning workflow from data preprocessing to final task completion. We examine the Mapper Algorithm from Topological Data Analysis as an MLM, and build several workflows for binary classification incorporating Mapper on Hospital Readmissions and Credit Evaluation datasets. The advantage of this framework lies in the ability to easily build, organize, and compare multiple workflows, and to jointly optimize parameters across multiple steps in an application.
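The composition and joint-optimization ideas in the abstract can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not taken from the paper: the `MLM` class, the `compose`, `empirical_risk`, and `fit` helpers, the squared-error risk, and the finite-difference gradient descent are all hypothetical stand-ins. The sketch composes a one-parameter preprocessing map with a two-parameter affine model and minimizes a single empirical risk over all three parameters at once.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class MLM:
    """Hypothetical stand-in for an MLM: a map x -> apply(x, theta)."""
    apply: Callable[[float, Sequence[float]], float]
    n_params: int


def compose(g: MLM, f: MLM) -> MLM:
    """Return 'g after f'; parameters are f's followed by g's."""
    def apply(x: float, theta: Sequence[float]) -> float:
        return g.apply(f.apply(x, theta[:f.n_params]), theta[f.n_params:])
    return MLM(apply, f.n_params + g.n_params)


def empirical_risk(m: MLM, theta: Sequence[float],
                   xs: Sequence[float], ys: Sequence[float]) -> float:
    """Mean squared error (an assumed choice of risk)."""
    return sum((m.apply(x, theta) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit(m: MLM, xs: Sequence[float], ys: Sequence[float],
        steps: int = 2000, lr: float = 0.05, eps: float = 1e-6) -> List[float]:
    """Jointly minimize the risk over all parameters of the composed map,
    using gradient descent with central finite differences."""
    theta = [0.5] * m.n_params
    for _ in range(steps):
        grad = []
        for i in range(len(theta)):
            up = theta[:i] + [theta[i] + eps] + theta[i + 1:]
            down = theta[:i] + [theta[i] - eps] + theta[i + 1:]
            grad.append((empirical_risk(m, up, xs, ys)
                         - empirical_risk(m, down, xs, ys)) / (2 * eps))
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return theta


# Preprocessing step (centering) and model step (affine map), both hypothetical.
center = MLM(lambda x, th: x - th[0], 1)
affine = MLM(lambda x, th: th[0] * x + th[1], 2)
workflow = compose(affine, center)

# Synthetic data generated from y = 2x + 1.
xs = [i / 10 - 1.0 for i in range(21)]
ys = [2.0 * x + 1.0 for x in xs]

risk_before = empirical_risk(workflow, [0.5] * workflow.n_params, xs, ys)
theta_hat = fit(workflow, xs, ys)
risk_after = empirical_risk(workflow, theta_hat, xs, ys)
```

Because the composed object is itself an `MLM` with a single parameter vector, the preprocessing parameter and the model parameters are fitted together rather than one step at a time, which is the kind of joint optimization the abstract refers to.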
| Link | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6886815 | PMC |
| http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0225577 | PLOS |
HGG Adv
January 2025
Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA.
Inherited genetics represents an important contributor to risk of esophageal adenocarcinoma (EAC), and its precursor Barrett's esophagus (BE). Genome-wide association studies have identified ∼30 susceptibility variants for BE/EAC, yet genetic interactions remain unexamined. To address challenges in large-scale G×G scans, we combined knowledge-guided filtering and machine learning approaches, focusing on genes with (A) known/plausible links to BE/EAC pathogenesis (n=493) or (B) prior evidence of biological interactions (n=4,196).
Sci Rep
January 2025
Department of ECE, Kallam Haranadhareddy Institute of Technology, Guntur, Andhra Pradesh, India.
Cognitive load stimulates neural activity, essential for understanding the brain's response to stress-inducing stimuli or mental strain. This study examines the feasibility of evaluating cognitive load by extracting, selecting, and classifying features from electroencephalogram (EEG) signals. We employed robust local mean decomposition (R-LMD) to decompose EEG data from each channel, recorded over a four-second period, into five modes.
Sci Rep
January 2025
Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, 700 032, India.
We have adopted the classification Read-Across Structure-Activity Relationship (c-RASAR) approach in the present study for machine-learning (ML)-based model development from a recently reported curated dataset of nephrotoxicity potential of orally active drugs. We initially developed ML models using nine different algorithms separately on topological descriptors (referred to as simply "descriptors" in the subsequent sections of the manuscript) and MACCS fingerprints (referred to as "fingerprints" in the subsequent sections of the manuscript), thus generating 18 different ML QSAR models. Using the chemical spaces defined by the modeling descriptors and fingerprints, the similarity and error-based RASAR descriptors were computed, and the most discriminating RASAR descriptors were used to develop another set of 18 different ML c-RASAR models.
Sci Rep
January 2025
Crop and Horticultural Science Research Department, Mazandaran Agricultural Resources Research and Education Center, Agricultural Research, Education and Extension Organization (AREEO), Tajrish, Iran.
Plum fruit fresh weight (FW) estimation is crucial for various agricultural practices, including yield prediction, quality control, and market pricing. Traditional methods for estimating fruit weight are often destructive, time-consuming, and labor-intensive. In this study, we addressed the problem of predicting plum FW using artificial intelligence (AI) methods based on fruit dimensions.
Apoptosis
January 2025
Department of Pathology, Fudan University Shanghai Cancer Center, Shanghai, China.
Cancer-associated fibroblasts (CAFs) significantly influence tumor progression and therapeutic resistance in colorectal cancer (CRC). However, the distributions and functions of CAF subpopulations vary across the four consensus molecular subtypes (CMSs) of CRC. This study performed single-cell RNA and bulk RNA sequencing and revealed that myofibroblast-like CAFs (myCAFs), tumor-like CAFs (tCAFs), inflammatory CAFs (iCAFs), CXCL14CAFs, and MTCAFs are notably enriched in CMS4 compared with other CMSs of CRC.