AI Article Synopsis

  • Regression models are crucial for analyzing or predicting quantitative outcomes based on various independent variables, but automated methods often struggle with integrating domain knowledge and identifying complex relationships.
  • This paper introduces a new framework that enhances regression model building by combining qualitative visual analyses with quantitative relevance rankings of features, allowing for a clearer understanding of variable interactions.
  • The framework's interactive workflow assists in tasks like validation and comparison, demonstrated through a case study on natural gas consumption, which received positive feedback from experts in the energy sector for improving the model-building process.

Article Abstract

Regression models play a key role in many application domains for analyzing or predicting a quantitative dependent variable based on one or more independent variables. Automated approaches for building regression models are typically limited with respect to incorporating domain knowledge in the process of selecting input variables (also known as feature subset selection). Other limitations include the identification of local structures, transformations, and interactions between variables. The contribution of this paper is a framework for building regression models addressing these limitations. The framework combines a qualitative analysis of relationship structures by visualization and a quantification of relevance for ranking any number of features and pairs of features which may be categorical or continuous. A central aspect is the local approximation of the conditional target distribution by partitioning 1D and 2D feature domains into disjoint regions. This enables a visual investigation of local patterns and largely avoids structural assumptions for the quantitative ranking. We describe how the framework supports different tasks in model building (e.g., validation and comparison), and we present an interactive workflow for feature subset selection. A real-world case study illustrates the step-wise identification of a five-dimensional model for natural gas consumption. We also report feedback from domain experts after two months of deployment in the energy sector, indicating a significant effort reduction for building and improving regression models.
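
The partition-based ranking idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's actual relevance measure, which the abstract does not specify. The sketch assumes equal-frequency bins as the disjoint regions of a 1D feature domain and scores a feature by the fraction of target variance explained by the per-region means. The function names, the bin count, and the synthetic "temperature" and "consumption" data are all hypothetical and serve only as illustration.

```python
import numpy as np


def partition_relevance_1d(x, y, n_bins=8):
    """Score a feature by how well a piecewise-constant model on its
    partitioned domain explains y (assumed measure, not the paper's)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Assumption: equal-frequency partitioning of the 1D feature domain.
    edges = np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1))
    # Map every sample to a region index in 0..n_bins-1 via the inner edges.
    region = np.searchsorted(edges[1:-1], x, side="right")

    total = np.sum((y - y.mean()) ** 2)
    if total == 0.0:
        return 0.0  # constant target: nothing to explain

    # Sum of squared deviations from each region's local mean.
    within = 0.0
    for r in np.unique(region):
        y_r = y[region == r]
        within += np.sum((y_r - y_r.mean()) ** 2)

    return 1.0 - within / total


def rank_features(features, y, n_bins=8):
    """Return (name, score) pairs sorted from most to least relevant."""
    scores = {
        name: partition_relevance_1d(values, y, n_bins)
        for name, values in features.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Synthetic illustration data (made up, not taken from the case study).
rng = np.random.default_rng(0)
n = 2000
temperature = rng.uniform(-10.0, 30.0, n)
noise_feature = rng.normal(size=n)
# Hypothetical nonlinear dependency: consumption rises as it gets colder.
consumption = np.maximum(0.0, 18.0 - temperature) + rng.normal(scale=1.0, size=n)

ranking = rank_features(
    {"temperature": temperature, "noise_feature": noise_feature},
    consumption,
)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Because the score relies only on local means within each region, it captures the kinked relationship in the synthetic data without assuming a linear or other global form. The same idea would extend to pairs of features by partitioning a 2D domain into a grid of regions.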

Source
http://dx.doi.org/10.1109/TVCG.2013.125

Publication Analysis

Top Keywords

regression models (20)
framework building (8)
building regression (8)
feature subset (8)
subset selection (8)
building (5)
regression (5)
models (5)
partition-based framework (4)
building validating (4)

Similar Publications

Background: Few studies have explored the relationship between macronutrient intake and sleep outcomes using daily data from mobile apps.

Objective: This cross-sectional study aimed to examine the associations between macronutrients, dietary components, and sleep parameters, considering their interdependencies.

Methods: We analyzed data from 4825 users of the Pokémon Sleep and Asken smartphone apps, each used for at least 7 days to record objective sleep parameters and dietary components, respectively.

Background: Primary intracranial germ cell tumors (iGCTs) are highly malignant brain tumors that predominantly occur in children and adolescents, with an incidence rate ranking third among primary brain tumors in East Asia (8%-15%). Due to their insidious onset and impact on critical functional areas of the brain, these tumors often result in irreversible abnormalities in growth and development, as well as cognitive and motor impairments in affected children. Therefore, early diagnosis through advanced screening techniques is vital for improving patient outcomes and quality of life.

Background: Gastrointestinal bleeding (GIB) is a severe and potentially life-threatening complication in patients with acute myocardial infarction (AMI), significantly affecting prognosis during hospitalization. Early identification of high-risk patients is essential to reduce complications, improve outcomes, and guide clinical decision-making.

Objective: This study aimed to develop and validate a machine learning (ML)-based model for predicting in-hospital GIB in patients with AMI, identify key risk factors, and evaluate the clinical applicability of the model for risk stratification and decision support.

Purpose: To analyze survival and its predictors among patients with hepatocellular carcinoma (HCC) receiving transarterial chemoembolization (TACE) in Ethiopia.

Materials And Methods: We conducted a retrospective cohort study among patients who received TACE for HCC at MCM Hospital from December 1, 2016, to December 31, 2022. Data were extracted from patients' medical records, and vital status was ascertained from the patients' charts or by phone call to the next of kin.

Purpose: To assess trial-level surrogacy value for overall survival (OS) of the pathologic complete response (pCR) and invasive disease-free survival (iDFS) in randomized clinical trials (RCTs) for early breast cancer (BC).

Methods: Individual patient data of neoadjuvant RCTs with available data on pCR, iDFS, and OS were included in the analysis. We used the coefficient of determination from weighted linear regression models to quantify the association between treatment effects on OS and on the surrogate end points.
