Feature selection and classification model construction on type 2 diabetic patients' data.

Artif Intell Med

Department of Computing, Faculty of Engineering, Imperial College London, South Kensington, London SW7 2AZ, UK.

Published: November 2007

Objective: Diabetes affects between 2% and 4% of the global population (up to 10% in the over-65 age group), and its avoidance and effective treatment are crucial public health and health economics issues in the 21st century. The aim of this research was to identify significant factors influencing diabetes control by applying feature selection to data from a working patient management system, to assist with ranking, classification and knowledge discovery. The classification models can be used to identify individuals in the population with poor diabetes control status based on physiological and examination factors.

Methods: The diabetic patients' information was collected by Ulster Community and Hospitals Trust (UCHT) from 2000 to 2004 as part of clinical management. Data mining techniques were applied to discover key predictors and latent knowledge. To improve computational efficiency, a feature selection technique, feature selection via supervised model construction (FSSMC), an optimisation of ReliefF, was used to rank the important attributes affecting diabetic control. After suitable features had been selected, three complementary classification techniques (Naïve Bayes, IB1 and C4.5) were applied to the data to predict how well the patients' condition was controlled.
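The rank-then-classify workflow described in the Methods can be illustrated with a minimal sketch. It does not reproduce FSSMC or ReliefF: it uses the basic two-class Relief weighting as a stand-in, followed by a 1-nearest-neighbour classifier in the spirit of IB1. The data are synthetic, and the names `relief_weights` and `one_nn_predict` are hypothetical helpers introduced only for this illustration.

```python
import random

def relief_weights(X, y):
    """Basic two-class Relief (a stand-in, not FSSMC or ReliefF).

    A feature gains weight when it differs on the nearest example of the
    other class (miss) and loses weight when it differs on the nearest
    example of the same class (hit). Features are assumed scaled to [0, 1].
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d

    def dist(a, b):
        return sum(abs(p - q) for p, q in zip(a, b))

    for i in range(n):
        hits = [j for j in range(n) if j != i and y[j] == y[i]]
        misses = [j for j in range(n) if y[j] != y[i]]
        hit = min(hits, key=lambda j: dist(X[i], X[j]))
        miss = min(misses, key=lambda j: dist(X[i], X[j]))
        for f in range(d):
            w[f] += (abs(X[i][f] - X[miss][f]) - abs(X[i][f] - X[hit][f])) / n
    return w

def one_nn_predict(X_train, y_train, x, features):
    """1-nearest-neighbour prediction restricted to the selected features."""
    def dist(a, b):
        return sum(abs(a[f] - b[f]) for f in features)
    nearest = min(range(len(X_train)), key=lambda j: dist(X_train[j], x))
    return y_train[nearest]

# Synthetic data (an assumption, not the UCHT data): feature 0 separates
# the two classes, features 1 and 2 are uniform noise.
rng = random.Random(42)
y = [i % 2 for i in range(60)]
X = [[0.9 * label + 0.1 * rng.random(), rng.random(), rng.random()] for label in y]

weights = relief_weights(X, y)
ranking = sorted(range(len(weights)), key=lambda f: weights[f], reverse=True)
print("feature ranking (best first):", ranking)

# Classify a new point using only the top-ranked feature.
print("predicted class:", one_nn_predict(X, y, [0.95, 0.5, 0.5], ranking[:1]))
```

Ranking first and classifying on the reduced feature set mirrors the motivation given in the abstract: fewer attributes mean less work for a distance-based classifier such as IB1.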

Results: FSSMC identified patients' 'age', 'diagnosis duration', the need for 'insulin treatment', 'random blood glucose' measurement and 'diet treatment' as the most important factors influencing blood glucose control. Using the reduced features, a best predictive accuracy of 95% and a sensitivity of 98% were achieved. The influence on outcome of factors such as the 'type of care' delivered, the use of 'home monitoring' and 'smoking' can contribute to domain knowledge in diabetes control.
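For readers unfamiliar with the two metrics reported above, the sketch below shows how accuracy and sensitivity are computed from confusion-matrix counts. The counts are invented for illustration and are not the study's results; treating 'poor control' as the positive class is also an assumption.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all cases classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

def sensitivity(tp, fn):
    """Fraction of truly positive cases that were detected (recall)."""
    return tp / (tp + fn)

# Illustrative counts only (not taken from the paper).
tp, fn = 40, 10   # poorly controlled patients: detected / missed
tn, fp = 45, 5    # well controlled patients: correctly cleared / falsely flagged

acc = accuracy(tp, tn, fp, fn)
sens = sensitivity(tp, fn)
print(f"accuracy={acc:.2f} sensitivity={sens:.2f}")
```

Sensitivity matters here because a missed case of poor control (a false negative) is a patient who would not be flagged for follow-up.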

Conclusion: In the care of patients with diabetes, the more important factors identified (patients' 'age', 'diagnosis duration' and 'family history') are beyond the control of physicians. Treatment methods such as 'insulin', 'diet' and 'tablets' (a variety of oral medicines) may be controlled. However, lifestyle indicators such as 'body mass index' and 'smoking status' are also important and may be controlled by the patient. This further underlines the need for public health education to aid awareness and prevention. More subtle data interactions need to be better understood, and data mining can contribute to the clinical evidence base. The research confirms, and to a lesser extent challenges, current thinking. Whilst fully appreciating the requirement for clinical verification and interpretation, this work supports the use of data mining as an exploratory tool, particularly as the domain is suffering from a data explosion due to enhanced monitoring and the (potential) storage of this data in the electronic health record. FSSMC has proved a useful feature estimator for large data sets, where processing efficiency is an important factor.

Source
http://dx.doi.org/10.1016/j.artmed.2007.07.002
