This study presents a novel feature-engineered natural gradient descent ensemble-boosting (NGBoost) machine-learning framework for detecting fraud in power consumption data. The framework was executed sequentially in three stages: data pre-processing, feature engineering, and model evaluation. First, a random forest-based imputation technique was used to fill in the missing entries in the acquired smart meter dataset. In the second phase, the majority weighted minority oversampling technique (MWMOTE) was applied to balance the distribution of data samples across classes. A time-series feature-extraction library and the whale optimization algorithm were then used to extract and select the most relevant features from the consumers' kWh readings. Once these features were acquired, the NGBoost algorithm was trained and tested to classify consumers into two categories ("Healthy" and "Theft"). Finally, the tree SHAP (SHapley Additive exPlanations) algorithm was used to identify the impact (positive or negative) of each input feature on the prediction of the target variable. The proposed framework achieved an accuracy of 93%, a recall of 91%, and a precision of 95%, higher than those of all competing models, which supports its efficacy and significance in the studied field of research.
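The three reported metrics follow directly from the confusion-matrix counts of a two-class classifier. The following is a minimal sketch of that arithmetic, assuming "Theft" is treated as the positive class (the abstract does not state this explicitly). The function name `binary_metrics` and the toy labels are hypothetical, chosen only for illustration, and are not data or code from the study.

```python
from collections import Counter


def binary_metrics(y_true, y_pred, positive="Theft"):
    """Return accuracy, recall, and precision for a two-class problem."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            # Positive sample: either caught (tp) or missed (fn).
            counts["tp" if pred == positive else "fn"] += 1
        else:
            # Negative sample: either falsely flagged (fp) or cleared (tn).
            counts["fp" if pred == positive else "tn"] += 1
    tp, fp, tn, fn = (counts[k] for k in ("tp", "fp", "tn", "fn"))
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
    }


# Toy labels invented for illustration only (assumption, not study data).
y_true = ["Theft", "Theft", "Theft", "Healthy",
          "Healthy", "Healthy", "Healthy", "Healthy"]
y_pred = ["Theft", "Theft", "Healthy", "Healthy",
          "Healthy", "Healthy", "Healthy", "Theft"]

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Recall measures the share of actual "Theft" consumers that are flagged, while precision measures the share of flagged consumers that are actually "Theft"; on an imbalanced dataset both are more informative than accuracy alone, which is presumably why the abstract reports all three.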

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8704372
DOI: http://dx.doi.org/10.3390/s21248423
