Outlier detection in spatial error models using modified thresholding-based iterative procedure for outlier detection approach.

BMC Med Res Methodol

Department of Epidemiology and Biostatistics, School of Public Health, Xi'an Jiaotong University Health Science Center, No. 76, Yanta Xilu Road, Xi'an, 710061, Shaanxi, China.

Published: April 2024

Background: Outliers, data points that deviate markedly from the rest of the data, can substantially affect statistical inference and can also provide valuable insights in data analysis. Many methods have been developed for outlier detection; however, almost all available approaches fail to account for the spatial dependence and heterogeneity in spatial data. Spatial data have diverse formats and semantics and require specialized outlier detection methodology to handle these properties. To date, limited research exists on robust spatial outlier detection methods designed specifically for the spatial error model (SEM) structure.

Method: We propose the Spatial-Θ-Iterative Procedure for Outlier Detection (Spatial-Θ-IPOD), which uses a mean-shift vector to identify outliers within the SEM. Our method enables effective detection of spatial outliers while also providing robust coefficient estimates. To assess the performance of our approach, we conducted extensive simulations and applied it in an empirical study using life expectancy data from multiple countries.
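
The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that the spatial autoregressive parameter `rho` is known, that the mean-shift vector `gamma` enters the spatially filtered equation (I - rho W) y = (I - rho W) X beta + gamma + eps, and that Θ is hard thresholding with a user-chosen threshold `lam`. The function names and the `chain_weights` helper are hypothetical and exist only for illustration.

```python
import numpy as np

def hard_threshold(r, lam):
    """Hard-thresholding rule: keep entries with |r_i| > lam, zero the rest."""
    return np.where(np.abs(r) > lam, r, 0.0)

def theta_ipod(X, y, lam, max_iter=500, tol=1e-8):
    """Plain Θ-IPOD sketch for y = X beta + gamma + eps.

    Alternates between least squares on the shift-corrected response
    and thresholding of the residuals to update the mean-shift vector.
    """
    gamma = np.zeros(len(y))
    for _ in range(max_iter):
        beta, *_ = np.linalg.lstsq(X, y - gamma, rcond=None)
        gamma_new = hard_threshold(y - X @ beta, lam)
        converged = np.max(np.abs(gamma_new - gamma)) < tol
        gamma = gamma_new
        if converged:
            break
    # Final coefficient fit with the detected shifts removed.
    beta, *_ = np.linalg.lstsq(X, y - gamma, rcond=None)
    return beta, gamma

def spatial_theta_ipod(X, y, W, rho, lam):
    """Assumed spatial variant: filter with (I - rho W), then run Θ-IPOD.

    rho is treated as known here, which is a simplification.
    """
    A = np.eye(len(y)) - rho * W
    return theta_ipod(A @ X, A @ y, lam)

def chain_weights(n):
    """Hypothetical helper: row-normalized weights for n units on a line."""
    W = np.zeros((n, n))
    idx = np.arange(n - 1)
    W[idx, idx + 1] = 1.0
    W[idx + 1, idx] = 1.0
    return W / W.sum(axis=1, keepdims=True)
```

Observations whose entry in the returned `gamma` is nonzero are the ones flagged as outliers. In practice `rho` would have to be estimated and `lam` tuned, neither of which this sketch attempts.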

Results: In simulations, the Spatial-Θ-IPOD method outperformed several commonly used methods on the masking and JD (Joint Detection) indicators and remained stable even in high-dimensional scenarios. In contrast, the Θ-IPOD method was ineffective at detecting outliers when spatial correlation was present. Our model also provided reliable coefficient estimates alongside outlier detection, outperforming the other models (both robust and non-robust) in most cases. In the empirical study, the proposed model successfully detected outliers and provided valuable insights into the modeling process.
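
The abstract does not define the masking and JD indicators. As a rough illustration, the sketch below uses the definitions common in the Θ-IPOD literature, which are assumed here rather than taken from this paper: masking is the average fraction of true outliers that go undetected, and JD is the fraction of simulation runs in which no true outlier is missed. The function name and input layout are hypothetical.

```python
import numpy as np

def detection_indicators(true_masks, flagged_masks):
    """Compute (masking, JD) over simulation runs.

    Both inputs have shape (runs, n) with truthy entries marking
    true outliers and flagged observations respectively. Every run
    is assumed to contain at least one true outlier.
    """
    true_masks = np.asarray(true_masks, dtype=bool)
    flagged_masks = np.asarray(flagged_masks, dtype=bool)
    # True outliers that the method failed to flag, per run.
    missed = (true_masks & ~flagged_masks).sum(axis=1)
    masking = np.mean(missed / true_masks.sum(axis=1))
    # Joint detection: share of runs with every true outlier flagged.
    jd = np.mean(missed == 0)
    return float(masking), float(jd)
```

Lower masking and higher JD indicate better detection under these definitions.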

Conclusions: The proposed Spatial-Θ-IPOD offers an effective solution for detecting spatial outliers under the SEM while providing robust coefficient estimates. Notably, it retains its relative superiority even in the presence of high-leverage points. By identifying outliers, the method improves understanding of the data and provides valuable insights for further analysis.


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11323683
DOI: http://dx.doi.org/10.1186/s12874-024-02208-3

