Clustering aims to group data naturally according to the underlying data distribution. This distribution is often estimated with a parametric or nonparametric model, e.g., a Gaussian mixture or kernel density estimation. Compared with nonparametric models, parametric models are statistically stable, i.e., a small perturbation of the data points leads to a small change in the estimated density. However, parametric models are highly sensitive to outliers, because outliers move the data distribution far from the parametric assumptions. Given a parametric clustering algorithm, this paper shows how to make it robust. The idea is to modify the original parametric density into a semiparametric one. The high-density data that form the core of each cluster are modeled with the original parametric density. The low-density data are often far from the cluster cores and may have an arbitrary shape, so they are modeled with a nonparametric density. A combination of parametric and nonparametric clustering algorithms is used to group the data modeled by the semiparametric density. From the standpoint of robust statistics, the proposed method has good robustness properties. We test the proposed algorithm on several synthetic data sets and 70 UCI data sets. The results indicate that the semiparametric method can significantly improve clustering performance.
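The core/fringe split described above can be illustrated with a small sketch. This is not the paper's algorithm; it is a minimal one-dimensional illustration under several assumptions: the parametric model is a Gaussian mixture fitted with plain EM, the "high-density" core is taken to be a fixed fraction of points ranked by fitted density, and the nonparametric step is replaced by a simple nearest-core-neighbor rule. All function names and the `core_fraction` parameter are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(xs, k, iters=50):
    """Fit a k-component 1-D Gaussian mixture with plain EM (parametric part)."""
    n = len(xs)
    srt = sorted(xs)
    means = [srt[int((j + 0.5) * n / k)] for j in range(k)]  # quantile initialisation
    grand_mean = sum(xs) / n
    pooled_var = max(sum((x - grand_mean) ** 2 for x in xs) / n, 1e-6)
    variances = [pooled_var] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            p = [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300  # guard against underflow far from every component
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var_j, 1e-6)
            weights[j] = nj / n
    return weights, means, variances


def component_scores(x, params):
    """Weighted density of x under each mixture component."""
    weights, means, variances = params
    return [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]


def semiparametric_cluster_1d(xs, k, core_fraction=0.9):
    """Label core points with a Gaussian mixture and the rest by nearest core point.

    Assumption: core_fraction is a hypothetical tuning knob, not taken from the paper.
    """
    n = len(xs)
    # Step 1: preliminary parametric fit, used only to rank points by density.
    prelim = fit_gmm_1d(xs, k)
    order = sorted(range(n), key=lambda i: sum(component_scores(xs[i], prelim)), reverse=True)
    n_core = max(k, int(core_fraction * n))
    core, fringe = order[:n_core], order[n_core:]
    # Step 2: refit the mixture on the high-density core only and label core points.
    core_params = fit_gmm_1d([xs[i] for i in core], k)
    labels = [None] * n
    for i in core:
        scores = component_scores(xs[i], core_params)
        labels[i] = scores.index(max(scores))
    # Step 3: each low-density point inherits the label of its nearest core point,
    # a simple stand-in for the nonparametric part.
    for i in fringe:
        nearest = min(core, key=lambda c: abs(xs[c] - xs[i]))
        labels[i] = labels[nearest]
    return labels
```

Calling `semiparametric_cluster_1d(xs, k=2)` on two well-separated groups plus a few distant points returns one label per input value; the distant points do not influence the refitted mixture, because they are excluded from the core before the second fit.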
DOI: http://dx.doi.org/10.1109/TNNLS.2018.2884790
Environ Res
January 2025
Department of Civil, Environmental, & Architectural Engineering, Worcester Polytechnic Institute, Worcester, MA, United States.
The growing impact of climate change and escalating wildfire seasons has led to heightened ambient air pollution, potentially affecting children's sleep health. However, current epidemiological research often relies on outdoor weather data to model the environmental impacts on sleep health, potentially mischaracterizing the actual bedroom environment. To address these challenges, we conducted experiments to investigate the relationships among ambient, indoor, and personal exposure to PM concentrations and obstructive sleep apnea (OSA) in children.
J Med Imaging Radiat Sci
January 2025
Division of Library Services, Charles Sturt University, Albury, NSW, Australia.
Introduction/background: Group work plays a crucial role in healthcare education by fostering collaboration, communication, and teamwork skills. However, students often face challenges such as unequal workload distribution, conflict, and anxiety. Group learning contracts have been introduced to improve group dynamics by setting clear expectations, enhancing accountability, and promoting effective collaboration.
Econ Hum Biol
January 2025
University of Perugia, Italy.
This paper examines internal migrant selection in Italy using individual height data from the 1951 and 1980 birth cohorts of military conscripts. Information on both place of birth and residence of conscripts allows us to compare migrants' heights to the height distributions of their non-migrant peers at the national level and to their populations of origin. Results suggest that migrants from southern Italy were negatively selected at the national level, while a positive selection in height emerged if compared to conscripts who remained in their macro-area of origin.
Comput Med Imaging Graph
December 2024
Nantes Université, Centrale Nantes, CNRS, LS2N, UMR 6004, F-44000 Nantes, France.
Diffuse Large B-cell Lymphoma (DLBCL) is a lymphatic cancer of steadily growing incidence. Its diagnosis and follow-up rely on the analysis of clinical biomarkers and 18F-Fluorodeoxyglucose (FDG)-PET/CT images. In this context, we target the problem of assisting in the early identification of high-risk DLBCL patients from both images and tabular clinical data.
Epidemics
December 2024
Department of Biostatistics and Epidemiology, University of Massachusetts, Amherst, MA, United States.
Over the last ten years, the US Centers for Disease Control and Prevention (CDC) has organized an annual influenza forecasting challenge with the motivation that accurate probabilistic forecasts could improve situational awareness and yield more effective public health actions. Starting with the 2021/22 influenza season, the forecasting targets for this challenge have been based on hospital admissions reported in the CDC's National Healthcare Safety Network (NHSN) surveillance system. Reporting of influenza hospital admissions through NHSN began within the last few years, and as such only a limited amount of historical data are available for this target signal.