Classification of histogram-valued data with support histogram machines.

Ilsuk Kang, Cheolwoo Park, Young Joo Yoon, Changyi Park, Soon-Sun Kwon, Hosik Choi

J Appl Stat

Graduate School, Department of Urban Big Data Convergence, University of Seoul, Seoul, The Republic of Korea.

Published: July 2021

Large data volumes and advanced technologies have produced new types of complex data, such as histogram-valued data. This paper focuses on classification problems in which predictors are observed as, or aggregated into, histograms. Because conventional classification methods take vectors as input, a natural approach is to convert histograms into vector-valued data using summary values, such as the mean or median. However, this approach discards the distributional information available in histograms. To address this issue, we propose a margin-based classifier for histogram-valued data called the support histogram machine (SHM). We adopt the support vector machine framework and use the Wasserstein-Kantorovich metric to measure distances between histograms. We solve the resulting optimization problem with a dual approach. We then test the proposed SHM on simulated and real examples and demonstrate that it outperforms summary-value-based methods.
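To illustrate why a summary value can lose information that a Wasserstein-type distance retains, the following minimal sketch compares two histograms with identical means. It is not the authors' implementation: the abstract does not state the order of the metric or how it is computed, so the sketch assumes the 1-Wasserstein distance, shared bin edges, and mass spread uniformly within each bin. The function names `wasserstein1_hist` and `hist_mean` are hypothetical.

```python
import numpy as np


def wasserstein1_hist(edges, p, q):
    """1-Wasserstein distance between two histograms on shared bin edges.

    Assumption (not from the paper): mass is uniform within each bin, so both
    CDFs are piecewise linear and W1 equals the integral of |F_p - F_q|.
    """
    edges = np.asarray(edges, dtype=float)
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()

    # Difference of the two CDFs evaluated at every bin edge.
    d = np.concatenate(([0.0], np.cumsum(p - q)))
    d0, d1 = d[:-1], d[1:]
    widths = np.diff(edges)

    # Within a bin the CDF difference is linear; integrate its absolute value.
    abs_sum = np.abs(d0) + np.abs(d1)
    no_crossing = 0.5 * abs_sum  # trapezoid when the sign does not change
    crossing = np.divide(
        d0**2 + d1**2, 2.0 * abs_sum,
        out=np.zeros_like(abs_sum), where=abs_sum > 0,
    )  # two triangles when the difference crosses zero inside the bin
    per_bin = np.where(d0 * d1 >= 0, no_crossing, crossing)
    return float(np.sum(widths * per_bin))


def hist_mean(edges, p):
    """Mean of a histogram, using bin midpoints as representative values."""
    edges = np.asarray(edges, dtype=float)
    p = np.asarray(p, dtype=float)
    mids = 0.5 * (edges[:-1] + edges[1:])
    return float(mids @ (p / p.sum()))


edges = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
narrow = np.array([0.0, 0.5, 0.5, 0.0])  # mass concentrated in the middle
wide = np.array([0.5, 0.0, 0.0, 0.5])    # mass pushed to the tails

mean_gap = abs(hist_mean(edges, narrow) - hist_mean(edges, wide))
w1 = wasserstein1_hist(edges, narrow, wide)
print(mean_gap, w1)
```

Both histograms have the same mean, so a mean-based summary cannot separate them, while the distance between them is strictly positive. A distance of this kind could then feed a distance-based kernel in a margin classifier, although the exact construction used by SHM is described in the full paper rather than in this abstract.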

PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9930853
DOI: http://dx.doi.org/10.1080/02664763.2021.1947996

Similar Publications

Convex clustering analysis for histogram-valued data.

Biometrics

June 2019

Department of Mathematics Education, Korea National University of Education, Cheongju, Chungbuk, 28173, Korea.

Cheolwoo Park, Hosik Choi, Chris Delcher, Yanning Wang, Young Joo Yoon

In recent years, there has been increased interest in symbolic data analysis, including exploratory analysis, supervised and unsupervised learning, and time series analysis. Traditional statistical approaches designed for single-valued data are not suitable because they cannot incorporate the additional information on data structure available in symbolic data, so new techniques have been proposed for symbolic data to bridge this gap. In this article, we develop a regularized convex clustering approach for grouping histogram-valued data.
