Publications by Liang-Sian Lin

Publications by authors named "Liang-Sian Lin"

Page 1 of 1

Improved support vector machine classification for imbalanced medical datasets by novel hybrid sampling combining modified mega-trend-diffusion and bagging extreme learning machine model.

Liang-Sian Lin Chen-Huan Kao Yi-Jie Li Hao-Hsuan Chen Hung-Yu Chen

Math Biosci Eng

September 2023

To handle imbalanced datasets in machine learning or deep learning models, some studies suggest sampling techniques to generate virtual examples of minority classes to improve the models' prediction accuracy. However, for kernel-based support vector machines (SVM), some sampling methods suggest generating synthetic examples in an original data space rather than in a high-dimensional feature space. This may be ineffective in improving SVM classification for imbalanced datasets.

View Article and Find Full Text PDF

A new approach to generating virtual samples to enhance classification accuracy with small data-a case of bladder cancer.

Liang-Sian Lin Susan C Hu Yao-San Lin Der-Chiang Li Liang-Ren Siao

Math Biosci Eng

April 2022

In the medical field, researchers are often unable to obtain the sufficient samples in a short period of time necessary to build a stable data-driven forecasting model used to classify a new disease. To address the problem of small data learning, many studies have demonstrated that generating virtual samples intended to augment the amount of training data is an effective approach, as it helps to improve forecasting models with small datasets. One of the most popular methods used in these studies is the mega-trend-diffusion (MTD) technique, which is widely used in various fields.

View Article and Find Full Text PDF

A Boundary-Information-Based Oversampling Approach to Improve Learning Performance for Imbalanced Datasets.

Der-Chiang Li Qi-Shi Shi Yao-San Lin Liang-Sian Lin

Entropy (Basel)

February 2022

Oversampling is the most popular data preprocessing technique. It makes traditional classifiers available for learning from imbalanced data. Through an overall review of oversampling techniques (oversamplers), we find that some of them can be regarded as danger-information-based oversamplers (DIBOs) that create samples near danger areas to make it possible for these positive examples to be correctly classified, and others are safe-information-based oversamplers (SIBOs) that create samples near safe areas to increase the correct rate of predicted positive values.

View Article and Find Full Text PDF

Detecting representative data and generating synthetic samples to improve learning accuracy with imbalanced data sets.

Der-Chiang Li Susan C Hu Liang-Sian Lin Chun-Wu Yeh

PLoS One

October 2017

It is difficult for learning models to achieve high classification performances with imbalanced data sets, because with imbalanced data sets, when one of the classes is much larger than the others, most machine learning and data mining classifiers are overly influenced by the larger classes and ignore the smaller ones. As a result, the classification algorithms often have poor learning performances due to slow convergence in the smaller classes. To balance such data sets, this paper presents a strategy that involves reducing the sizes of the majority data and generating synthetic samples for the minority data.

View Article and Find Full Text PDF