Publications by authors named "Shih-Wen Ke"

Background: In practice, datasets collected for analysis are usually incomplete because some records contain missing attribute values. Many related works focus on constructing specific models that estimate replacements for the missing values, so that the original incomplete dataset becomes complete. Another type of solution handles the incomplete dataset directly, without missing value imputation; decision trees are the major technique for this purpose.


Breast cancer is an all too common disease in women, which makes predicting it effectively an active research problem. A number of statistical and machine learning techniques have been employed to develop breast cancer prediction models. Among them, support vector machines (SVMs) have been shown to outperform many related techniques.


Introduction: K-nearest neighbor (k-NN) classification is a conventional non-parametric classifier that has been used as the baseline classifier in many pattern classification problems. It measures the distance between the test sample and each training sample to decide the final classification output.

Case Description: Because the Euclidean distance function is the most widely used distance metric in k-NN, no study has examined the classification performance of k-NN with different distance functions, especially for medical domain problems.
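As a rough illustration of the idea described above, the sketch below runs a k-NN majority vote with an interchangeable distance function. It is not the authors' implementation: the function names (`knn_predict`, `euclidean`, `manhattan`, `chebyshev`), the choice of alternative metrics, and the toy data are all assumptions made for this example.

```python
import math
from collections import Counter


def euclidean(a, b):
    # Straight-line distance between two feature vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    # Sum of absolute coordinate differences.
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a, b):
    # Largest absolute coordinate difference.
    return max(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_X, train_y, query, k=3, distance=euclidean):
    # Rank every training sample by its distance to the query,
    # then take a majority vote over the k closest labels.
    ranked = sorted(zip(train_X, train_y),
                    key=lambda pair: distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated classes.
train_X = [(1.0, 1.0), (1.5, 2.0), (5.0, 5.0), (6.0, 5.5)]
train_y = ["A", "A", "B", "B"]

for fn in (euclidean, manhattan, chebyshev):
    print(fn.__name__, knn_predict(train_X, train_y, (2.0, 2.0), k=3, distance=fn))
```

On data this cleanly separated, all three metrics agree; differences between distance functions would only show up on less clear-cut datasets.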


Introduction: More and more universities are receiving accreditation from the Association to Advance Collegiate Schools of Business (AACSB), an international association that promotes quality teaching and learning at business schools. To be accredited, schools are required to meet a number of standards ensuring that certain levels of teaching quality and student learning are met. However, the literature offers a variety of views on the relationship between research and teaching: some studies have demonstrated that research and teaching are complementary elements of learning, but others disagree with these findings.


Background: When medical datasets are collected, a number of data samples usually contain missing values. Performing data mining over such incomplete datasets is a difficult problem. In general, missing value imputation can be applied, which aims to estimate the missing values by reasoning from the observed data.
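To make "reasoning from the observed data" concrete, here is a minimal sketch of column-mean imputation, one of the simplest imputation strategies. The excerpt does not say which imputation method the article uses, so this technique, the function name `mean_impute`, and the use of `None` to mark a missing value are assumptions chosen only for illustration.

```python
def mean_impute(rows):
    # rows: list of equal-length records; None marks a missing value (assumed convention).
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        # Use only the observed values in column j to estimate its mean.
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed) if observed else None)
    # Replace each missing entry with the mean of its column;
    # a column with no observed values stays None.
    return [[means[j] if v is None else v for j, v in enumerate(r)]
            for r in rows]


# Hypothetical toy dataset with two missing entries.
data = [
    [1.0, 2.0],
    [None, 4.0],
    [3.0, None],
]
completed = mean_impute(data)
print(completed)
```

The function returns a new list and leaves the input untouched, which makes it easy to compare the incomplete and completed versions side by side.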


Background: The size of medical datasets is usually very large, which directly affects the computational cost of the data mining process. Instance selection is a data preprocessing step in the knowledge discovery process, which can be employed to reduce storage requirements while also maintaining the mining quality. This process aims to filter out outliers (or noisy data) from a given (training) dataset.
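One way to picture instance selection is the sketch below, which follows the edited-nearest-neighbor idea: a training sample is dropped when its label disagrees with the majority label of its k nearest neighbors. The excerpt does not name a specific algorithm, so this choice, the function name `edited_nearest_neighbors`, and the toy data are assumptions, not the article's method.

```python
import math
from collections import Counter


def edited_nearest_neighbors(X, y, k=3):
    keep = []
    for i, (xi, yi) in enumerate(zip(X, y)):
        # Indices of the k training samples closest to sample i (excluding itself).
        neighbors = sorted(
            (j for j in range(len(X)) if j != i),
            key=lambda j: math.dist(xi, X[j]),
        )[:k]
        majority = Counter(y[j] for j in neighbors).most_common(1)[0][0]
        # Keep the sample only if its neighborhood agrees with its label.
        if majority == yi:
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]


# Hypothetical toy data: two clusters plus one mislabeled point inside cluster "A".
X = [(0, 0), (0, 1), (1, 0), (1, 1),
     (5, 5), (5, 6), (6, 5), (6, 6),
     (0.5, 0.5)]
y = ["A", "A", "A", "A", "B", "B", "B", "B", "B"]

kept_X, kept_y = edited_nearest_neighbors(X, y, k=3)
print(len(X), "->", len(kept_X))
```

This brute-force version compares every pair of samples, so it only suits small datasets; it is meant to show the filtering rule, not to be efficient.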
