On the classification of a small imbalanced cytogenetic image database.

IEEE/ACM Trans Comput Biol Bioinform

Department of Electrical and Computer Engineering, Ben-Gurion University, Beer-Sheva, Israel.

Published: July 2007

Solving a multiclass classification task using a small imbalanced database of high-dimensional patterns is difficult due to the curse of dimensionality and the bias of the training toward the majority classes. Such a problem has arisen while diagnosing genetic abnormalities by classifying a small database of fluorescence in situ hybridization signals of types having different frequencies of occurrence. We propose two solutions to the problem and study them experimentally in the cytogenetic domain. The first is hierarchical decomposition of the classification task, where each hierarchy level is designed to tackle a simpler problem represented by classes that are approximately balanced. The second is balancing the data by up-sampling the minority classes, accompanied by dimensionality reduction. Implemented by the naive Bayesian classifier or the multilayer perceptron neural network, both solutions have diminished the problem and contributed to accuracy improvement. In addition, the experiments suggest that coping with the smallness of the data is more beneficial than dealing with its imbalance.
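The balancing step of the second solution can be illustrated with a minimal sketch. It assumes plain random duplication of minority-class samples until every class matches the largest one; the abstract does not say which up-sampling scheme the authors used, and the accompanying dimensionality reduction is omitted here. The function name, class labels, and toy feature vectors are hypothetical.

```python
import random
from collections import Counter, defaultdict


def upsample_minority(samples, labels, seed=0):
    """Randomly duplicate samples of smaller classes until every class
    has as many samples as the largest class (an assumed scheme)."""
    rng = random.Random(seed)

    # Group the samples by class label.
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)

    # Every class is grown to the size of the largest class.
    target = max(len(xs) for xs in by_class.values())

    out_x, out_y = [], []
    for y, xs in by_class.items():
        # Draw duplicates with replacement from the original samples.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


# Hypothetical toy data: three classes with 3, 1, and 1 samples.
samples = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.9, 0.8], [0.5, 0.5]]
labels = ["a", "a", "a", "b", "c"]

bal_x, bal_y = upsample_minority(samples, labels)
counts = Counter(bal_y)
print(counts)
```

Duplication adds no new information, which is one reason a scheme like this is usually paired with dimensionality reduction or a simple classifier when the database is small.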

Source
http://dx.doi.org/10.1109/TCBB.2007.070207

Similar Publications

Deep CNN ResNet-18 based model with attention and transfer learning for Alzheimer's disease detection.

Front Neuroinform

January 2025

Department of Computer Science and Engineering, Institute of Technology, Nirma University, Gujarat, India.

Introduction: The prevalence of age-related brain issues has risen in developed countries because of changes in lifestyle. Alzheimer's disease leads to a rapid and irreversible decline in cognitive abilities by damaging memory cells.

Methods: A ResNet-18-based system is proposed, integrating Depth Convolution with a Squeeze and Excitation (SE) block to minimize tuning parameters.

The ribotoxic stress response is a pathway that is activated when ribosomes are impaired, leading to disruptions in protein synthesis, increased inflammatory signaling, and cell death if left unresolved. Taraxacum can induce apoptosis-associated ribosomal RNA (rRNA) cleavage; however, the exact mechanism of Taraxacum-induced rRNA cleavage remains unclear. In this study, we used the RNA integrity (RIN) value and 28S/18S ratio to confirm the integrity of experiments.

Existing studies indicate that dysregulation or abnormal expression of small nucleolar RNA (snoRNA) is closely associated with various diseases, including lung cancer. Furthermore, these diseases often involve multiple targets, making the redevelopment of traditional medicines highly promising. Accurate prediction of potential snoRNA therapeutic targets is essential for early disease intervention and the redevelopment of traditional medicines.

Background: While automated methods for differential diagnosis of parkinsonian syndromes based on MRI have been introduced, their implementation in clinical practice still faces considerable challenges.

Objective: To assess whether the performance of classifiers based on imaging derived biomarkers is improved with the addition of basic clinical information and to provide a practical solution to address the insecurity of classification results due to the uncertain clinical diagnosis they are based on.

Methods: Retro- and prospectively collected data from multimodal MRI and standardized clinical datasets of 229 patients with PD (n = 167), PSP (n = 44), or MSA (n = 18) underwent multinomial classification in a benchmark study comparing the performance of nine machine learning methods.

One size does not fit all: revising traditional paradigms for assessing accuracy of QSAR models used for virtual screening.

J Cheminform

January 2025

National Center for Advancing Translational Sciences (NCATS), National Institutes of Health, 9800 Medical Center Drive, Rockville, MD, 20850, USA.

Traditional best practices for quantitative structure activity relationship (QSAR) modeling recommend dataset balancing and balanced accuracy (BA) as the key desired objective of model development. This study explores the value of the conventional norms in the context of using QSAR models for virtual screening of modern large and ultra-large chemical libraries. For this increasingly common task, we now recommend the use of models with the highest positive predictive value (PPV) built on imbalanced training sets as preferred virtual screening tools.
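The two metrics contrasted above follow their standard definitions, and a small sketch shows how they can diverge on an imbalanced screening set. The confusion-matrix counts below are hypothetical and are not taken from the study.

```python
def balanced_accuracy(tp, fp, tn, fn):
    """Mean of sensitivity (recall on actives) and specificity
    (recall on inactives)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def positive_predictive_value(tp, fp):
    """Fraction of predicted actives that are truly active."""
    return tp / (tp + fp)


# Hypothetical screen: 100 actives among 10,000 compounds.
tp, fn = 80, 20      # actives found / missed
fp, tn = 990, 8910   # inactives flagged / correctly rejected

ba = balanced_accuracy(tp, fp, tn, fn)
ppv = positive_predictive_value(tp, fp)
print(f"BA = {ba:.3f}, PPV = {ppv:.3f}")
```

With these counts the balanced accuracy is high while fewer than one in ten predicted actives is a true active, which is the kind of gap that matters when only the top-ranked compounds can be tested.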
