Graph neural networks (GNNs) have been widely used for graph data representation. However, existing research mostly assumes an ideally balanced dataset, and imbalanced datasets are rarely considered. Traditional methods for handling imbalanced datasets, such as resampling, reweighting, and synthetic sample generation, are no longer applicable to GNNs. This study proposes an ensemble model called Boosting-GNN, which uses GNNs as the base classifiers during boosting. In Boosting-GNN, higher weights are assigned to the training samples that the previous classifiers misclassified, which yields higher classification accuracy and better reliability. In addition, transfer learning is used to reduce computational cost and increase fitting ability. Experimental results indicate that the proposed Boosting-GNN model outperforms the graph convolutional network (GCN), GraphSAGE, graph attention network (GAT), simplifying graph convolutional networks (SGC), multi-scale graph convolution networks (N-GCN), and most advanced reweighting and resampling methods on synthetic imbalanced datasets, with an average performance improvement of 4.5%.
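
The reweighting idea described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact update rule, so this example assumes the standard SAMME (multi-class AdaBoost) update. The function name `samme_round` and the toy labels are hypothetical, and the base classifier's predictions are supplied directly rather than produced by a GNN.

```python
import numpy as np

def samme_round(y_true, y_pred, weights, n_classes):
    """One SAMME reweighting step (assumed here; not taken from the paper)."""
    miss = (y_pred != y_true)                        # True where the base classifier erred
    err = np.sum(weights * miss) / np.sum(weights)   # weighted error rate
    err = np.clip(err, 1e-10, 1 - 1e-10)             # guard against log(0) and division by zero
    # Classifier weight: larger when the weighted error is lower
    alpha = np.log((1 - err) / err) + np.log(n_classes - 1)
    # Scale up the weights of misclassified samples, then renormalize
    new_w = weights * np.exp(alpha * miss)
    return alpha, new_w / new_w.sum()

# Toy imbalanced labels: class 0 dominates; the single class-1 sample is misclassified
y_true = np.array([0, 0, 0, 0, 1, 2])
y_pred = np.array([0, 0, 0, 0, 0, 2])
w0 = np.full(len(y_true), 1 / len(y_true))

alpha, w1 = samme_round(y_true, y_pred, w0, n_classes=3)
print(alpha, w1)
```

In this toy run the misclassified minority sample ends up with a larger share of the total weight, so the next base classifier in the ensemble is pushed to fit it. In a boosting setup with GNN base classifiers, these weights would typically enter the per-node training loss.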

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8655128
DOI: http://dx.doi.org/10.3389/fnbot.2021.775688

Publication Analysis

Top Keywords

imbalanced datasets: 8
graph convolutional: 8
graph: 7
boosting-gnn: 4
boosting-gnn boosting: 4
boosting algorithm: 4
algorithm graph: 4
graph networks: 4
imbalanced: 4
networks imbalanced: 4

Similar Publications

To cope with the challenges posed by high-concurrency training tasks involving large models and big data, Directed Acyclic Graph (DAG) and shard were proposed as alternatives to blockchain-based federated learning, aiming to enhance training concurrency. However, there is insufficient research on the specific consensus designs and the effects of varying shard sizes on federated learning. In this paper, we combine DAG and shard by designing three tip selection consensus algorithms and propose an adaptive algorithm to improve training performance.

In credit risk assessment, unsupervised classification techniques can be introduced to reduce human resource expenses and expedite decision-making. Despite the efficacy of unsupervised learning methods in handling unlabeled datasets, their performance remains limited owing to challenges such as imbalanced data, local optima, and parameter adjustment complexities. Thus, this paper introduces a novel hybrid unsupervised classification method, named the two-stage hybrid system with spectral clustering and semi-supervised support vector machine (TSC-SVM), which effectively addresses the unsupervised imbalance problem in credit risk assessment by targeting global optimal solutions.

Small proteins (≤100 amino acids) play important roles across all life forms, ranging from unicellular bacteria to higher organisms. In this study, we developed SProtFP, a machine learning-based method for functional annotation of prokaryotic small proteins into selected functional categories. SProtFP uses independent artificial neural networks (ANNs), trained on a combination of physicochemical descriptors, to classify small proteins as antitoxin type 2, bacteriocin, DNA-binding, metal-binding, ribosomal protein, RNA-binding, type 1 toxin, or type 2 toxin proteins.

The class imbalance problem is one of the difficult factors affecting the performance of traditional classifiers. Oversampling is the most common way to address it. Oversampling techniques alleviate the impact of class imbalance on traditional machine learning by augmenting the feature representation of minority instances.

Applying deep learning to unsupervised bearing fault diagnosis in complex industrial environments is challenging. Traditional fault detection methods rely on labeled data, which is costly and labor-intensive to obtain. This paper proposes a novel unsupervised approach, WDCAE-LKA, which combines a wide kernel convolutional autoencoder (WDCAE) with a large kernel attention (LKA) mechanism to improve fault detection under unlabeled conditions. An adaptive threshold module based on a multi-layer perceptron (MLP) dynamically adjusts thresholds, boosting model robustness in imbalanced scenarios.
