The spatial reconstruction of single-cell RNA sequencing (scRNA-seq) data into spatial transcriptomics (ST) is a rapidly evolving field that addresses the significant challenge of aligning gene expression profiles to their spatial origins within tissues. This task is complicated by the inherent batch effects and the need for precise gene expression characterization to accurately reflect spatial information. To address these challenges, we developed SELF-Former, a transformer-based framework that utilizes multi-scale structures to learn gene representations, while designing spatial correlation constraints for the reconstruction of corresponding ST data.
IEEE Trans Neural Netw Learn Syst
August 2024
Ensemble learning improves the capability of convolutional neural network (CNN)-based discriminators, whose performance is crucial to the quality of the samples generated by generative adversarial networks (GANs). However, this learning strategy significantly increases both the number of parameters and the computational overhead. Meanwhile, the number of discriminators needed to enhance GAN performance is still being investigated.
A main limitation of bulk transcriptomic technologies is that individual measurements normally contain contributions from multiple cell populations, impeding the identification of cellular heterogeneity within diseased tissues. To extract cellular insights from existing large cohorts of bulk transcriptomic data, we present CSsingle, a novel method designed to accurately deconvolve bulk data into a predefined set of cell types using a scRNA-seq reference. Through comprehensive benchmark evaluations and analyses using diverse real data sets, we reveal the systematic bias inherent in existing methods, stemming from differences in cell size or library size.
IEEE J Biomed Health Inform
June 2024
As an effective means of ocular disease recognition, early fundus screening can help patients avoid irreversible blindness. Although deep learning is powerful for image-based ocular disease recognition, its performance depends mainly on a large amount of labeled data. For ocular disease, data collection and annotation at a single site usually take a lot of time.
IEEE Trans Neural Netw Learn Syst
August 2024
In this work, we study a more realistic and challenging scenario in multiview clustering (MVC), referred to as incomplete MVC (IMVC), where some instances in certain views are missing. The key to IMVC is how to adequately exploit complementary and consistency information when the data are incomplete. However, most existing methods address the incompleteness problem at the instance level, and they require sufficient information to perform data recovery.
IEEE Trans Biomed Eng
January 2023
Advances in high-throughput experimental methods have led to the availability of more diverse omic datasets in clinical analysis applications. Different types of omic data reveal different cellular aspects and contribute to the understanding of disease progression from these aspects. While survival prediction and subgroup identification are two important research problems in clinical analysis, their performance can be further boosted by taking advantage of multiple omic datasets through multi-view learning.
IEEE Trans Neural Netw Learn Syst
February 2022
Recently, multitask learning has been successfully applied to survival analysis problems. A critical challenge in real-world survival analysis tasks is that not all instances and tasks are equally learnable. A survival analysis model can be improved when considering the complexities of instances and tasks during the model training.
IEEE Trans Neural Netw Learn Syst
January 2022
Graph-based methods have achieved impressive performance on semisupervised classification (SSC). Traditional graph-based methods have two main drawbacks. First, the graph is predefined before training a classifier, which does not leverage the interactions between the classifier training and similarity matrix learning.
Ensemble learning has many successful applications because of its effectiveness in boosting the predictive performance of classification models. In this article, we propose a semisupervised multiple choice learning (SemiMCL) approach to jointly train a network ensemble on partially labeled data. Our model mainly focuses on improving the assignment of labeled data among the constituent networks and exploiting unlabeled data to capture domain-specific information, such that semisupervised classification can be effectively facilitated.
IEEE Trans Neural Netw Learn Syst
August 2021
Semisupervised clustering methods improve performance by randomly selecting pairwise constraints, which may lead to redundancy and instability. In this context, active clustering is proposed to maximize the efficacy of annotations by effectively using pairwise constraints. However, existing methods lack an overall consideration of the querying criteria and repeatedly run semisupervised clustering to update labels.
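The random selection of pairwise constraints mentioned above can be illustrated with a minimal sketch. This is not the method of the article: the helper `random_pairwise_constraints` and the toy `labels` list are hypothetical, and the sketch only shows how must-link and cannot-link pairs arise when pairs are drawn uniformly at random.

```python
import random

def random_pairwise_constraints(labels, n_pairs, rng):
    """Sample distinct index pairs and split them into must-link / cannot-link."""
    n = len(labels)
    # Enumerate every unordered pair (i, j) with i < j.
    all_pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    # Draw n_pairs of them uniformly at random, without replacement.
    chosen = rng.sample(all_pairs, n_pairs)
    must_link = [(i, j) for i, j in chosen if labels[i] == labels[j]]
    cannot_link = [(i, j) for i, j in chosen if labels[i] != labels[j]]
    return must_link, cannot_link

rng = random.Random(0)
labels = [0, 0, 0, 1, 1, 2]  # hypothetical ground-truth labels for six points
must_link, cannot_link = random_pairwise_constraints(labels, 5, rng)
```

Because the pairs are drawn blindly, several of them may carry information that the others already imply, which is the kind of redundancy an active querying strategy tries to avoid.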
IEEE/ACM Trans Comput Biol Bioinform
April 2022
Identifying cancer subtypes by integrating multi-omic data improves the understanding of disease progression and enables more precise treatment for patients. Cancer subtype identification is usually accomplished by clustering patients with unsupervised learning approaches. Thus, most existing integrative cancer subtyping methods are performed in an entirely unsupervised way.
IEEE Trans Image Process
October 2019
In this paper, we explore how to leverage readily available unlabeled data to improve semi-supervised human detection performance. For this purpose, we specifically modify the region proposal network (RPN) for learning on a partially labeled dataset. Based on commonly observed false positive types, a verification module is developed to assess foreground human objects in the candidate regions to provide an important cue for filtering the RPN's proposals.
IEEE Trans Image Process
August 2019
Using an ensemble of neural networks with consistency regularization is effective for improving performance and stability of deep learning, compared to the case of a single network. In this paper, we present a semi-supervised Deep Coupled Ensemble (DCE) model, which contributes to ensemble learning and classification landmark exploration for better locating the final decision boundaries in the learnt latent space. First, multiple complementary consistency regularizations are integrated into our DCE model to enable the ensemble members to learn from each other and themselves, such that training experience from different sources can be shared and utilized during training.
The class imbalance problem has become a leading challenge. Although conventional imbalance learning methods have been proposed to tackle this problem, they have some limitations: 1) undersampling methods suffer from losing important information and 2) cost-sensitive methods are sensitive to outliers and noise. To address these issues, we propose a hybrid optimal ensemble classifier framework that combines density-based undersampling and cost-effective methods by exploring state-of-the-art solutions using a multi-objective optimization algorithm.
Healthcare question answering (HQA) systems play a vital role in encouraging patients to seek professional consultation. However, there are some challenging factors in learning and representing the question corpus of HQA datasets, such as high dimensionality, sparseness, noise, and nonprofessional expression. To address these issues, we propose an inception convolutional autoencoder model for Chinese healthcare question clustering (ICAHC).
Clustering ensemble (CE) takes multiple clustering solutions into consideration in order to effectively improve the accuracy and robustness of the final result. To reduce redundancy as well as noise, a CE selection (CES) step is added to further enhance performance. Quality and diversity are two important metrics of CES.
Multitask feature selection (MTFS) methods have become more important for many real-world applications, especially in high-dimensional settings. The most widely used assumption is that all tasks share the same features, and the l2,1 regularization method is usually applied. However, this assumption may not hold when the correlations among tasks are not obvious.
IEEE Trans Cybern
November 2019
In the era of data science, a huge amount of data has emerged in the form of tensors. In many applications, the collected tensor data are incomplete with missing entries, which affects the analysis process. In this paper, we investigate a new method for tensor completion, in which a low-rank tensor approximation is used to exploit the global structure of data, and sparse coding is used for elucidating the local patterns of data.
IEEE Trans Neural Netw Learn Syst
February 2019
Multitask clustering methods have been proposed to improve the performance of related tasks concurrently by exploring the relationship among tasks through the coefficient matrix or the shared feature matrix. However, the divergent effects of features in learning this relationship are seldom considered. To further improve performance, we propose a new multitask clustering approach that explores correlations among tasks, clusters, and features based on the effects of features on clusters.
Classification of high-dimensional data with very limited labels is a challenging task in the field of data mining and machine learning. In this paper, we propose the multiobjective semisupervised classifier ensemble (MOSSCE) approach to address this challenge. Specifically, a multiobjective subspace selection process (MOSSP) in MOSSCE is first designed to generate the optimal combination of feature subspaces.
IEEE Trans Cybern
February 2019
Traditional ensemble learning approaches explore the feature space and the sample space separately, which prevents them from constructing more powerful learning models for noisy real-world dataset classification. The random subspace method only searches over the selection of features. Meanwhile, the bagging approach only searches over the selection of samples.
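The difference between the two sampling schemes named above can be sketched in a few lines. This is an illustration of the generic random subspace and bagging ideas, not the approach proposed in the article; the helper names and the toy `data` matrix are hypothetical.

```python
import random

def bagging_indices(n_samples, rng):
    """Bagging: bootstrap sample of row indices, drawn with replacement."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]

def random_subspace_indices(n_features, k, rng):
    """Random subspace: k distinct feature (column) indices."""
    return sorted(rng.sample(range(n_features), k))

def take(data, rows, cols):
    """Select the given rows and columns from a list-of-lists matrix."""
    return [[data[r][c] for c in cols] for r in rows]

rng = random.Random(0)
# Toy data: 6 samples x 4 features (values are illustrative only).
data = [[10 * r + c for c in range(4)] for r in range(6)]

rows = bagging_indices(len(data), rng)
cols = random_subspace_indices(len(data[0]), 2, rng)

bagged = take(data, rows, range(4))    # resampled rows, all features
subspace = take(data, range(6), cols)  # all rows, subset of features
both = take(data, rows, cols)          # rows and features sampled together
```

Each base learner in an ensemble would be trained on one such view of the data; the `both` view shows what sampling the two spaces jointly looks like.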
IEEE Trans Image Process
March 2018
The difference between sample distributions of public data sets and specific scenes can be very significant. As a result, the deployment of generic human detectors in real-world scenes most often leads to sub-optimal detection performance. To avoid the labor-intensive task of manual annotations, we propose a semi-supervised approach for training deep convolutional networks on partially labeled data.
IEEE Trans Cybern
February 2019
High dimensional data classification with very limited labeled training data is a challenging task in the area of data mining. In order to tackle this task, we first propose a feature selection-based semi-supervised classifier ensemble framework (FSCE) to perform high dimensional data classification. Then, we design an adaptive semi-supervised classifier ensemble framework (ASCE) to improve the performance of FSCE.
IEEE/ACM Trans Comput Biol Bioinform
August 2019
In gene expression data analysis, the problems of cancer classification and gene selection are closely related. Successfully selecting informative genes will significantly improve the classification performance. To identify informative genes from a large number of candidate genes, various methods have been proposed.