IEEE J Biomed Health Inform
July 2024
Efficient medical image segmentation aims to provide accurate pixel-wise predictions with a lightweight implementation framework. However, existing lightweight networks generally overlook the generalizability of the cross-domain medical segmentation tasks. In this paper, we propose Generalizable Knowledge Distillation (GKD), a novel framework for enhancing the performance of lightweight networks on cross-domain medical segmentation by generalizable knowledge distillation from powerful teacher networks.
IEEE Trans Pattern Anal Mach Intell
December 2024
IEEE Trans Pattern Anal Mach Intell
December 2024
IEEE Trans Pattern Anal Mach Intell
January 2024
Human faces contain rich semantic information that could hardly be described without a large vocabulary and complex sentence patterns. However, most existing text-to-image synthesis methods could only generate meaningful results based on limited sentence templates with words contained in the training set, which heavily impairs the generalization ability of these models. In this paper, we define a novel 'free-style' text-to-face generation and manipulation problem, and propose an effective solution, named AnyFace++, which is applicable to a much wider range of open-world scenarios.
Facial Attribute Manipulation (FAM) aims to aesthetically modify a given face image to render desired attributes, which has received significant attention due to its broad practical applications ranging from digital entertainment to biometric forensics. In the last decade, with the remarkable success of Generative Adversarial Networks (GANs) in synthesizing realistic images, numerous GAN-based models have been proposed to solve FAM with various problem formulation approaches and guiding information representations. This paper presents a comprehensive survey of GAN-based FAM methods with a focus on summarizing their principal motivations and technical details.
IEEE Trans Pattern Anal Mach Intell
December 2023
Occlusion is a common problem with biometric recognition in the wild. The generalization ability of CNNs greatly decreases due to the adverse effects of various occlusions. To this end, we propose a novel unified framework integrating the merits of both CNNs and graph models to overcome occlusion problems in biometric recognition, called multiscale dynamic graph representation (MS-DGR).
IEEE Trans Pattern Anal Mach Intell
October 2023
We present PyMAF-X, a regression-based approach to recovering a parametric full-body model from a single image. This task is very challenging since minor parametric deviation may lead to noticeable misalignment between the estimated mesh and the input image. Moreover, when integrating part-specific estimations into the full-body model, existing solutions tend to either degrade the alignment or produce unnatural wrist poses.
IEEE/ACM Trans Comput Biol Bioinform
August 2024
Clinical management and accurate disease diagnosis are evolving from the qualitative stage to the quantitative stage, particularly at the cellular level. However, the manual process of histopathological analysis is labor-intensive and time-consuming. Meanwhile, its accuracy is limited by the experience of the pathologist.
IEEE Trans Image Process
December 2022
Recent studies of video action recognition can be classified into two categories: the appearance-based methods and the pose-based methods. The appearance-based methods generally cannot model temporal dynamics of large motion well by virtue of optical flow estimation, while the pose-based methods ignore the visual context information such as typical scenes and objects, which are also important cues for action understanding. In this paper, we tackle these problems by proposing a Pose-Appearance Relational Network (PARNet), which models the correlation between human pose and image appearance, and combines the benefits of these two modalities to improve the robustness towards unconstrained real-world videos.
IEEE Trans Med Imaging
April 2023
With the development of deep convolutional neural networks, medical image segmentation has achieved a series of breakthroughs in recent years. However, high-performance convolutional neural networks always mean numerous parameters and high computation costs, which will hinder the applications in resource-limited medical scenarios. Meanwhile, the scarceness of large-scale annotated medical image datasets further impedes the application of high-performance networks.
IEEE Trans Image Process
July 2022
One major issue that challenges person re-identification (Re-ID) is the ubiquitous occlusion over the captured persons. There are two main challenges for the occluded person Re-ID problem, i.e.
IEEE Trans Pattern Anal Mach Intell
May 2022
Reconstructing 3D human shape and pose from monocular images is challenging despite the promising results achieved by the most recent learning-based methods. The commonly occurring misalignment stems from two facts: the mapping from images to the model space is highly non-linear, and the rotation-based pose representation of the body model is prone to drift in joint positions. In this work, we investigate learning 3D human shape and pose from dense correspondences of body parts and propose a Decompose-and-aggregate Network (DaNet) to address these issues.
IEEE Trans Pattern Anal Mach Intell
May 2020
Near infrared-visible (NIR-VIS) heterogeneous face recognition refers to the process of matching NIR to VIS face images. Current heterogeneous methods try to extend VIS face recognition methods to the NIR spectrum by synthesizing VIS images from NIR images. However, due to the self-occlusion and sensing gap, NIR face images lose some visible lighting contents so that they are always incomplete compared to VIS face images.
IEEE Trans Image Process
September 2019
Binocular stereo vision (SV) has been widely used to reconstruct the depth information, but it is quite vulnerable to scenes with strong occlusions. As an emerging computational photography technology, light-field (LF) imaging brings about a novel solution to passive depth perception by recording multiple angular views in a single exposure. In this paper, we explore binocular SV and LF imaging to form the binocular-LF imaging system.
IEEE Trans Image Process
April 2019
Regression based methods have revolutionized 2D landmark localization with the exploitation of deep neural networks and massive annotated datasets in the wild. However, it remains challenging for 3D landmark localization due to the lack of annotated datasets and the ambiguous nature of landmarks under 3D perspective. This paper revisits regression based methods and proposes an adversarial voxel and coordinate regression framework for 2D and 3D facial landmark localization in real-world scenarios.
IEEE Trans Image Process
December 2018
Hashing has attracted increasing attention due to its tremendous potential for efficient image retrieval and data storage. Compared with conventional hashing methods that rely on handcrafted features, emerging deep hashing approaches employ deep neural networks to learn feature representations as well as hash functions, and have already proved more powerful and robust in real-world applications. Currently, most existing deep hashing methods construct pairwise or triplet-wise constraints to obtain similar binary codes for similar data pairs or relatively similar binary codes within a triplet.
Partial face recognition (PFR) in an unconstrained environment is a very important task, especially in situations where partial face images are likely to be captured due to occlusions, out-of-view, and large viewing angle, e.g., video surveillance and mobile devices.
IEEE Trans Pattern Anal Mach Intell
July 2019
Heterogeneous face recognition (HFR) aims at matching facial images acquired from different sensing modalities with mission-critical applications in forensics, security and commercial sectors. However, HFR presents more challenging issues than traditional face recognition because of the large intra-class variation among heterogeneous face images and the limited availability of training samples of cross-modality face image pairs. This paper proposes the novel Wasserstein convolutional neural network (WCNN) approach for learning invariant features between near-infrared (NIR) and visual (VIS) face images (i.
IEEE Trans Pattern Anal Mach Intell
May 2019
Unsupervised domain adaptation aims to leverage the labeled source data to learn with the unlabeled target data. Previous transductive methods tackle it by iteratively seeking a low-dimensional projection to extract the invariant features and obtaining the pseudo target labels via building a classifier on source data. However, they merely concentrate on minimizing the cross-domain distribution divergence, while ignoring the intra-domain structure, especially for the target domain.
IEEE Trans Image Process
September 2018
The low spatial resolution of light-field images poses significant difficulties in exploiting their advantages. To mitigate the dependency on accurate depth or disparity information as priors for light-field image super-resolution, we propose an implicitly multi-scale fusion scheme to accumulate contextual information from multiple scales for super-resolution reconstruction. The implicitly multi-scale fusion scheme is then incorporated into a bidirectional recurrent convolutional neural network, which aims to iteratively model spatial relations between horizontally or vertically adjacent sub-aperture images of light-field data.
IEEE Trans Neural Netw Learn Syst
July 2017
Feature selection (FS) is an important component of many pattern recognition tasks. In these tasks, one is often confronted with very high-dimensional data. FS algorithms are designed to identify the relevant feature subset from the original features, which can facilitate subsequent analysis, such as clustering and classification.
Learning-based hashing algorithms are "hot topics" because they can greatly increase the scale at which existing methods operate. In this paper, we propose a new learning-based hashing method called "fast supervised discrete hashing" (FSDH) based on "supervised discrete hashing" (SDH). Regressing the training examples (or hash code) to the corresponding class labels is widely used in ordinary least squares regression.
IEEE Trans Pattern Anal Mach Intell
February 2018
Biometrics is the technique of automatically recognizing individuals based on their biological or behavioral characteristics. Various biometric traits have been introduced and widely investigated, including fingerprint, iris, face, voice, palmprint, gait and so forth. Apart from identity, biometric data may convey various other personal information, covering affect, age, gender, race, accent, handedness, height, weight, etc.
Data-dependent hashing has recently attracted attention due to being able to support efficient retrieval and storage of high-dimensional data, such as documents, images, and videos. In this paper, we propose a novel learning-based hashing method called "supervised discrete hashing with relaxation" (SDHR) based on "supervised discrete hashing" (SDH). SDH uses ordinary least squares regression and traditional zero-one matrix encoding of class label information as the regression target (code words), thus fixing the regression target.
This paper addresses the problem of grouping the data points sampled from a union of multiple subspaces in the presence of outliers. Information theoretic objective functions are proposed that combine structured low-rank representations (LRRs), which capture the global structure of the data, with information theoretic measures, which handle outliers. In the theoretical part, we point out that group sparsity-induced measures ( l -norm, l -norm, and correntropy) can be justified from the viewpoint of half-quadratic (HQ) optimization, which facilitates both convergence study and algorithmic development.