Publications by authors named "Chi Man Pun"

Generative methods are currently popular for medical report generation, as they automatically generate professional reports from input images, assisting physicians in making faster and more accurate decisions. However, current methods face significant challenges: 1) Lesion areas in medical images are often difficult for models to capture accurately, and 2) even when captured, these areas are frequently not described using precise clinical diagnostic terms. To address these problems, we propose a Visual-Linguistic Diagnostic Semantic Enhancement model (VLDSE) to generate high-quality reports.

Segmenting polyps from colonoscopy images is very important in clinical practice because it provides valuable information relevant to colorectal cancer. However, polyp segmentation remains challenging because polyps have camouflage properties and vary greatly in size. Although many recently proposed polyp segmentation methods have produced remarkable results, most of them cannot yield stable results because they lack both features with distinguishing properties and features with high-level semantic details.

Article Synopsis
  • Recent advancements in Deep Neural Networks (DNNs) have significantly improved medical image segmentation, particularly in accurately identifying lesions.
  • The traditional weighted summation operation in DNNs is insufficient for capturing crucial spatial relationships in multi-modal images, which is necessary for effective segmentation.
  • The proposed Quaternion Cross-modality Spatial Learning (Q-CSL) method introduces quaternion representation and a novel convolution technique to better explore spatial information and fuse multi-modal data, achieving strong performance with minimal computational resources.
Article Synopsis
  • MRI-based multi-modal brain tumor segmentation (MBTS) has gained interest due to the effectiveness of non-invasive imaging, but existing studies often struggle with limited data collection.
  • The authors introduce a novel quaternion mutual learning strategy (QMLS) that includes a voxel-wise lesion knowledge mutual learning mechanism and a quaternion multi-modal feature learning module, enhancing the model's ability to learn from sparse data.
  • QMLS significantly outperforms current methods in terms of performance and computational efficiency, making it a promising advancement for automatic brain tumor segmentation in clinical settings.
Article Synopsis
  • Pixels with location affinity (or "pixels of affinity") allow for improved semantic understanding in model computations, with techniques like group convolution and dilated convolution attempting to capitalize on this.
  • The proposed quaternion group convolution enhances group convolution by facilitating better communication between pixel affinities across channels, while the quaternion sawtooth wave-like dilated convolutions module (QS module) helps leverage pixel affinities both between and within layers.
  • The new method, known as the quaternion group dilated neural network (QGD-Net), effectively reduces model parameters and boosts precision in tasks like Dermoscopic Lesion Segmentation and shows promise in retinal vessel segmentation as well.
Article Synopsis
  • Patch-level histological tissue classification is important for analyzing histological slides, but deep learning methods often face high costs for annotating data.
  • To tackle these challenges, the authors introduce an active learning framework called ICAL, which uses two novel techniques: Incorrectness Negative Pre-training (INP) and Category-wise Curriculum Querying (CCQ) to improve category performance balance.
  • Experimental results show that ICAL can deliver nearly the same performance as full supervision using less than 16% of the labeled data while outperforming existing active learning methods, especially in achieving balanced results across categories.

Recent years have witnessed significant advances in brain imaging techniques that offer a non-invasive approach to mapping the structure and function of the brain. Concurrently, generative artificial intelligence (AI) has grown substantially; it uses existing data to create new content whose underlying pattern resembles that of real-world data. The integration of these two domains, generative AI in neuroimaging, presents a promising avenue for exploring various fields of brain imaging and brain network computing, particularly in the areas of extracting spatiotemporal brain features and reconstructing the topological connectivity of brain networks.

Parkinson's disease is a common neurodegenerative disease worldwide, especially among middle-aged and elderly people. Clinical diagnosis is currently the main way to diagnose Parkinson's disease, but its results are not ideal, especially in the early stage of the disease. In this paper, an auxiliary diagnosis algorithm for Parkinson's disease is proposed, based on a hyperparameter optimization method for deep learning.

Accurate and automatic segmentation of medical images is crucial for improving the efficiency of disease diagnosis and making treatment plans. Although methods based on convolutional neural networks have achieved excellent results in numerous segmentation tasks of medical images, they still suffer from challenges including drastic scale variations of lesions, blurred boundaries of lesions and class imbalance.

In recent years, methods based on U-shaped structures and skip connections have achieved remarkable results in many medical semantic segmentation tasks. However, the information integration capability of this structure is still limited because the feature maps of the encoding and decoding stages are incompatible at corresponding levels and valid information is not extracted in the final stage of encoding. This structural defect is particularly obvious in segmentation tasks whose targets are inconspicuous, small, or blurred at the edges.

Purpose: To assist physicians in the diagnosis and treatment planning of tumors, a robust and automatic liver and tumor segmentation method is in high demand in clinical practice. Recently, numerous researchers have improved the segmentation accuracy of liver and tumor by introducing multiscale contextual information and attention mechanisms. However, this tends to introduce more training parameters and impose a heavier computational burden.

Improved Normalized Cut for Multi-View Clustering.

IEEE Trans Pattern Anal Mach Intell

December 2022

Spectral clustering (SC) algorithms have been successful in discovering meaningful patterns since they can group arbitrarily shaped data structures. Traditional SC approaches typically consist of two sequential stages, i.e.

Kinship verification from facial images has been recognized as an emerging yet challenging technique in many potential computer vision applications. In this paper, we propose a novel cross-generation feature interaction learning (CFIL) framework for robust kinship verification. In particular, an effective collaborative weighting strategy is constructed to explore the characteristics of cross-generation relations by jointly extracting features from parent and child image pairs.

The artificial bee colony (ABC) algorithm shows a relatively powerful exploration search capability but is constrained by the curse of dimensionality, especially on nonseparable functions, where its convergence speed slows dramatically. In this article, based on an analysis of the difference between the all-variable and one-variable updating mechanisms, we find that when equipped with the former strategy, the algorithm rapidly converges to an optimal region, while with the latter strategy, it searches the solution space thoroughly. To utilize multivariable and one-variable updating mechanisms on nonseparable and separable functions, respectively, we embed an improved linkage identification strategy into the ABC by detecting the linkage between variables more effectively.
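
The contrast between the two updating mechanisms can be sketched as follows. This is a minimal illustration that assumes the textbook ABC candidate rule v_j = x_j + phi * (x_j - k_j) with phi drawn uniformly from [-1, 1]; the function names are hypothetical, and the linkage identification strategy from the abstract is not reproduced here.

```python
import random

def one_variable_update(x, partner, rng):
    """Textbook ABC move: perturb one randomly chosen dimension only."""
    v = list(x)
    j = rng.randrange(len(x))
    phi = rng.uniform(-1.0, 1.0)
    v[j] = x[j] + phi * (x[j] - partner[j])
    return v

def all_variable_update(x, partner, rng):
    """Perturb every dimension in the same step (assumed all-variable form)."""
    return [xi + rng.uniform(-1.0, 1.0) * (xi - pi) for xi, pi in zip(x, partner)]

rng = random.Random(0)              # fixed seed so the run is repeatable
x = [1.0, 2.0, 3.0, 4.0]            # current food source
partner = [0.5, 1.5, 2.5, 3.5]      # randomly selected neighbour

one = one_variable_update(x, partner, rng)
full = all_variable_update(x, partner, rng)

# Count how many coordinates each mechanism moved.
changed_one = sum(a != b for a, b in zip(x, one))
changed_all = sum(a != b for a, b in zip(x, full))
print(changed_one, changed_all)
```

The one-variable move changes at most one coordinate per step, which suits separable functions, whereas the all-variable move shifts the whole vector at once.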

Image composition is one of the most important applications in image processing. However, the inharmonious appearance between the spliced region and the background degrades the quality of the image. Thus, we address the problem of Image Harmonization: Given a spliced image and the mask of the spliced region, we try to harmonize the "style" of the pasted region with the background (non-spliced region).

Research on novel viewpoint synthesis is based mainly on multiview input images. In this paper, we focus on a more challenging and ill-posed problem: synthesizing surrounding novel viewpoints from a single image. To achieve this goal, we design a full resolution network to extract fine-scale image features, which helps prevent blurry artifacts.

In this paper, a novel imbalance learning method for binary classes is proposed, named Post-Boosting of classification boundary for Imbalanced data (PBI), which can significantly improve the performance of the classification boundary of any trained neural network (NN). The PBI procedure consists of two steps: an (imbalanced) NN learning method is first applied to produce a classification boundary, which is then adjusted by PBI under the geometric mean (G-mean). For imbalanced data, the geometric mean of the accuracies of the minority and majority classes is considered, which is statistically more suitable than the common accuracy metric.
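
To make the metric concrete, the sketch below computes G-mean for binary labels and shows why plain accuracy can mislead on imbalanced data. The `best_threshold` helper is a hypothetical stand-in that simply picks the score cut-off maximizing G-mean; it is not the PBI adjustment itself, and the toy labels and scores are invented for illustration.

```python
import math

def g_mean(y_true, y_pred):
    """Geometric mean of the minority-class and majority-class accuracies (labels 0/1)."""
    pos = [p for t, p in zip(y_true, y_pred) if t == 1]
    neg = [p for t, p in zip(y_true, y_pred) if t == 0]
    tpr = sum(1 for p in pos if p == 1) / len(pos)
    tnr = sum(1 for p in neg if p == 0) / len(neg)
    return math.sqrt(tpr * tnr)

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def best_threshold(scores, y_true):
    """Stand-in boundary adjustment: choose the cut-off with the highest G-mean."""
    candidates = sorted(set(scores))
    return max(candidates,
               key=lambda t: g_mean(y_true, [int(s >= t) for s in scores]))

# A classifier that always predicts the majority class looks accurate
# yet has zero G-mean, because the minority accuracy is zero.
y_true = [0] * 95 + [1] * 5
always_majority = [0] * 100
acc = accuracy(y_true, always_majority)
gm = g_mean(y_true, always_majority)

# Shifting the decision threshold away from 0.5 can raise G-mean.
scores = [0.05, 0.1, 0.2, 0.3, 0.35, 0.4, 0.45, 0.6]
labels = [0, 0, 0, 0, 0, 1, 0, 1]
default_gm = g_mean(labels, [int(s >= 0.5) for s in scores])
chosen = best_threshold(scores, labels)
tuned_gm = g_mean(labels, [int(s >= chosen) for s in scores])
print(acc, gm, default_gm, chosen, tuned_gm)
```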

Being a powerful appearance model, compressive random projection derives effective Haar-like features from non-rotated 4-D-parameterized rectangles, thus supporting fast and reliable object tracking. In this paper, we show that such a successful fast compressive tracking scheme can be significantly improved further by structural regularization and online data-driven sampling. Our major contribution is threefold.

Joint sparse representation (JSR) has shown great potential in various image processing and computer vision tasks. Nevertheless, conventional JSR is sensitive to outliers. In this paper, we propose a weighted JSR (WJSR) model to simultaneously encode a set of data samples that are drawn from the same subspace but corrupted with noise and outliers.

Chaotic maps are widely used in different applications. Motivated by the cascade structure in electronic circuits, this paper introduces a general chaotic framework called the cascade chaotic system (CCS). Using two 1-D chaotic maps as seed maps, CCS is able to generate a huge number of new chaotic maps.
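
The cascade idea can be sketched by composing two 1-D seed maps so that the output of one feeds the input of the other. This is an illustration under assumptions: the logistic and tent maps, their parameter values, the order of composition, and all function names are chosen here for demonstration and are not taken from the paper.

```python
def logistic(x, r=3.99):
    """Logistic seed map on [0, 1] (parameter value is an assumption)."""
    return r * x * (1.0 - x)

def tent(x, mu=1.99):
    """Tent seed map on [0, 1] (parameter value is an assumption)."""
    return mu * x if x < 0.5 else mu * (1.0 - x)

def cascade(f, g):
    """Build a new map by feeding the output of g into f."""
    return lambda x: f(g(x))

def orbit(step, x0, n):
    """Iterate a map n times from x0 and return the whole trajectory."""
    xs = [x0]
    for _ in range(n):
        xs.append(step(xs[-1]))
    return xs

# Two seed maps give several cascades: tent-then-logistic, logistic-then-tent,
# or either map cascaded with itself.
tent_logistic = cascade(logistic, tent)
xs = orbit(tent_logistic, 0.3, 1000)
print(xs[:5])
```

Swapping the arguments of `cascade`, or changing either parameter, yields a different map, which is one way to read the claim that two seed maps can generate many new chaotic maps.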

Article Synopsis
  • The proposed system uses a superpixel-based approach to track hand gestures drawn in free air by analyzing their motion trajectories in video sequences.
  • It employs motion detection and unsupervised image segmentation to identify the moving hand, while also creating a hand appearance model using surrounding superpixels.
  • The tracking algorithm is robust against challenges like hand deformation and background confusion, achieving high recognition accuracy of 99.17% for easy gestures and 98.57% for harder gestures, outperforming existing methods.

The wide availability of image processing software makes counterfeiting an easy and low-cost way to distort or conceal facts. Driven by the great need for valid forensic techniques, many methods have been proposed to expose such forgeries. In this paper, we propose an integrated algorithm that can detect two commonly used fraud practices in digital pictures: copy-move and splicing forgery.

An effective shift invariant wavelet feature extraction method for classification of images with different sizes is proposed. The feature extraction process involves a normalization followed by an adaptive shift invariant wavelet packet transform. An energy signature is computed for each subband of these invariant wavelet coefficients.
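
The idea of an energy signature per subband can be sketched with a single-level 2-D Haar transform. This is a simplified stand-in: the normalization step and the adaptive shift invariant wavelet packet transform from the abstract are not implemented, the energy is assumed to be the mean squared coefficient, and the subband names are chosen here for readability.

```python
def haar_level(img):
    """One level of an orthonormal 2-D Haar transform.

    img is a list of rows with even height and width. Each 2x2 block
    yields one coefficient in each of the four subbands.
    """
    bands = {"approx": [], "col_diff": [], "row_diff": [], "diagonal": []}
    for i in range(0, len(img), 2):
        for j in range(0, len(img[0]), 2):
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            bands["approx"].append((a + b + c + d) / 2.0)
            bands["col_diff"].append((a - b + c - d) / 2.0)   # change across columns
            bands["row_diff"].append((a + b - c - d) / 2.0)   # change across rows
            bands["diagonal"].append((a - b - c + d) / 2.0)
    return bands

def energy_signature(bands):
    """Assumed energy measure: mean squared coefficient per subband."""
    return {name: sum(v * v for v in coeffs) / len(coeffs)
            for name, coeffs in bands.items()}

# Vertical stripes vary only across columns, so the energy lands in the
# approximation and column-difference subbands.
stripes = [[1, 0, 1, 0] for _ in range(4)]
sig = energy_signature(haar_level(stripes))
print(sig)
```

A classifier would then use the vector of subband energies as the feature describing the image texture.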
