Publications by authors named "Ngai-Fong Law"

Source camera identification has long been a hot topic in the field of image forensics. Besides conventional feature engineering algorithms based on the traces left during image capture, several deep-learning-based methods have also emerged recently. However, identification performance is susceptible to image content and is far from satisfactory for small image patches in demanding real-world applications.

Due to advances in DNA sequencing techniques, the number of sequenced individual genomes has grown exponentially. Thus, effective compression of such sequences is highly desirable. In this work, we present a novel compression algorithm called Reference-based Compression algorithm using the concept of Clustering (RCC).

Traditionally, intra-sequence similarity is exploited for compressing a single DNA sequence. Recently, remarkable compression performance has been achieved for individual DNA sequences from the same population by encoding each sequence's differences from a nearly identical reference sequence. Nevertheless, there is a lack of general algorithms that also allow less similar reference sequences.
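
The idea of encoding a sequence as its differences from a reference can be illustrated with a minimal sketch. It is not the algorithm from the article: the function names are hypothetical, and it assumes equal-length sequences that differ only by substitutions (no insertions or deletions).

```python
def encode_against_reference(target, reference):
    """Return (position, base) pairs where target differs from reference.

    Assumption for this sketch: both sequences have the same length and
    differ only by substitutions.
    """
    if len(target) != len(reference):
        raise ValueError("this sketch only handles equal-length sequences")
    return [(i, t) for i, (t, r) in enumerate(zip(target, reference)) if t != r]


def decode_with_reference(diffs, reference):
    """Rebuild the target by applying the stored substitutions to the reference."""
    seq = list(reference)
    for position, base in diffs:
        seq[position] = base
    return "".join(seq)


reference = "ACGTACGTAC"
target = "ACGTTCGTAA"
diffs = encode_against_reference(target, reference)
restored = decode_with_reference(diffs, reference)
```

The closer the reference is to the target, the shorter the difference list, which is why a less similar reference makes this naive scheme ineffective.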

DNA microarray experiments unavoidably generate gene expression data with missing values. This complicates subsequent analysis such as bicluster detection, which aims to find a set of genes that are co-expressed under some experimental conditions. Missing values therefore need to be estimated before bicluster detection.

In DNA microarray experiments, discovering groups of genes that share similar transcriptional characteristics is instrumental in functional annotation, tissue classification and motif identification. However, in many situations a subset of genes only exhibits a consistent pattern over a subset of conditions. Although used extensively in gene expression data analysis, conventional clustering algorithms that consider the entire row or column in an expression matrix can therefore fail to detect useful patterns in the data.

Microarray gene expression data generally suffer from the missing value problem due to a variety of experimental reasons. Since the missing data points can adversely affect downstream analysis, many algorithms have been proposed to impute missing values. In this survey, we provide a comprehensive review of existing missing value imputation algorithms, focusing on their underlying algorithmic techniques and on how they utilize local or global information from within the data, or domain knowledge, during imputation.
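
As an illustration of imputation that uses local information, the sketch below fills each missing entry with the average of that column's values in the k most similar genes (rows). It is a simplified nearest-neighbour scheme written for this listing, not a method taken from the survey; the function name, the use of `None` for missing entries, and the distance normalisation are all assumptions.

```python
import math


def knn_impute(matrix, k=2):
    """Fill None entries using the k nearest rows.

    Distance is the root mean squared difference over the columns that
    both rows have observed (an assumption of this sketch).
    """
    filled = [row[:] for row in matrix]
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(matrix):
                # A neighbour must be a different row with column j observed.
                if m == i or other[j] is None:
                    continue
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue
                dist = math.sqrt(sum((a - b) ** 2 for a, b in shared) / len(shared))
                candidates.append((dist, other[j]))
            candidates.sort(key=lambda c: c[0])
            nearest = candidates[:k]
            if nearest:
                filled[i][j] = sum(v for _, v in nearest) / len(nearest)
    return filled


expression = [
    [1.0, 2.0, None],    # gene with a missing value in the third condition
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 5.0],
    [9.0, 9.0, 100.0],   # dissimilar gene, should not influence the estimate
]
imputed = knn_impute(expression, k=2)
```

A global alternative would replace the neighbour search with a statistic computed over the whole matrix, such as the column mean.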

Current DNA compression algorithms work by finding similar repeated regions within the DNA sequence and then encoding these regions together to achieve compression. Our study on chromosome sequence similarity reveals that the length of similar repeated regions within one chromosome is about 4.5% of the total sequence length.

Background: DNA microarray technology allows the measurement of expression levels of thousands of genes under tens or hundreds of different conditions. In microarray data, genes with similar functions usually co-express under certain conditions only [1]. Thus, biclustering, which clusters genes and conditions simultaneously, is preferred over traditional clustering techniques for discovering these coherent genes.
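
To make the notion of a coherent subset of genes and conditions concrete, the sketch below scores a candidate bicluster with the mean squared residue, a coherence measure commonly used in the biclustering literature. The abstract does not say which measure the article uses, so this choice, along with the function name and the toy data, is an assumption.

```python
def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the submatrix given by rows and cols.

    A score of 0 means every selected gene follows the same additive
    pattern across the selected conditions.
    """
    sub = [[matrix[r][c] for c in cols] for r in rows]
    n_rows, n_cols = len(rows), len(cols)
    row_means = [sum(row) / n_cols for row in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall = sum(row_means) / n_rows
    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue = sub[i][j] - row_means[i] - col_means[j] + overall
            total += residue ** 2
    return total / (n_rows * n_cols)


# Toy expression matrix: the genes agree on conditions 0 and 1 only.
expression = [
    [1.0, 2.0, 9.0],
    [2.0, 3.0, 1.0],
    [3.0, 4.0, 5.0],
]
bicluster_score = mean_squared_residue(expression, [0, 1, 2], [0, 1])
full_score = mean_squared_residue(expression, [0, 1, 2], [0, 1, 2])
```

Here the pattern is perfect on the first two conditions but is hidden when all three columns are considered together, which is the situation that clustering on entire rows cannot capture.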

The multiscale directional filter bank (MDFB) improves the radial frequency resolution of the contourlet transform by introducing an additional decomposition in the high-frequency band. The increase in frequency resolution is particularly useful for texture description because of the quasi-periodic property of textures. However, the MDFB needs an extra set of scale and directional decomposition, which is performed on the full image size.

Z-curve features are among the most popular features used in exon/intron classification. We showed that although both the Z-curve and Fourier approaches are based on detecting 3-periodicity in coding regions, there are significant differences in their spectral formulations. From the spectral formulation of the Z-curve, we obtained three modified sequences that characterize different biological properties.
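
The 3-periodicity mentioned above can be illustrated with the Fourier approach: each base is turned into a binary indicator sequence and the spectral power at frequency 1/3 is summed over the four bases. This is a minimal sketch of that general idea, not the article's Z-curve formulation; the function name and the normalisation by sequence length are assumptions.

```python
import cmath


def period3_power(seq):
    """Spectral power at frequency 1/3, summed over the four bases.

    For each base, the indicator sequence (1 where the base occurs,
    0 elsewhere) is projected onto exp(-2*pi*i*n/3).
    """
    total = 0.0
    for base in "ACGT":
        coeff = sum(cmath.exp(-2j * cmath.pi * n / 3)
                    for n, symbol in enumerate(seq) if symbol == base)
        total += abs(coeff) ** 2
    return total / len(seq)


periodic = "ATG" * 20       # strong period-3 structure
non_periodic = "ACGT" * 15  # period 4, no period-3 component
strong = period3_power(periodic)
weak = period3_power(non_periodic)
```

A sequence with codon-like repetition produces a pronounced peak at this frequency, while the period-4 sequence contributes essentially nothing.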
