Publications by authors named "William K Cheung"

Single-cell ribonucleic acid sequencing (scRNA-seq) enables the quantification of gene expression at the transcriptomic level with single-cell resolution, enhancing our understanding of cellular heterogeneity. However, the excessive missing values present in scRNA-seq data hinder downstream analysis. While numerous imputation methods have been proposed to recover scRNA-seq data, high imputation performance often comes with low or no interpretability.

Despite the recent success of deep learning models for text generation, generating clinically accurate reports remains challenging. Modeling the relationships among the abnormalities revealed in an X-ray image more precisely has shown promise for improving clinical accuracy. In this paper, we first introduce a novel knowledge graph structure called an attributed abnormality graph (ATAG).

Time-course single-cell RNA sequencing (scRNA-seq) data have been widely used to explore dynamic changes in the gene expression of transcription factors (TFs) and their target genes. This information is useful for reconstructing cell-type-specific gene regulatory networks (GRNs). However, existing tools are commonly designed to analyze either time-course bulk gene expression data or static scRNA-seq data via pseudo-time cell ordering.

Variational autoencoders (VAEs) are a class of effective deep generative models whose objective is to approximate the true but unknown data distribution. VAEs use latent variables to capture high-level semantics, so that the data can be reconstructed well from informative latent variables. Yet training VAEs tends to suffer from posterior collapse when the decoder is parameterized by an autoregressive model for sequence generation.
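
For readers unfamiliar with the term, posterior collapse can be read off the standard VAE training objective, the evidence lower bound (ELBO). The notation below (encoder q_phi, decoder p_theta, prior p(z)) is the conventional textbook notation and is an assumption here; it is not taken from the abstract.

```latex
% Standard ELBO for a VAE (textbook form; notation assumed, not from the abstract)
\log p_\theta(x) \;\ge\;
  \mathbb{E}_{q_\phi(z \mid x)}\!\left[ \log p_\theta(x \mid z) \right]
  \;-\; \mathrm{KL}\!\left( q_\phi(z \mid x) \,\|\, p(z) \right)

% Posterior collapse: the KL term is driven to zero,
%   q_\phi(z \mid x) \approx p(z) \quad \text{for all } x,
% so z carries no information about x and a sufficiently powerful
% (e.g. autoregressive) decoder models x without using z.
```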

Metagenomic sequencing provides a culture-independent avenue to investigate the complex microbial communities by constructing metagenome-assembled genomes (MAGs). A MAG represents a microbial genome by a group of sequences from genome assembly with similar characteristics. It enables us to identify novel species and understand their potential functions in a dynamic ecosystem.

Single-cell RNA sequencing has made it possible to capture gene activity at single-cell resolution, thus allowing the reconstruction of cell-type-specific gene regulatory networks (GRNs). The available algorithms for reconstructing GRNs are commonly designed for bulk RNA-seq data, and few of them can be applied to scRNA-seq data, which requires handling dropout events and cellular heterogeneity. In this paper, we represent the joint gene expression distribution of a gene pair as an image and propose a novel supervised deep neural network called DeepDRIM, which uses the image of the target TF-gene pair and those of its potential neighbors to reconstruct GRNs from scRNA-seq data.

Objective: Current guidelines have different recommendations on applying pharmacological interventions for managing cancer-related fatigue (CRF) among cancer survivors. This systematic review aims to synthesize clinical evidence on pharmacological interventions for managing CRF.

Methods: Five databases were searched for potential randomized controlled trials (RCTs) from their inception until October 2020.

Binary data with one-class missing values are ubiquitous in real-world applications. They can be represented by irregular tensors with varying sizes in one dimension, where value one means presence of a feature while zero means unknown (i.e.

Background: Existing systematic reviews have compared nurse-led peri-discharge interventions comprising different components with usual care in terms of their effectiveness in reducing all-cause 30-day hospital readmissions. However, conflicting results were reported.

Objective: We conducted a network meta-analysis to evaluate the comparative effectiveness of different nurse-led peri-discharge interventions, compared with usual care, for reducing all-cause 30-day hospital readmissions.

Clinical time series imputation is recognized as an essential task in clinical data analytics. Most models rely either on strong assumptions regarding the underlying data-generation process or on preservation of only local properties without effective consideration of global dependencies. To advance the state of the art in clinical time series imputation, we participated in the 2019 ICHI Data Analytics Challenge on Missing Data Imputation (DACMI).

Objective: To synthesize evidence on the effectiveness of acupuncture and related therapies for primary carpal tunnel syndrome (CTS) by conducting a systematic review of randomized controlled trials (RCTs).

Data Sources: Nine databases were searched for potential RCTs from their inception till July 2019.

Review Methods: RCTs which reported at least one of the three outcomes were included: symptom severity, functional status and pain.

Background and Aim: Treatment options for functional dyspepsia (FD) refractory to pharmacological treatments are limited, and the effectiveness of electroacupuncture (EA) is uncertain. We assessed the effectiveness of EA combined with on-demand gastrocaine.

Methods: We conducted a single-center, assessor-blind, randomized, parallel-group, 2-arm trial on Helicobacter pylori-negative FD patients of the postprandial distress syndrome subtype refractory to proton pump inhibitors, prokinetics, or H2 antagonists.

Authorship analysis (AA) is the study of unveiling the hidden properties of authors from textual data. It extracts an author's identity and sociolinguistic characteristics based on the writing style reflected in the text. The process is essential for various areas, such as cybercrime investigation, psycholinguistics, and political socialization.

Diffusion on social networks refers to the process where opinions are spread via the connected nodes. Given a set of observed information cascades, one can infer the underlying diffusion process for social network analysis. The independent cascade model (IC model) is a widely adopted diffusion model where a node is assumed to be activated independently by any one of its neighbors.
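
The IC model described above can be illustrated with a minimal forward simulation. This is only a sketch under stated assumptions: the function name `simulate_independent_cascade`, the adjacency-dict graph format, and the per-edge probability table are hypothetical choices for illustration, and the code simulates a cascade rather than performing the inference from observed cascades that the abstract is concerned with.

```python
import random
from collections import deque


def simulate_independent_cascade(graph, seeds, prob, rng=None):
    """Simulate one cascade under the independent cascade (IC) model.

    graph: dict mapping a node to a list of its out-neighbors (assumed format).
    seeds: iterable of initially active nodes.
    prob:  dict mapping an edge (u, v) to its activation probability;
           edges missing from the dict are treated as probability 0.
    Returns the set of nodes active when the cascade stops.
    """
    rng = rng or random.Random()
    active = set(seeds)
    frontier = deque(active)
    while frontier:
        u = frontier.popleft()
        # A newly activated node gets exactly one chance to activate
        # each inactive neighbor, independently of all other attempts.
        for v in graph.get(u, []):
            if v not in active and rng.random() < prob.get((u, v), 0.0):
                active.add(v)
                frontier.append(v)
    return active


# Toy chain a -> b -> c with hypothetical edge probabilities.
toy_graph = {"a": ["b"], "b": ["c"], "c": []}
certain = {("a", "b"): 1.0, ("b", "c"): 1.0}
blocked = {("a", "b"): 0.0, ("b", "c"): 1.0}
all_reached = simulate_independent_cascade(toy_graph, ["a"], certain)
only_seed = simulate_independent_cascade(toy_graph, ["a"], blocked)
```

With probability 1.0 on every edge the whole chain is activated, while a zero-probability first edge leaves only the seed active; intermediate probabilities give random cascades whose statistics depend on the edge probabilities.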

The CRISPR/Cas9 system is a powerful genome editing tool and has been widely used for biomedical research. However, many challenges, such as off-target effects and lack of easy solutions for multiplex targeting, are still limiting its applications. To overcome these challenges, we first developed a highly efficient doxycycline-inducible Cas9-EGFP vector.

Interferon-induced transmembrane protein 1 (IFITM1) is a member of the interferon-induced transmembrane protein family. In this study, we report that elevated IFITM1 expression in human colorectal cancer (CRC) correlates significantly with CRC lymph node and distant metastasis as well as with a more advanced clinical stage. Importantly, elevated IFITM1 expression is an independent prognostic factor for poor survival.

Background: A medical data set is commonly composed of a minority (positive or abnormal) group and a majority (negative or normal) group, and the cost of misclassifying a minority sample as a majority sample is very high. This is the so-called imbalanced classification problem. Traditional classification functions can be seriously affected by the skewed class distribution in the data.
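
A small worked example shows why a skewed class distribution misleads standard evaluation. It is a sketch only: the 95/5 class split, the cost values (10 per missed minority case, 1 per false alarm), and the helper names are hypothetical and do not come from the paper.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def misclassification_cost(y_true, y_pred, fn_cost=10.0, fp_cost=1.0):
    """Total cost with asymmetric penalties (cost values are assumptions).

    Label 1 is the minority (positive/abnormal) class and label 0 the
    majority class. A false negative, i.e. a minority sample predicted
    as majority, is penalized more heavily than a false positive.
    """
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return fn * fn_cost + fp * fp_cost


# Hypothetical data set: 95 majority samples and 5 minority samples.
y_true = [0] * 95 + [1] * 5
# A trivial classifier that always predicts the majority class.
always_majority = [0] * 100

trivial_accuracy = accuracy(y_true, always_majority)
trivial_cost = misclassification_cost(y_true, always_majority)
```

The trivial classifier reaches 95% accuracy while missing every minority case, which the cost-weighted measure exposes.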

The transmission of infectious diseases can be affected by many factors, some of them hidden, making it difficult to accurately predict when and where outbreaks may emerge. One current approach is to develop and deploy surveillance systems in an effort to detect outbreaks as early as possible. This enables policy makers to modify and implement strategies for controlling transmission.

Metastasis is a major clinical challenge for cancer treatment. Emerging evidence suggests that aberrant epigenetic modifications contribute significantly to tumor formation and progression. However, the drivers and roles of such epigenetic changes in tumor metastasis are still poorly understood.

Background: To investigate transmission patterns of an infectious disease, e.g., malaria, it is desirable to use the observed surveillance data to discover the underlying (often hidden) disease transmission networks.

Molecular programs that mediate normal cell differentiation are required for oncogenesis and tumor cell survival in certain cancers. How cell-lineage-restricted genes specifically influence metastasis is poorly defined. In lung cancers, we uncovered a transcriptional program that is preferentially associated with distal airway epithelial differentiation and lung adenocarcinoma (ADC) progression.

Latent Dirichlet allocation (LDA) is an important hierarchical Bayesian model for probabilistic topic modeling, which has attracted worldwide interest and has many important applications in text mining, computer vision and computational biology. This paper represents the collapsed LDA as a factor graph, which enables the classic loopy belief propagation (BP) algorithm to be used for approximate inference and parameter estimation. Although two commonly used approximate inference methods, variational Bayes (VB) and collapsed Gibbs sampling (GS), have been very successful in learning LDA, the proposed BP is competitive in both speed and accuracy, as validated by encouraging experimental results on four large-scale document datasets.

Malaria transmission can be affected by multiple factors, some of them hidden, making it difficult to predict, in a timely and accurate manner, the impact of the elimination and eradication programs that have been undertaken and the potential resurgence and spread that may continue to emerge. One current approach is to develop and deploy surveillance systems in an attempt to identify resurgence and spread as early as possible, and thus to enable policy makers to modify and implement strategies for preventing further transmission. Most of the surveillance data will be temporal and spatial in nature.
