Publications by authors named "Harvard Wai Hann Hui"

Article Synopsis
  • Batch effects add variability to high-dimensional data, making accurate analysis difficult and possibly leading to wrong conclusions if not handled properly.
  • Despite advancements in technology and algorithms, managing batch effects effectively is still challenging and requires careful planning.
  • The paper emphasizes the need for a flexible approach in choosing batch effect correction algorithms, highlighting challenges like hidden batch factors, design imbalances, and the risks of over-correction, ultimately aiming to help researchers improve the reliability of their data analyses.

Missing values (MVs) can adversely impact data analysis and machine-learning model development. We propose a novel mixed-model method for missing value imputation (MVI). This method, ProJect (short for Protein inJection), is a powerful and meaningful improvement over existing MVI methods such as Bayesian principal component analysis (PCA), probabilistic PCA, local least squares and quantile regression imputation of left-censored data.

In data-processing pipelines, upstream steps can influence downstream processes because of their sequential nature. Among these data-processing steps, batch effect (BE) correction (BEC) and missing value imputation (MVI) are crucial for ensuring data suitability for advanced modeling and reducing the likelihood of false discoveries. Although BEC-MVI interactions are not well studied, the two steps are ultimately interdependent.
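
The interdependence of the two steps can be illustrated with a toy sketch. It is not the method or the data from the paper: mean imputation and per-batch mean centering are stand-ins chosen for simplicity, and all function names and numbers are hypothetical. One feature is measured in two batches, where batch "B" carries an additive offset of 5 and one of its values is missing.

```python
from statistics import mean

def impute_global(values):
    """Fill missing entries (None) with the mean of all observed values."""
    fill = mean(v for v in values if v is not None)
    return [fill if v is None else v for v in values]

def impute_per_batch(values, batches):
    """Fill missing entries with the mean of observed values in the same batch."""
    out = list(values)
    for b in set(batches):
        idx = [i for i, x in enumerate(batches) if x == b]
        fill = mean(values[i] for i in idx if values[i] is not None)
        for i in idx:
            if out[i] is None:
                out[i] = fill
    return out

def center_per_batch(values, batches):
    """Toy batch correction: subtract each batch's own mean."""
    out = list(values)
    for b in set(batches):
        idx = [i for i, x in enumerate(batches) if x == b]
        m = mean(values[i] for i in idx)
        for i in idx:
            out[i] = values[i] - m
    return out

# Hypothetical data: the missing entry (index 4) has a true value of 16.0,
# which would sit at 0.0 after centering a complete batch B.
batches = ["A", "A", "A", "B", "B", "B"]
values = [10.0, 11.0, 12.0, 15.0, None, 17.0]

global_then_bec = center_per_batch(impute_global(values), batches)
batchwise_then_bec = center_per_batch(impute_per_batch(values, batches), batches)

print(global_then_bec[4])     # imputed entry after correction, global imputation
print(batchwise_then_bec[4])  # imputed entry after correction, batch-aware imputation
```

In this toy case the globally imputed value ignores the batch offset, so the downstream correction shifts it away from where the true value would land, while the batch-aware fill does not have that problem. Real pipelines use more sophisticated imputation and correction methods, so this only illustrates the ordering effect.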

Data analysis is complex due to a myriad of technical problems. Among these, missing values and batch effects are endemic. Although many methods have been developed for missing value imputation (MVI) and for batch correction, no study has directly considered the confounding impact of MVI on downstream batch correction.

Proteomics data are often plagued by missingness. These missing values (MVs) threaten the integrity of subsequent statistical analyses by reducing statistical power, introducing bias, and failing to represent the true sample. Over the years, several categories of missing value imputation (MVI) methods have been developed and adapted for proteomics data.
