Publications by authors named "Cheekeong Kwoh"

Studies have shown that protein glycosylation in cells reflects the real-time dynamics of biological processes, and the occurrence and development of many diseases are closely related to protein glycosylation. Abnormal protein glycosylation can be used as a potential diagnostic and prognostic marker of a disease, as well as a therapeutic target and a new breakthrough point for exploring pathogenesis. To address the issue of significant differences in the prediction results of previous models for different species, we constructed a hybrid deep learning model N-GlycoPred on the basis of dual-layer convolution, a paired attention mechanism and BiLSTM for accurate identification of N-glycosylation sites.

Learning time-series representations when only unlabeled data or few labeled samples are available can be a challenging task. Recently, contrastive self-supervised learning has shown great improvement in extracting useful representations from unlabeled data via contrasting different augmented views of data. In this work, we propose a novel Time-Series representation learning framework via Temporal and Contextual Contrasting (TS-TCC) that learns representations from unlabeled data with contrastive learning.
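
As a rough illustration of what "contrasting different augmented views" means, the sketch below computes an InfoNCE-style contrastive loss for a single anchor embedding. It is not taken from TS-TCC: the function names `cosine` and `info_nce`, the temperature value, and the toy two-dimensional embeddings are all assumptions made for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.2):
    """InfoNCE-style loss for one anchor (hypothetical helper).

    The positive is the embedding of another augmented view of the same
    sample; the negatives are embeddings of views of other samples.
    """
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


# Toy embeddings: the loss is small when the positive view lies close to
# the anchor and larger when a negative lies closer than the positive.
aligned = info_nce([1.0, 0.0], [0.9, 0.1], [[0.0, 1.0], [-1.0, 0.0]])
misaligned = info_nce([1.0, 0.0], [0.0, 1.0], [[0.9, 0.1], [-1.0, 0.0]])
print(aligned, misaligned)
```

Minimizing a loss of this kind pulls embeddings of views of the same sample together and pushes views of different samples apart, which is the general mechanism the abstract refers to.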

The past few years have witnessed a remarkable advance in deep learning for EEG-based sleep stage classification (SSC). However, the success of these models is attributed to possessing a massive amount of labeled data for training, limiting their applicability in real-world scenarios. In such scenarios, sleep labs can generate a massive amount of data, but labeling can be expensive and time-consuming.

Unsupervised domain adaptation (UDA) has successfully addressed the domain shift problem for visual applications. Yet, these approaches may have limited performance for time series data due to the following reasons. First, they mainly rely on the large-scale dataset (i.

The rapid emergence of high-dimensional data in various areas has brought new challenges to current ensemble clustering research. To deal with the curse of dimensionality, recently considerable efforts in ensemble clustering have been made by means of different subspace-based techniques. However, besides the emphasis on subspaces, rather limited attention has been paid to the potential diversity in similarity/dissimilarity metrics.

Automatic sleep stage classification is of great importance for measuring sleep quality. In this paper, we propose a novel attention-based deep learning architecture called AttnSleep to classify sleep stages using single-channel EEG signals. This architecture starts with a feature extraction module based on a multi-resolution convolutional neural network (MRCNN) and adaptive feature recalibration (AFR).

Background: A large body of evidence shows that miRNAs regulate the expression of their target genes at the post-transcriptional level and that the dysregulation of miRNAs is related to many complex human diseases. Accurately discovering disease-related miRNAs is conducive to exploring the pathogenesis and treatment of diseases. However, because experimental methods are time-consuming and expensive, predicting miRNA-disease associations with computational models has become a more economical and effective approach.

Molecular components with functional interdependencies in human cells form complicated biological networks. Diseases are mostly caused by perturbations of groups of interacting biomolecules, rather than by an abnormality of a single biomolecule. Furthermore, new biological functions and processes could be revealed by discovering novel relationships between biological entities.

Background: The evolution of influenza A viruses leads to antigenic changes. Serological diagnosis of antigenicity is usually labor-intensive, time-consuming and not suitable for early-stage detection. Computational prediction of the antigenic relationship between emerging and old strains of influenza viruses using viral sequences can facilitate large-scale antigenic characterization, especially for those viruses requiring high biosafety facilities, such as H5 and H7 influenza A viruses.

Background: Predicting disease-causative genes (or simply, disease genes) plays a critical role in understanding the genetic basis of human diseases and in providing disease treatment guidelines. Various computational methods have been proposed for disease gene prediction, and the recent increase in available biological information for genes provides strong motivation to leverage these data sources and extract useful information for accurately predicting disease genes.

Results: We present an integrative framework called N2VKO to predict disease genes.

Therapeutic effects of drugs are mediated via interactions between them and their intended targets. As such, prediction of drug-target interactions is of great importance. Drug-target interaction prediction is especially relevant in the case of drug repositioning where attempts are made to repurpose old drugs for new indications.

Combinatorial therapy may reduce drug side effects and improve drug efficacy, making combination therapy a promising strategy to treat complex diseases. However, in the existing computational methods, the natural properties and network knowledge of drugs have not been adequately and simultaneously considered, making it difficult to identify effective drug combinations. Computational methods that incorporate multiple sources of information (biological, chemical, pharmacological, and network knowledge) offer more opportunities to screen synergistic drug combinations.

Background: Influenza viruses are undergoing continuous and rapid evolution. The fatal influenza A/H7N9 has drawn attention since the first wave of infections in March 2013, and raised more grave concerns with its increased potential to spread among humans. Experimental studies have revealed several host and virulence markers, indicating differential host binding preferences which can help estimate the potential of causing a pandemic.

Computational prediction of drug-target interactions (DTIs) has become an essential task in the drug discovery process. It narrows down the search space for interactions by suggesting potential interaction candidates for validation via wet-lab experiments that are well known to be expensive and time-consuming. In this article, we aim to provide a comprehensive overview and empirical evaluation on the computational DTI prediction techniques, to act as a guide and reference for our fellow researchers.

HIV polyprotein Gag is increasingly found to contribute to protease inhibitor resistance. Despite its role in viral maturation and in developing drug resistance, there remain gaps in the knowledge of the role of certain Gag subunits (e.g.

"Virtual Screening" is a common step of in silico drug design, where researchers screen a large library of small molecules (ligands) for interesting hits, in a process known as "Docking". However, docking is a computationally intensive and time-consuming process, usually restricted to small binding sites (pockets) and a small number of interacting residues. When the target site is not known (blind docking), researchers split the docking box into multiple boxes, or repeat the search several times using different seeds, and then merge the results manually.
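
The manual box-splitting step described above can be pictured with a small sketch. It assumes an axis-aligned search box given by a center and an edge-length triple, and cuts it into equal, non-overlapping sub-boxes; the function name `split_box` and the 40-unit example box are hypothetical, and real workflows may prefer overlapping boxes so that pockets on a boundary are not missed.

```python
from itertools import product


def split_box(center, size, parts):
    """Split an axis-aligned box into parts**3 equal sub-boxes.

    center and size are (x, y, z) tuples; returns a list of
    (sub_center, sub_size) pairs. Hypothetical helper for illustration.
    """
    sub_size = tuple(s / parts for s in size)
    # Corner of the full box with the smallest x, y and z coordinates.
    origin = tuple(c - s / 2 for c, s in zip(center, size))
    boxes = []
    for idx in product(range(parts), repeat=3):
        sub_center = tuple(
            o + (i + 0.5) * ss for o, i, ss in zip(origin, idx, sub_size)
        )
        boxes.append((sub_center, sub_size))
    return boxes


# Example: a 40 x 40 x 40 box around the origin, cut in two along each axis.
boxes = split_box((0.0, 0.0, 0.0), (40.0, 40.0, 40.0), 2)
print(len(boxes), boxes[0])
```

Each sub-box would then be docked separately and the resulting poses merged, which is the manual bookkeeping the abstract describes.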

Experimental prediction of drug-target interactions is expensive, time-consuming and tedious. Fortunately, computational methods help narrow down the search space for interaction candidates to be further examined via wet-lab techniques. Nowadays, the number of attributes/features for drugs and targets, as well as the amount of their interactions, are increasing, making these computational methods inefficient or occasionally prohibitive.

Background: Multiple computational methods for predicting drug-target interactions have been developed to facilitate the drug discovery process. These methods use available data on known drug-target interactions to train classifiers with the purpose of predicting new undiscovered interactions. However, a key challenge regarding this data that has not yet been addressed by these methods, namely class imbalance, is potentially degrading the prediction performance.

Background: The human influenza viruses undergo rapid evolution (especially in hemagglutinin (HA), a glycoprotein on the surface of the virus), which enables the virus population to constantly evade the human immune system. Therefore, the vaccine has to be updated every year to stay effective. There is a need to characterize the evolution of influenza viruses for better selection of vaccine candidates and the prediction of pandemic strains.

Ligand binding pockets in proteins contain water molecules, which play important roles in modulating protein-ligand interactions. Available crystallographic data for the 5' mRNA cap-binding pocket of the translation initiation factor protein eIF4E shows several structurally conserved waters, which also persist in molecular dynamics simulations. These waters engage an intricate hydrogen-bond network between the cap and protein.

Experimental determination of drug-target interactions is expensive and time-consuming. Therefore, there is a continuous demand for more accurate predictions of interactions using computational techniques. Algorithms have been devised to infer novel interactions on a global scale where the input to these algorithms is a drug-target network (i.

Interindividual variability in the epigenome has gained tremendous attention for its potential in pathophysiological investigation, disease diagnosis, and evaluation of clinical intervention. DNA methylation is the most studied epigenetic mark in epigenome-wide association studies (EWAS) as it can be detected from limited starting material. The Infinium 450K methylation array is the most popular platform for high-throughput profiling of this mark in clinical samples, as it is cost-effective and requires small amounts of DNA.

De novo clustering is a popular technique to perform taxonomic profiling of a microbial community by grouping 16S rRNA amplicon reads into operational taxonomic units (OTUs). In this work, we introduce a new dendrogram-based OTU clustering pipeline called CRiSPy. The key idea used in CRiSPy to improve clustering accuracy is the application of an anomaly detection technique to obtain a dynamic distance cutoff instead of using the de facto value of 97 percent sequence similarity as in most existing OTU clustering pipelines.
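
To make the contrast between a fixed and a data-driven cutoff concrete, here is a minimal sketch. It does not reproduce the anomaly detection technique used in CRiSPy, which the abstract does not specify; as a stand-in, it treats the largest jump between consecutive dendrogram merge distances as the anomaly and cuts in the middle of that jump. The names `FIXED_CUTOFF` and `dynamic_cutoff` and the example distances are invented for illustration.

```python
# Conventional fixed cutoff: 97 percent similarity, i.e. a distance of 0.03.
FIXED_CUTOFF = 0.03


def dynamic_cutoff(merge_distances):
    """Return a cutoff in the middle of the largest gap between merges.

    merge_distances are the heights at which a dendrogram merges
    clusters. A large jump suggests that merges above it join distinct
    groups. This gap rule is an assumed stand-in, not CRiSPy's method.
    """
    d = sorted(merge_distances)
    if len(d) < 2:
        raise ValueError("need at least two merge distances")
    # Pair each gap with its index so max() also tells us where it is.
    gap, i = max((d[k + 1] - d[k], k) for k in range(len(d) - 1))
    return (d[i] + d[i + 1]) / 2


# Invented merge heights: four tight merges, then two much looser ones.
heights = [0.01, 0.02, 0.015, 0.03, 0.12, 0.15]
cutoff = dynamic_cutoff(heights)
print(FIXED_CUTOFF, cutoff)
```

With these invented heights, the data-driven cutoff falls between the tight merges and the loose ones, whereas the fixed 0.03 cutoff sits exactly on the last tight merge.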

Motivation: The need for efficient molecular docking tools for high-throughput screening is growing alongside the rapid growth of drug-fragment databases. AutoDock Vina ('Vina') is a widely used docking tool with parallelization for speed. QuickVina ('QVina 1') then further enhanced the speed via a heuristic, requiring high exhaustiveness.

Background: Protein-protein docking is an in silico method to predict the formation of protein complexes. Due to limited computational resources, the protein-protein docking approach has been developed under the assumption of rigid docking, in which one of the two protein partners remains rigid during protein association and the water contribution is ignored or represented implicitly. Despite producing a number of acceptable complex predictions, most initial rigid docking algorithms to date still find it difficult, or even fail, to discriminate the correct predictions from incorrect or false-positive ones.
