Publications by authors named "Hon Nian Chua"

Article Synopsis
  • Introns in the 5' untranslated regions (5'UTRs) are present in about 35% of human transcripts, but certain genes, like those coding for secreted and membrane-bound proteins, are less likely to have them.
  • A classifier developed for this study predicts the status of 5'UTR introns with over 80% accuracy based on early coding region sequences, identifying a group of transcripts termed "5IM" that show distinct characteristics.
  • Transcripts in the 5IM class, which make up around 20% of human transcripts, exhibit specific features like non-AUG start codons, enriched secondary structures, greater reliance on translation factors, and unique binding patterns, indicating a

Background: The human habitat is a host where microbial species function and continue to evolve. Elucidating how microbial communities respond to human habitats is a fundamental and critical task, as establishing baselines of the human microbiome is essential to understanding its role in human disease and health. Recent studies of the healthy human microbiome focus on particular body habitats, assuming that microbiomes develop similar structural patterns to perform similar ecosystem functions under the same environmental conditions.

Threatened preterm labor (TPTL) accounts for ∼30% of pregnancy-related hospital admissions. Maternal peripheral leukocytes can be used to monitor a variety of physiological processes occurring in the body. Two high-throughput mass spectrometry methodologies, SWATH and iTRAQ, were used to study differentially expressed peripheral blood leukocyte lysate proteins in symptomatic women admitted for TPTL who had a preterm birth within 48 h (n = 16) and those who did not (n = 24).

Protein-protein interactions (PPIs) are important for understanding the cellular mechanisms of biological functions, but the reliability of PPIs extracted by high-throughput assays is known to be low. To address this, many current methods use multiple pieces of evidence from different sources of information to compute reliability scores for such PPIs. However, they often combine the evidence without taking into account the uncertainty of the evidence values, potential dependencies between the information sources used, and missing values from some information sources.

Threatened preterm labor (TPTL) is defined as persistent premature uterine contractions between 20 and 37 weeks of gestation and is the most common condition that requires hospitalization during pregnancy. Most women with TPTL continue their pregnancies to term, while only an estimated 5% will deliver a premature baby within ten days. The aim of this work was to study differential whole blood gene expression associated with spontaneous preterm birth (sPTB) within 48 hours of hospital admission.

An increasing number of genes have been experimentally confirmed in recent years as causative genes for various human diseases. This newly available knowledge can be exploited by machine learning methods to discover additional unknown genes that are likely to be associated with diseases. In particular, positive unlabeled learning (PU learning) methods, which require only a positive training set P (confirmed disease genes) and an unlabeled set U (the unknown candidate genes) instead of a negative training set N, have been shown to be effective in uncovering new disease genes in the current scenario.
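
To make the P/U setup above concrete, here is a minimal sketch of a naive PU-style ranking. It is not the method from the article: the centroid-distance scoring, the function names, and the toy feature vectors are all assumptions chosen for illustration. Each unlabeled gene is scored by how much closer it lies to the centroid of P than to the centroid of U.

```python
from math import sqrt


def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_unlabeled(positives, unlabeled):
    """Rank unlabeled genes, most disease-like first.

    Score = distance to the U centroid minus distance to the P centroid,
    so genes that resemble the confirmed positives score highest.
    This is an illustrative heuristic, not a published algorithm.
    """
    p_center = centroid(list(positives.values()))
    u_center = centroid(list(unlabeled.values()))
    scores = {
        gene: distance(vec, u_center) - distance(vec, p_center)
        for gene, vec in unlabeled.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: two features per gene.
positives = {"geneA": [0.9, 0.8], "geneB": [0.8, 0.9]}
unlabeled = {"geneX": [0.85, 0.85], "geneY": [0.1, 0.2], "geneZ": [0.2, 0.1]}

ranking = rank_unlabeled(positives, unlabeled)
print(ranking[0])  # the unlabeled gene most similar to the positive set
```

No negative set N is needed here; the unlabeled set only supplies a background centroid. Real PU learning methods are considerably more careful about the hidden positives inside U.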

Comprehensive functional annotation of vertebrate genomes is fundamental to biological discovery. Reverse genetic screening has been highly useful for determination of gene function, but is untenable as a systematic approach in vertebrate model organisms given the number of surveyable genes and observable phenotypes. Unbiased prediction of gene-phenotype relationships offers a strategy to direct finite experimental resources towards likely phenotypes, thus maximizing de novo discovery of gene functions.

Background: Protein complexes participate in many important cellular functions, so finding the set of existent complexes is essential for understanding the organization and regulation of processes in the cell. With the availability of large amounts of high-throughput protein-protein interaction (PPI) data, many algorithms have been proposed to discover protein complexes from PPI networks. However, such approaches are hindered by the high rate of noise in high-throughput PPI data, including spurious and missing interactions.

The body of human genomic and proteomic evidence continues to grow at ever-increasing rates, while annotation efforts struggle to keep pace. A surprisingly small fraction of human genes have clear, documented associations with specific functions, and new functions continue to be found for characterized genes. Here we assembled an integrated collection of diverse genomic and proteomic data for 21,341 human genes and make quantitative associations of each to 4333 Gene Ontology terms.

Background: Protein complexes are important for understanding principles of cellular organization and functions. With the availability of large amounts of high-throughput protein-protein interactions (PPI), many algorithms have been proposed to discover protein complexes from PPI networks. However, existing algorithms generally do not take into consideration the fact that not all the interactions in a PPI network take place at the same time.

Drug synergy allows a therapeutic effect to be achieved with lower doses of component drugs. Drug synergy can result when drugs target the products of genes that act in parallel pathways ('specific synergy'). Such cases of drug synergy should tend to correspond to synergistic genetic interaction between the corresponding target genes.

Computational systems biology is empowering the study of drug action. Studies on biological effects of chemical compounds have increased in scale and accessibility, allowing integration with other large-scale experimental data types. Here, we review computational approaches for elucidating the mechanisms of both intended and undesirable effects of drugs, with the collective potential to change the nature of drug discovery and pharmacological therapy.

Article Synopsis
  • In higher eukaryotes, mRNAs are exported from the nucleus to the cytoplasm, primarily through splicing-related factors at the 5' end, but an alternative pathway (ALREX) exists that doesn’t require splicing.
  • The study proposes that the choice of the mRNA export pathway is influenced by the presence of upstream elements like introns or signal sequences (SSCRs) in the transcript.
  • It also found that nuclear-encoded mitochondrial genes can utilize the ALREX pathway and identified common motifs in both SSCRs and mitochondrial-targeting sequence coding regions (MSCRs), indicating a shared regulatory mechanism for mRNA export among different gene types.

Background: Protein-protein interactions (PPIs) play important roles in various cellular processes. However, the low quality of current PPI data detected from high-throughput screening techniques has diminished the potential usefulness of the data. We need to develop a method to address the high data noise and incompleteness of PPI data, namely, to filter out inaccurate protein interactions (false positives) and predict putative protein interactions (false negatives).

Motivation: Protein complexes are important for understanding principles of cellular organization and function. High-throughput experimental techniques have produced a large amount of protein interactions, which makes it possible to predict protein complexes from protein-protein interaction (PPI) networks. However, protein interaction data produced by high-throughput experiments are often associated with high false positive and false negative rates, which makes it difficult to predict complexes accurately.

The protein-protein subnetwork prediction challenge presented at the 2nd Dialogue for Reverse Engineering Assessments and Methods (DREAM2) conference is an important computational problem essential to proteomic research. Given a set of proteins from the Saccharomyces cerevisiae (baker's yeast) genome, the task is to rank all possible interactions between the proteins from the most likely to the least likely. To tackle this task, we adopt a graph-based strategy to combine multiple sources of biological data and computational predictions.

Protein interactions are crucial components of all cellular processes. An in-depth knowledge of the full complement of protein interactions in a cell, therefore, provides insight into the structure, properties and functions of the cell and its components. An accurate and comprehensive protein interaction network is, thus, an invaluable framework to study protein regulation in disease.

Protein complexes are fundamental for understanding principles of cellular organization. As the sizes of protein-protein interaction (PPI) networks are increasing, accurate and fast protein complex prediction from these PPI networks can serve as a guide for biological experiments to discover novel protein complexes. However, it is not easy to predict protein complexes from PPI networks, especially in situations where the PPI network is noisy and still incomplete.

Motivation: With the increasing availability of diverse biological information, protein function prediction approaches have converged towards integration of heterogeneous data. Many have adapted existing techniques, such as machine-learning and probabilistic methods, which have proven successful on specific data types. However, the impact of these approaches is hindered by a couple of factors.

Protein complexes are fundamental for understanding principles of cellular organization. Accurate and fast protein complex prediction from PPI networks of increasing size can serve as a guide for biological experiments to discover novel protein complexes. However, protein complex prediction from PPI networks is a hard problem, especially in situations where the PPI network is noisy.

Background: Protein-protein interaction has been used to complement traditional sequence homology to elucidate protein function. Most existing approaches only make use of direct interactions to infer function, and some have studied the application of indirect interactions for functional inference but are unable to improve prediction performance. We have previously proposed an approach, FS-Weighted Averaging, which uses topological weighting and level-2 indirect interactions (protein pairs connected via two interactions) for predicting protein function from protein interactions and have found that it yields predictions with superior precision on yeast proteins over existing approaches.
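
As an illustration of the level-2 indirect interactions mentioned above (protein pairs connected via two interactions), the sketch below enumerates them from a list of direct interactions. It covers only the neighbourhood lookup, not FS-Weighted Averaging or its topological weighting; the function names and the toy interaction list are hypothetical.

```python
from collections import defaultdict


def build_graph(interactions):
    """Build an undirected adjacency map from (protein, protein) pairs."""
    graph = defaultdict(set)
    for a, b in interactions:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def level2_neighbours(graph, protein):
    """Proteins reachable from `protein` via exactly two interactions.

    The query protein itself is excluded. A protein may be both a
    level-1 and a level-2 neighbour; this sketch does not filter those.
    """
    level2 = set()
    for neighbour in graph.get(protein, set()):
        level2 |= graph.get(neighbour, set())
    level2.discard(protein)
    return level2


# Hypothetical toy network.
interactions = [
    ("P1", "P2"),
    ("P2", "P3"),
    ("P1", "P4"),
    ("P4", "P3"),
    ("P3", "P5"),
]
graph = build_graph(interactions)

print(sorted(graph["P1"]))                    # level-1 neighbours of P1
print(sorted(level2_neighbours(graph, "P1")))  # level-2 neighbours of P1
```

In this toy network P1 and P3 never interact directly, yet they share two interaction partners (P2 and P4), which is the kind of indirect relationship a level-2 approach can draw on.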

High-throughput experimental methods, such as yeast-two-hybrid and phage display, have fairly high levels of false positives (and false negatives). Thus the list of protein-protein interactions detected by such experiments would need additional wet laboratory validation. It would be useful if the list could be prioritized in some way.

Motivation: Most approaches to predicting protein function from protein-protein interaction data utilize the observation that a protein often shares functions with proteins that interact with it (its level-1 neighbours). However, proteins that interact with the same proteins (i.e.
