Publications by authors named "Tu Bao Ho"

Sentiments associated with assessments and observations recorded in a clinical narrative can often indicate a patient's health status. To perform sentiment analysis on clinical narratives, domain-specific knowledge concerning meanings of medical terms is required. In this study, semantic types in the Unified Medical Language System (UMLS) are exploited to improve lexicon-based sentiment classification methods.

The massive quantity of clinical text in electronic medical records (EMRs) has created significant demand for clinical text processing and information extraction in health care and medical research. Detailed clinical observations of patients are typically recorded chronologically. Temporal information in such clinical texts consists of three elements: temporal expressions, temporal events, and temporal relations.

Background: As more and more researchers turn to big data for new opportunities for biomedical discovery, machine learning models, the backbone of big data analysis, are mentioned more often in biomedical journals. However, owing to the inherent complexity of machine learning methods, they are prone to misuse. Because of the flexibility in specifying machine learning models, the results are often insufficiently reported in research articles, hindering reliable assessment of model validity and consistent interpretation of model outputs.

Background: Many factors directly or indirectly cause adverse drug reactions (ADRs), ranging from pharmacological, immunological, and genetic factors to ethnic, age, gender, and social factors, as well as drug- and disease-related ones. At the same time, advanced methods of statistics, machine learning, and data mining allow users to analyze the data more effectively for descriptive and predictive purposes. The rapid changes in this field make it difficult to follow the research progress and context of ADR detection and prediction.

Background: Short interfering RNAs (siRNAs) can knock down target genes and thus have an immense impact on biology and pharmacy research. The key question in siRNA research, namely which siRNAs have high knockdown ability, remains challenging, as currently known results still fall far short of expectations.

Results: This work aims to develop a generic framework to enhance siRNA knockdown efficacy prediction.

We develop a method that combines data mining and first-principles calculation to guide the design of distorted cubane Mn(4+)Mn3(3+) single-molecule magnets. The essential idea of the method is a process consisting of sparse regressions and cross-validation for analyzing calculated data of the materials. The method allows us to demonstrate that the exchange coupling between Mn(4+) and Mn(3+) ions can be predicted with high accuracy by a linear regression model from the electronegativities of the constituent ligands and the structural features of the molecule.

Antarctic bacterium antifreeze proteins (AFPs) protect and support the survival of cold-adapted organisms by binding to ice crystals and inhibiting their growth. The mechanism by which Antarctic bacterium AFPs prevent freezing in a water environment at low temperature remains unclear. In this research, we study the effects of Antarctic bacterium AFPs using coarse-grained simulations in solution over a temperature range of 262 to 273 K.

Eukaryotic gene transcription is a complex process that requires the orchestrated recruitment of a large number of proteins, such as sequence-specific DNA-binding factors, chromatin remodelers and modifiers, and general transcription machinery, to regulatory regions. Previous work has shown that these regulatory proteins favor specific organizational themes along promoters. Details about how they cooperatively regulate the transcriptional process, however, remain unclear.

Objective: Predicting or prioritizing the human genes that cause disease, or "disease genes", is one of the emerging tasks in biomedical informatics. Research on network-based approaches to this problem rests on the key assumption that "the network-neighbour of a disease gene is likely to cause the same or a similar disease", and mostly applies supervised learning methods to data on well-known disease genes. This work aims to find an effective method that exploits the disease gene neighbourhood and integrates several useful omics data sources, which could potentially enhance disease gene prediction.

Background: The nucleosome, the fundamental unit of chromatin, is formed by wrapping approximately 147 bp of DNA around an octamer of histone proteins. This histone core has many variants that differ from one another in their biochemical composition as well as their biological function. Although the deposition of histone variants onto chromatin has been implicated in many important biological processes, such as transcription and replication, the mechanisms by which they are deposited on target sites are still obscure.

MicroRNAs (miRNAs) are small non-coding RNAs that regulate gene expression at the post-transcriptional level. They play an important role in several biological processes such as cell development and differentiation. Similar to transcription factors (TFs), miRNAs regulate gene expression in a combinatorial fashion, i.

Background: Eukaryotic genomes are packaged into chromatin, a compact structure containing fundamental repeating units, the nucleosomes. The mobility of nucleosomes plays important roles in many DNA-related processes by regulating the accessibility of regulatory elements to biological machineries. Although it has been known that various factors, such as DNA sequences, histone modifications, and chromatin remodelling complexes, could affect nucleosome stability, the mechanisms of how they regulate this stability are still unclear.

Background: MicroRNAs (miRNAs) are a class of small non-coding RNA molecules (20-24 nt), which are believed to participate in repression of gene expression. They play important roles in several biological processes (e.g.

Protein-protein interactions (PPIs) are intrinsic to almost all cellular processes. Different computational methods offer new opportunities to study PPIs. To predict PPIs, integrative methods use multiple data sources instead of a single source, whereas domain-based methods often use only protein domain features.

User-centered universal tools for analyzing laboratory data by data mining have not been available in medicine. We analyzed 1,565,877 laboratory data points from 771 patients with viral hepatitis in order to find differences in the temporal changes of laboratory test data between Hepatitis B and Hepatitis C, using a combination of temporal abstraction and data mining. The data for a single patient span more than 5 years.

The objective of this paper is twofold. One objective is to present a method of predicting signaling domain-domain interactions (signaling DDI) using inductive logic programming (ILP), and the other is to present a method of discovering signal transduction networks (STN) using signaling DDI. Research on computational methods for discovering STN has received much attention because of the importance of STN in transmitting inter- and intra-cellular signals.

Eukaryotic genomes are packaged by the wrapping of DNA around histone octamers to form nucleosomes. Nucleosome occupancy, acetylation, and methylation, which have a major impact on all nuclear processes involving DNA, have been recently mapped across the yeast genome using chromatin immunoprecipitation and DNA microarrays. However, this experimental protocol is laborious and expensive.

The high generalization ability of support vector machines (SVMs) has been shown in many practical applications; however, they are considerably slower in the test phase than other learning approaches because of the possibly large number of support vectors in their solution. In this letter, we describe a method to reduce this number of support vectors. The reduction process iteratively selects the two nearest support vectors belonging to the same class and replaces them with a newly constructed one.
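
The iterative reduction described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the replacement vector is constructed, so the sketch assumes a coefficient-weighted mean of the two merged vectors and sums their coefficients. The function names `merge_nearest_pair` and `reduce_support_vectors`, the Euclidean distance measure, and the `target` stopping rule are all hypothetical choices made for illustration.

```python
import math


def merge_nearest_pair(vectors, coeffs):
    """Merge the two closest vectors (Euclidean distance) of one class.

    Assumption: the new vector is the coefficient-weighted mean of the pair
    and its coefficient is the sum of the two. The abstract does not specify
    how the replacement vector is actually constructed.
    """
    best = None
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            d = math.dist(vectors[i], vectors[j])
            if best is None or d < best[0]:
                best = (d, i, j)
    _, i, j = best
    total = coeffs[i] + coeffs[j]
    merged = tuple(
        (coeffs[i] * a + coeffs[j] * b) / total
        for a, b in zip(vectors[i], vectors[j])
    )
    keep = [k for k in range(len(vectors)) if k not in (i, j)]
    new_vectors = [vectors[k] for k in keep] + [merged]
    new_coeffs = [coeffs[k] for k in keep] + [total]
    return new_vectors, new_coeffs


def reduce_support_vectors(vectors, coeffs, target):
    """Repeat the merge until only `target` vectors remain (at least one).

    Call this once per class so that only same-class vectors are merged.
    Coefficients are assumed positive within a class.
    """
    vectors, coeffs = list(vectors), list(coeffs)
    while len(vectors) > max(target, 1):
        vectors, coeffs = merge_nearest_pair(vectors, coeffs)
    return vectors, coeffs


# Hypothetical support vectors of a single class with their coefficients.
reduced, weights = reduce_support_vectors(
    [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0)], [1.0, 1.0, 2.0], target=2
)
```

Each merge scans all pairs, so the sketch is quadratic per step; it is meant only to make the select-and-replace loop concrete, not to reproduce the accuracy-preserving construction that a real simplification method would need.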

Motivation: Even in a simple organism like the yeast Saccharomyces cerevisiae, transcription is an extremely complex process. The expression of sets of genes can be turned on or off by the binding of specific transcription factors to the promoter regions of genes. Experimental and computational approaches have been proposed to establish mappings of DNA-binding locations of transcription factors.

Tight turns have long been recognized as one of the three important features of proteins, together with the alpha-helix and beta-sheet. Tight turns play an important role in globular proteins from both the structural and functional points of view. More than 90% of tight turns are beta-turns, and most of the rest are gamma-turns.

The tight turn has long been recognized as one of the three important features of proteins, after the alpha-helix and beta-sheet. Tight turns play an important role in globular proteins from both the structural and functional points of view. More than 90% of tight turns are beta-turns.

In eukaryotes, gene expression is controlled by various transcription factors that bind to the promoter regions. Transcription factors may act positively, negatively or not at all. Different combinations of them may also activate or repress gene expression, and form regulatory networks of transcription.

In this paper, we propose a graph-based method to measure the similarity between chemical compounds described in 2D form. Our main idea is to measure the similarity between two compounds based on the edges, nodes, and connectivity of their common subgraphs. We applied the proposed similarity measure in combination with a clustering method to more than eleven thousand compounds in the chemical compound database KEGG/LIGAND and discovered that clusters of structurally highly similar compounds share common names, take part in the same pathways, and require the same enzymes in reactions.
