Publications by authors named "Etsuji Tomita"

Many kinds of tree-structured data, such as RNA secondary structures, have become available due to the progress of techniques in the field of molecular biology. To analyze the tree-structured data, various measures for computing the similarity between them have been developed and applied. Among them, tree edit distance is one of the most widely used measures.

Background: Measuring similarity between tree-structured data is important for the analysis of RNA secondary structures, phylogenetic trees, glycan structures, and vascular trees. The edit distance is one of the most widely used measures for comparing tree-structured data. However, computing the edit distance for rooted unordered trees is known to be NP-hard.

Background: Progress in the life sciences cannot be made without integrating biomedical knowledge on numerous genes in order to help formulate hypotheses on the genetic mechanisms behind various biological phenomena, including diseases. There is thus a strong need for a way to automatically and comprehensively search biomedical databases for related genes, such as genes in the same families and genes encoding components of the same pathways. Here we address the extraction of related genes by searching for densely connected subgraphs, which are modeled as cliques, in a biomedical relational graph.

In this paper, we present several methods for computing a solution to the protein side chain packing problem. All of the methods share a common approach: breaking the polymer into subpolymers and using maximum edge weight cliques to prune the search space for the optimal side chain packing. We characterize the graph sizes generated by each method and compare their prediction accuracies. These methods are demonstrated on proteins of up to approximately 8000 residues.

With the advent of experimental technologies like chemical cross-linking, it has become possible to obtain distances between specific residues of a newly sequenced protein. These types of experiments are usually less time-consuming than X-ray crystallography or NMR. Consequently, it is highly desirable to develop a method that incorporates this distance information to improve the performance of protein threading methods.

"Protein Side-chain Packing" has an ever-increasing range of applications in the field of bioinformatics, from the early methods of homology modeling to protein design and protein docking. However, this problem is known to be NP-hard. We have therefore developed a novel approach to solving it using the notion of a maximum edge-weight clique.

We developed maximum clique-based algorithms for spot matching in two-dimensional gel electrophoresis images, protein structure alignment, and protein side-chain packing, all of which are known to be NP-hard problems. Algorithms based on direct reductions to the maximum clique problem can find optimal solutions for instances of size (the number of points or residues) up to 50-150 using a standard PC. We also developed pre-processing techniques to reduce the sizes of graphs.
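
To make the idea of solving a problem by reduction to maximum clique more concrete, here is a minimal sketch of a generic branch-and-bound maximum clique search. It is not the algorithm from the papers listed here; the function name `max_clique` and the adjacency-set input format are assumptions made purely for illustration.

```python
from typing import Dict, List, Set


def max_clique(adj: Dict[int, Set[int]]) -> List[int]:
    """Return one maximum clique of an undirected graph.

    `adj` maps each vertex to the set of its neighbours (hypothetical
    input format chosen for this sketch; no self-loops expected).
    """
    best: List[int] = []

    def expand(current: List[int], candidates: Set[int]) -> None:
        nonlocal best
        # `current` is always a clique; record it if it is the largest so far.
        if len(current) > len(best):
            best = list(current)
        while candidates:
            # Bound: even adding every remaining candidate cannot beat `best`.
            if len(current) + len(candidates) <= len(best):
                return
            v = candidates.pop()
            # Branch: extend the clique with v; only common neighbours remain.
            expand(current + [v], candidates & adj[v])

    expand([], set(adj))
    return best


if __name__ == "__main__":
    # Small example graph: a triangle 1-2-3 with a tail 3-4-5.
    graph = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4}}
    print(sorted(max_clique(graph)))
```

The simple size bound used here prunes far less than the colouring-based bounds found in practical solvers, so this sketch is only suitable for small graphs.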
