Information about the interface sites of protein-protein interactions (PPIs) is useful in many areas of biological research. However, despite advances in experimental techniques, the identification of PPI sites remains a challenging task. Using a statistical learning technique, we proposed a computational tool for predicting PPI sites.
The type information of un-annotated membrane proteins provides an important hint for their biological functions. The experimental determination of membrane protein types, despite being more accurate and reliable, is not always feasible due to the costly laboratory procedures, thereby creating a need for the development of bioinformatics methods. This article describes a novel computational classifier for the prediction of membrane protein types using proteins' sequences.
Background: MicroRNAs (miRNAs) are small non-coding RNAs affecting the expression of target genes via translational repression or mRNA degradation mechanisms. With the increasing availability of mRNA and miRNA expression data, it might be possible to assess functional targets using the fact that a miRNA might down-regulate its target mRNAs. In this work we computed the correlation of expression profiles between miRNAs and target mRNAs using the NCI-60 expression data.
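As a rough illustration of correlating a miRNA profile with a candidate target's mRNA profile, here is a minimal sketch. It assumes Pearson correlation, which the abstract does not specify. The function name `pearson` and the toy profiles are hypothetical, and the numbers are not NCI-60 data.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations and the two sums of squared deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has undefined correlation")
    return cov / sqrt(var_x * var_y)

# Hypothetical toy profiles across five cell lines (illustrative only).
mirna_profile = [1.0, 2.0, 3.0, 4.0, 5.0]
mrna_profile = [10.0, 8.0, 6.0, 4.0, 2.0]
r = pearson(mirna_profile, mrna_profile)
print(r)
```

A strongly negative coefficient would be consistent with the miRNA down-regulating that mRNA, although correlation alone does not establish a direct target relationship.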
BMC Bioinformatics
December 2008
Background: MicroRNAs (miRNAs) are a set of small non-coding RNAs serving as important negative gene regulators. In animals, miRNAs turn down protein translation by binding to the 3' UTR regions of target genes with imperfect complementary pairing. The identification of microRNA targets has become one of the major challenges of miRNA research.
Determining if missense mutations are deleterious is critical for the analysis of genes implicated in disease. However, the mutational effects of many missense mutations in databases like the Breast Cancer Information Core are unclassified. Several approaches have emerged recently to determine such mutational effects but none have utilized amino acid property indices.
Remote homology detection refers to the detection of structural homology in evolutionarily related proteins with low sequence similarity. Supervised learning algorithms such as support vector machines (SVMs) are currently the most accurate methods. In most of these SVM-based methods, efforts have been dedicated to developing new kernels to better use the pairwise alignment scores or sequence profiles.
As in other cancers, aberrant gene regulation features prominently in hepatocellular carcinoma (HCC). MicroRNAs (miRNAs) were recently found to regulate gene expression at the post-transcriptional/translational levels. The expression profiles of 157 miRNAs were examined in 19 HCC patients, and 19 up-regulated and 3 down-regulated miRNAs were found to be associated with HCC.
Identifying a protein's subcellular localization is an important step toward understanding its function. However, the experimental work involved is usually laborious, time-consuming and costly. Computational prediction hence becomes valuable to reduce the inefficiency.
P53 is probably the most important tumor suppressor known. Over the years, information about this gene has increased dramatically. We have built a comprehensive knowledgebase of p53, which aims to help wet-lab biologists formulate their experiments, newcomers learn what they need about the gene, and bioinformaticians make new discoveries through data analysis.
Physicochemical properties of amino acids are important factors in determining protein structure and function. Most approaches make use of averaged properties over entire domains or even proteins to analyze their structure or function. This level of coarseness tends to hide the richness of the variability in the different properties across functional domains.
Background: The advent of genotype data from large-scale efforts that catalog the genetic variants of different populations has given rise to new avenues for multifactorial disease association studies. Recent work shows that genotype data from the International HapMap Project have a high degree of transferability to the wider population. This implies that the design of genotyping studies on local populations may be facilitated through inferences drawn from information contained in HapMap populations.
Background: Recent advances in human genome sequencing and genotyping have revealed millions of single nucleotide polymorphisms (SNPs), which determine the variation among human beings. One particularly important project is the International HapMap Project, which provides a catalogue of human genetic variation for disease association studies. In this paper, we analyzed the genotype data in the HapMap project by using National Institute of Environmental Health Sciences Environmental Genome Project (NIEHS EGP) SNPs.
CMDWave (Conserved Motif Detection using WAVElets) is a web server that predicts conserved motifs in protein sequences. A set of query protein sequences is first aligned using ClustalW to obtain equal-sized sequences. CMDWave then converts the sequences into a numerical representation using electron-ion interaction potential (EIIP).
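The EIIP conversion step can be sketched as a simple per-residue lookup. This is a minimal illustration, not CMDWave's actual code. The EIIP values below are the ones commonly tabulated in the literature and should be checked against a primary source before use. The function name `encode_eiip` and the choice to map alignment gaps to 0.0 are assumptions.

```python
# Commonly tabulated EIIP values per amino acid (verify before relying on them).
EIIP = {
    "L": 0.0000, "I": 0.0000, "N": 0.0036, "G": 0.0050, "V": 0.0057,
    "E": 0.0058, "P": 0.0198, "H": 0.0242, "K": 0.0371, "A": 0.0373,
    "Y": 0.0516, "W": 0.0548, "Q": 0.0761, "M": 0.0823, "S": 0.0829,
    "C": 0.0829, "T": 0.0941, "F": 0.0946, "R": 0.0959, "D": 0.1263,
}

def encode_eiip(aligned_seq, gap_value=0.0):
    """Map an aligned protein sequence to a list of EIIP values.

    Gap characters ('-') become gap_value; this is an assumption, since the
    abstract does not say how gaps are handled. Unknown residue codes raise
    KeyError.
    """
    return [gap_value if aa == "-" else EIIP[aa] for aa in aligned_seq.upper()]

# Hypothetical aligned fragment with one gap column.
signal = encode_eiip("MK-LD")
print(signal)
```

Because the sequences are aligned to equal length first, every encoded signal has the same number of samples, which is what a position-wise wavelet analysis would need.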
In humans, an estimated 35-60% of genes are alternatively spliced. A large number of genes also show alternative initiation or termination. Regulation of these processes is still poorly understood.
Background: Predicting the subcellular localization of proteins is important for determining the function of proteins. Previous work on predicting protein localization in Gram-negative bacteria obtained good results. However, these methods had relatively low accuracies for the localization of extracellular proteins.
Background: Protein subcellular localization is an important determinant of protein function and hence, reliable methods for prediction of localization are needed. A number of prediction algorithms have been developed based on amino acid compositions or on the N-terminal characteristics (signal peptides) of proteins. However, such approaches lead to a loss of contextual information.
J Bioinform Comput Biol
April 2005
We describe an exhaustive and greedy algorithm for improving the accuracy of multiple sequence alignment. A simple progressive alignment approach is employed to provide initial alignments. The initial alignment is then iteratively optimized against an objective function.
J Bioinform Comput Biol
February 2005
Tabu search is a meta-heuristic approach that has proven useful in solving combinatorial optimization problems. We implement the adaptive memory features of tabu search to refine a multiple sequence alignment. Adaptive memory helps the search process avoid local optima and explore the solution space economically and effectively without getting trapped in cycles.
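To make the role of memory concrete, here is a minimal generic tabu search sketch on a toy bit-vector problem. It is not the authors' alignment refinement procedure. The function `tabu_search`, the flip-one-bit neighborhood, the tenure of 3 and the Hamming-distance objective are all assumptions chosen for illustration.

```python
from collections import deque

def tabu_search(objective, start, n_iter=50, tenure=3):
    """Minimize objective over bit lists using single-bit-flip moves."""
    current = list(start)
    best, best_score = current[:], objective(current)
    tabu = deque(maxlen=tenure)  # short-term memory of recently flipped positions
    for _ in range(n_iter):
        candidates = []
        for i in range(len(current)):
            neighbor = current[:]
            neighbor[i] ^= 1
            score = objective(neighbor)
            # Aspiration rule: a tabu move is allowed if it beats the best so far.
            if i not in tabu or score < best_score:
                candidates.append((score, i, neighbor))
        if not candidates:
            break
        # Take the best admissible move even if it worsens the current solution;
        # this is what lets the search leave a local optimum.
        score, i, current = min(candidates, key=lambda c: (c[0], c[1]))
        tabu.append(i)
        if score < best_score:
            best, best_score = current[:], score
    return best, best_score

# Toy objective: Hamming distance to a hypothetical target vector.
target = [1, 0, 1, 1, 0]
hamming = lambda bits: sum(a != b for a, b in zip(bits, target))
best, best_score = tabu_search(hamming, [0, 0, 0, 0, 0])
print(best, best_score)
```

In an alignment setting the moves would instead be gap insertions or shifts and the objective an alignment score, with the tabu list preventing the search from immediately undoing a recent move.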
WebAllergen is a web server that predicts the potential allergenicity of proteins. The query protein will be compared against a set of prebuilt allergenic motifs that have been obtained from 664 known allergen proteins. The query will also be compared with known allergens that do not have detectable allergenic motifs.
The Y-box binding protein 1 (YB-1) regulates gene expression through transcription and translation. YB-1 has been shown to be associated with up-regulation of P-glycoprotein (Pgp), an ATP-binding transporter involved in multi-drug resistance. In this study, we determined the prognostic significance of YB-1 and its relationship with Pgp in patients with breast cancer.
Ki67 is a nuclear protein that is tightly linked to the cell cycle. It is a marker of cell proliferation and has been used to stratify good and poor prognostic categories in invasive breast cancer. Its correlation with gene expression patterns has not been fully elucidated.
Comput Biol Chem
April 2004
Multiple sequence alignment is a basic tool in computational genomics. The art of multiple sequence alignment is about placing gaps. This paper presents a heuristic algorithm that iteratively improves multiple protein sequence alignments.
Motivation: With many transgenic proteins introduced today, the ability to predict their potential allergenicity has become an important issue. Previous studies were based on either sequence similarity or the protein motifs identified from known allergen databases. The similarity-based approaches, although able to achieve high recall, usually have low precision.
We present an algorithm to detect protein sub-structural motifs from primary sequence. The input to the algorithm is a set of aligned multiple protein sequences. It uses wavelet transforms to decompose protein sequences represented numerically by different indices (such as polarity, accessible surface area or electron-ion interaction potentials of the amino acids).
G-PRIMER, a web-based primer design program, has been developed to compute a minimal primer set that anneals specifically to all the open reading frames in a given microbial genome. This program has been successfully used in a microarray experiment for analyzing the expression of genes in the Xanthomonas campestris genome.
Availability: It is available at http://mammoth.