Publications by authors named "Christopher S Oehmen"

BLAST remains one of the most widely used tools in computational biology. The rate at which new sequence data is available continues to grow exponentially, driving the emergence of new fields of biological research. At the same time, multicore systems and conventional clusters are more accessible.

The dengue viruses (DENVs) exist as numerous genetic strains that are grouped into four antigenically distinct serotypes. DENV strains from each serotype can cause severe disease and threaten public health in tropical and subtropical regions worldwide. No licensed antiviral agent to treat DENV infections is currently available, and there is an acute need for the development of novel therapeutics.

Systems biology attempts to reconcile large amounts of disparate data with existing knowledge to provide models of functioning biological systems. The cyanobacterium Cyanothece sp. ATCC 51142 is an excellent candidate for such systems biology studies because: (i) it displays tight functional regulation between photosynthesis and nitrogen fixation; (ii) it has robust cyclic patterns at the genetic, protein and metabolomic levels; and (iii) it has potential applications for bioenergy production and carbon sequestration.

Motivation: The standard approach to identifying peptides based on accurate mass and elution time (AMT) compares profiles obtained from a high resolution mass spectrometer to a database of peptides previously identified from tandem mass spectrometry (MS/MS) studies. It would be advantageous, with respect to both accuracy and cost, to only search for those peptides that are detectable by MS (proteotypic).

Results: We present a support vector machine (SVM) model that uses a simple descriptor space based on 35 properties of amino acid content, charge, hydrophilicity and polarity for the quantitative prediction of proteotypic peptides.
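
As a rough illustration of what a peptide descriptor space can look like, the sketch below computes four simple features from a peptide sequence. It is not the 35-property descriptor set from the article, which is not listed here; the function name `peptide_descriptor` and the choice of features are assumptions made for illustration, and the hydropathy values are the Kyte-Doolittle scale.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def peptide_descriptor(peptide):
    """Return a small, hypothetical feature dictionary for one peptide.

    Features: length, fraction of positively charged residues (K, R),
    fraction of negatively charged residues (D, E), and mean hydropathy.
    """
    peptide = peptide.upper()
    if not peptide:
        raise ValueError("peptide must not be empty")
    unknown = set(peptide) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")

    n = len(peptide)
    return {
        "length": n,
        "frac_positive": sum(peptide.count(r) for r in "KR") / n,
        "frac_negative": sum(peptide.count(r) for r in "DE") / n,
        "mean_hydropathy": sum(KYTE_DOOLITTLE[r] for r in peptide) / n,
    }
```

A vector built from features like these could then be passed to any SVM implementation; the choice of kernel and training data is outside the scope of this sketch.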

Background: The challenge of remote homology detection is that many evolutionarily related sequences have very little similarity at the amino acid level. Kernel-based discriminative methods, such as support vector machines (SVMs), that use vector representations of sequences derived from sequence properties have been shown to have superior accuracy when compared to traditional approaches for the task of remote homology detection.

Results: We introduce a new method for feature vector representation based on the physicochemical properties of the primary protein sequence.

Due to the exponential growth of sequenced genomes, the need to quickly provide accurate annotation for existing and new sequences is paramount to facilitate biological research. Current sequence comparison approaches fail to detect homologous relationships when sequence similarity is low. Support vector machine (SVM) algorithms approach this problem by transforming all proteins into a feature space of equal dimension based on protein properties, such as sequence similarity scores against a basis set of proteins or motifs.

Motivation: As the amount of biological sequence data continues to grow exponentially we face the increasing challenge of assigning function to this enormous molecular 'parts list'. The most popular approaches to this challenge make use of the simplifying assumption that similar functional molecules, or proteins, sometimes have similar composition, or sequence. However, these algorithms often fail to identify remote homologs (proteins with similar function but dissimilar sequence) which often are a significant fraction of the total homolog collection for a given sequence.

The visual Platform for Proteomics Peptide and Protein data exploration (PQuad) is a multi-resolution environment that visually integrates genomic and proteomic data for prokaryotic systems, overlays categorical annotation and compares differential expression experiments. PQuad requires Java 1.5 and has been tested to run across different operating systems.

A significant challenge in homology detection is to identify sequences that share a common evolutionary ancestor, despite significant primary sequence divergence. Remote homologs will often have less than 30% sequence identity, yet still retain common structural and functional properties. We demonstrate a novel method for identifying remote homologs using a support vector machine (SVM) classifier trained by fusing sequence similarity scores and subcellular location prediction.
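
To make the 30% identity figure concrete, the sketch below computes percent identity for two sequences that have already been aligned. The function name `percent_identity` is hypothetical, and the denominator used here (every alignment column except those where both sequences have a gap) is only one of several conventions in use, so treat it as an assumption rather than the definition used in the article.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where both sequences contain a gap are ignored; a column where
    only one sequence has a gap counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # skip columns that are gaps in both sequences
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        return 0.0
    return 100.0 * matches / compared
```

Under this convention, `percent_identity("AC-E", "ACDE")` compares four columns and finds three matches.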

We evaluate statistical models used in two-hypothesis tests for identifying peptides from tandem mass spectrometry data. The null hypothesis H(0), that a peptide matches a spectrum by chance, requires information on the probability of by-chance matches between peptide fragments and peaks in the spectrum. Likewise, the alternate hypothesis H(A), that the spectrum is due to a particular peptide, requires probabilities that the peptide fragments would indeed be observed if it were the causative agent.
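
One simple way to picture such a two-hypothesis test is a log-likelihood ratio over fragment matches. The sketch below assumes each predicted fragment matches a peak independently, with probability `p_chance` under H(0) and `p_true` under H(A). The independence assumption, the function name, and both parameters are illustrative only and are not taken from the article, which evaluates models of this kind rather than prescribing this one.

```python
import math


def log_likelihood_ratio(n_fragments, n_matched, p_chance, p_true):
    """Return log[P(data | H_A) / P(data | H_0)] for a peptide-spectrum match.

    Assumes independent fragment matches with a fixed per-fragment match
    probability under each hypothesis. The binomial coefficient is the same
    under both hypotheses and cancels in the ratio.
    """
    if not 0 <= n_matched <= n_fragments:
        raise ValueError("n_matched must be between 0 and n_fragments")
    for p in (p_chance, p_true):
        if not 0.0 < p < 1.0:
            raise ValueError("probabilities must lie strictly between 0 and 1")

    n_missed = n_fragments - n_matched
    return (n_matched * math.log(p_true / p_chance)
            + n_missed * math.log((1.0 - p_true) / (1.0 - p_chance)))
```

A positive value favors H(A) and a negative value favors H(0); when `p_true` equals `p_chance`, the ratio carries no information and the result is zero.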

Introduction: A rapidly activating delayed rectifier potassium current (I(Kr)) is known to have an important role in determining the properties of spontaneous pacing in enzymatically isolated rabbit sinoatrial node (SAN) cells. The functional characteristics of I(Kr) are conferred by its dependence on time, voltage, and external potassium. The aim of this study was to develop a rigorous mathematical representation for I(Kr) based on experimental findings and to investigate the role of I(Kr) in the automaticity and intercellular communication of SAN cells.