On the predictability of protein database search complexity and its relevance to optimization of distributed searches.

J Proteome Res

Diversa Corporation, San Diego, California 92121, USA.

Published: September 2007

We discuss several aspects of load balancing database search jobs in a distributed computing environment, such as a Linux cluster. Load balancing is a technique for making the most of multiple computational resources, and it is particularly relevant in environments where those resources are heavily used. The particular case of the Sequest program is considered here, but the general methodology should apply to any similar database search program. We show how the runtimes of Sequest searches of tandem mass spectral data can be predicted from profiles of previous representative searches, and how this information can be used to better balance the load for novel data. A well-known heuristic load balancing method is shown to be applicable to this problem, and its performance is analyzed for a variety of search parameters.
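
The idea of balancing load from predicted runtimes can be illustrated with a short sketch. The abstract does not name the heuristic it uses, so the example below assumes the Longest Processing Time first (LPT) greedy rule purely for illustration; the function name `lpt_schedule`, the job labels, and the runtime values are hypothetical and not taken from the paper. Predicted runtimes are treated as given inputs, standing in for estimates derived from profiles of earlier searches.

```python
import heapq


def lpt_schedule(predicted_runtimes, n_nodes):
    """Assign jobs to nodes with the LPT greedy rule (an assumed heuristic).

    predicted_runtimes: dict mapping a job label to its predicted runtime.
    n_nodes: number of identical compute nodes.
    Returns (assignment, loads), both dicts keyed by node index.
    """
    # Min-heap of (current load, node index); the least-loaded node is on top.
    heap = [(0.0, node) for node in range(n_nodes)]
    heapq.heapify(heap)
    assignment = {node: [] for node in range(n_nodes)}

    # Place the longest predicted jobs first, each on the least-loaded node.
    jobs_longest_first = sorted(
        predicted_runtimes.items(), key=lambda kv: kv[1], reverse=True
    )
    for job, runtime in jobs_longest_first:
        load, node = heapq.heappop(heap)
        assignment[node].append(job)
        heapq.heappush(heap, (load + runtime, node))

    # Total predicted load per node; the maximum approximates the makespan.
    loads = {
        node: sum(predicted_runtimes[job] for job in jobs)
        for node, jobs in assignment.items()
    }
    return assignment, loads


if __name__ == "__main__":
    # Hypothetical predicted runtimes (arbitrary units) for five search jobs.
    runtimes = {"a": 7, "b": 5, "c": 4, "d": 3, "e": 1}
    assignment, loads = lpt_schedule(runtimes, n_nodes=2)
    print(assignment)
    print(loads)
```

With accurate predictions, a rule of this kind keeps the slowest node close to the average load; with poor predictions, long jobs may be placed late and leave other nodes idle, which is why runtime predictability matters for the balancing step.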

DOI: http://dx.doi.org/10.1021/pr070066u
