To address the monumental challenge of assigning function to millions of sequenced proteins, we completed a first-of-its-kind all-versus-all sequence alignment using BLAST for 9.9 million proteins in the UniRef100 database. Running on Microsoft Windows Azure, the computation produced over 3 billion filtered records in 6 days using 475 eight-core virtual machines.
This article is a summary of the bioinformatics issues and challenges of data-intensive science as discussed in the NSF-funded Data-Intensive Science (DIS) workshop in Seattle, September 19-20, 2010.