Nucleotide Similarity Scanner (NSimScan) is specialized for searching massive DNA databases for distant similarities. Its targeted applications include phylogenomics, comparative and functional studies of non-coding sequences, and contamination detection. NSimScan outperforms industry-standard tools in combined sensitivity, accuracy and speed, operating at a sensitivity similar to BLAST, the accuracy of ssearch and the speed of MegaBLAST.
In the era of metagenomics and diagnostic sequencing, the importance of faster protein comparison methods cannot be overstated. Here we present PSimScan (Protein Similarity Scanner), a flexible open source protein similarity search tool which provides a significant gain in speed compared to BLASTP at the price of a controlled loss of sensitivity. The PSimScan algorithm introduces a number of novel performance optimization methods that the community can reuse to improve the speed and lower the hardware requirements of bioinformatics software.
Proc Natl Acad Sci U S A
August 2006
Comparative analysis of 15 complete cyanobacterial genome sequences, including "near minimal" genomes of five strains of Prochlorococcus spp., revealed 1,054 protein families [core cyanobacterial clusters of orthologous groups of proteins (core CyOGs)] encoded in at least 14 of them. The majority of the core CyOGs are involved in central cellular functions that are shared with other bacteria; 50 core CyOGs are specific for cyanobacteria, whereas 84 are exclusively shared by cyanobacteria and plants and/or other plastid-carrying eukaryotes, such as diatoms or apicomplexans.
Genome features of the Bacillus cereus group genomes (representative strains of Bacillus cereus, Bacillus anthracis and Bacillus thuringiensis subsp. israelensis) were analyzed and compared with the Bacillus subtilis genome. A core set of 1381 protein families among the four Bacillus genomes, with an additional set of 933 families common to the B.
The ERGO (http://ergo.integratedgenomics.com/ERGO/) genome analysis and discovery suite is an integration of biological data from genomics, biochemistry, high-throughput expression profiling, genetics and peer-reviewed journals to achieve a comprehensive analysis of genes and genomes.