Publications by authors named "Kevin Galens"

Purpose: Exome sequencing (ES) is increasingly used for the diagnosis of rare genetic disease. However, some pathogenic sequence variants within the exome go undetected due to the technical difficulty of identifying them. Mobile element insertions (MEIs) are a known cause of genetic disease in humans but have been historically difficult to detect via ES and similar targeted sequencing methods.

Background: The benefit of increasing genomic sequence data to the scientific community depends on easy-to-use, scalable bioinformatics support. CloVR-Comparative combines commonly used bioinformatics tools into an intuitive, automated, and cloud-enabled analysis pipeline for comparative microbial genomics.

Results: CloVR-Comparative runs on annotated complete or draft genome sequences that are uploaded by the user or selected via a taxonomic tree-based user interface and downloaded from NCBI.

Massively parallel sequencing approaches are beginning to be used clinically to characterize individual patient tumors and to select therapies based on the identified mutations. A major question in these analyses is the extent to which these methods identify clinically actionable alterations and whether the examination of the tumor tissue alone is sufficient or whether matched normal DNA should also be analyzed to accurately identify tumor-specific (somatic) alterations. To address these issues, we comprehensively evaluated 815 tumor-normal paired samples from patients with 15 tumor types.

For centuries, cholera has been one of the most feared diseases. The causative agent Vibrio cholerae is a waterborne Gram-negative enteric pathogen eliciting a severe watery diarrheal disease. In October 2010, the seventh pandemic reached Haiti, a country that had not experienced cholera for more than a century.

Cassava is a major tropical food crop in the Euphorbiaceae family that has high carbohydrate production potential and adaptability to diverse environments. Here we present the draft genome sequences of a wild ancestor and a domesticated variety of cassava and comparative analyses with a partial inbred line. We identify 1,584 and 1,678 gene models specific to the wild and domesticated varieties, respectively, and discover high heterozygosity and millions of single-nucleotide variations.

First identified in 1982, Escherichia coli O157:H7 is the dominant enterohemorrhagic serotype underlying food-borne human infections in North America. Here, we report the genomes of twenty-six strains derived from patients and the bovine reservoir. These resources enable detailed whole-genome comparisons and permit investigations of genotypic and phenotypic plasticity.

Background: Next-generation sequencing technologies have decentralized sequence acquisition, increasing the demand for new bioinformatics tools that are easy to use, portable across multiple platforms, and scalable for high-throughput applications. Cloud computing platforms provide on-demand access to computing infrastructure over the Internet and can be used in combination with custom-built virtual machines to distribute pre-packaged, pre-configured software.

Results: We describe the Cloud Virtual Resource, CloVR, a new desktop application for push-button automated sequence analysis that can utilize cloud computing resources.

The Institute for Genome Sciences (IGS) has developed a prokaryotic annotation pipeline that is used for coding gene/RNA prediction and functional annotation of Bacteria and Archaea. The fully automated pipeline accepts one or many genomic sequences as input and produces output in a variety of standard formats. Functional annotation is primarily based on similarity searches and motif finding, combined with a hierarchical rule-based annotation system.

Motivation: The growth of sequence data has been accompanied by an increasing need to analyze data on distributed computer clusters. The use of these systems for routine analysis requires scalable and robust software for data management of large datasets. Software is also needed to simplify data management and make large-scale bioinformatics analysis accessible and reproducible to a wide class of target users.

Pathema (http://pathema.jcvi.org) is one of the eight Bioinformatics Resource Centers (BRCs) funded by the National Institute of Allergy and Infectious Diseases (NIAID) and is designed to serve as a core resource for the bio-defense and infectious disease research community.

We present the genome sequences of a new clinical isolate of the important human pathogen, Aspergillus fumigatus, A1163, and two closely related but rarely pathogenic species, Neosartorya fischeri NRRL181 and Aspergillus clavatus NRRL1. Comparative genomic analysis of A1163 with the recently sequenced A. fumigatus isolate Af293 has identified core, variable and up to 2% unique genes in each genome.

YZGD from Paenibacillus thiaminolyticus is a novel bifunctional enzyme with both PLPase (pyridoxal phosphatase) and Nudix (nucleoside diphosphate x) hydrolase activities. The PLPase activity is catalysed by the HAD (haloacid dehalogenase) superfamily motif of the enzyme, and the Nudix hydrolase activity is catalysed by the conserved Nudix signature sequence within a separate portion of the enzyme, as confirmed by site-directed mutagenesis. YZGD's phosphatase activity is very specific, with pyridoxal phosphate being the only natural substrate, while YZGD's Nudix activity is just the opposite, with YZGD being the most versatile Nudix hydrolase characterized to date.
