Motivation: The continuing exponential accumulation of full genome data, including full diploid human genomes, creates new challenges not only for understanding genomic structure, function and evolution, but also for the storage, navigation and privacy of genomic data. Here, we develop data structures and algorithms for the efficient storage of genomic and other sequence data that may also facilitate querying and protecting the data.
Results: The general idea is to encode only the differences between a genome sequence and a reference sequence, using absolute or relative coordinates for the location of the differences.
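The idea of storing only the differences from a reference can be illustrated with a minimal sketch. This is not the authors' implementation: the function names below are hypothetical, and the sketch assumes substitutions only (the two sequences have equal length, with no insertions or deletions). It shows the two coordinate schemes mentioned above, absolute positions and relative offsets (the gap since the previous difference).

```python
def encode_absolute(reference, sequence):
    """Return (position, base) pairs where sequence differs from reference.

    Assumption: substitutions only, so both strings have equal length.
    Positions are 0-based absolute coordinates into the reference.
    """
    if len(reference) != len(sequence):
        raise ValueError("this sketch handles equal-length sequences only")
    return [(i, b) for i, (a, b) in enumerate(zip(reference, sequence)) if a != b]


def to_relative(diffs):
    """Convert absolute positions into gaps from the previous difference."""
    relative = []
    previous = 0
    for position, base in diffs:
        relative.append((position - previous, base))
        previous = position
    return relative


def decode_relative(reference, relative):
    """Rebuild the full sequence from the reference and relative diffs."""
    rebuilt = list(reference)
    position = 0
    for gap, base in relative:
        position += gap          # accumulate gaps back into an absolute index
        rebuilt[position] = base
    return "".join(rebuilt)


reference = "ACGTACGTAC"
sequence = "ACGAACGTTC"

absolute = encode_absolute(reference, sequence)   # [(3, 'A'), (8, 'T')]
relative = to_relative(absolute)                  # [(3, 'A'), (5, 'T')]
restored = decode_relative(reference, relative)
```

Relative offsets tend to be small numbers when differences are frequent, which can make them cheaper to store than absolute positions; which scheme is better in practice depends on the data and on the integer encoding used.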
We have developed a computer system, MITOMASTER, to make analysis of human mitochondrial DNA (mtDNA) sequences efficient, accurate, and easily available. From imported sequences, the system identifies nucleotide variants, determines the haplogroup, rules out possible pseudogene contamination, identifies novel DNA sequence variants, and evaluates the potential biological significance of each variant. This system should benefit biomedical physicians and investigators, population biologists, and forensic scientists who perform mtDNA analyses.
The MITOMAP (http://www.mitomap.org) data system for the human mitochondrial genome has been greatly enhanced by the addition of a navigable mutational mitochondrial DNA (mtDNA) phylogenetic tree of approximately 3000 mtDNA coding region sequences, plus expanded pathogenic mutation tables and a nuclear-mtDNA pseudogene (NUMT) database.
MITOMAP (http://www.MITOMAP.org), a database for the human mitochondrial genome, has grown rapidly in data content over the past several years as interest in the role of mitochondrial DNA (mtDNA) variation in human origins, forensics, degenerative diseases, cancer and aging has increased dramatically.