Improvements in genome sequencing techniques have resulted in the generation of huge volumes of data. As a consequence, the genome assembly stage demands ever more computational power, since the incoming sequence files contain large amounts of data. To speed up the process, it is often necessary to distribute the workload among a group of machines. However, this requires hardware and software solutions specially configured for the purpose. Grid computing tries to simplify this process of aggregating resources, but it does not always offer the best possible performance because of the heterogeneity and decentralized management of its resources. Thus, it is necessary to develop software that takes these peculiarities into account. To this end, we developed an algorithm that adapts the de novo assembly software ABySS to optimize its operation in grids. We ran ABySS with and without our algorithm in the grid simulator SimGrid. Tests showed that our algorithm is viable, flexible, and scalable even in a heterogeneous environment, and that it improved genome assembly time in computational grids without changing assembly quality.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3306921
DOI: http://dx.doi.org/10.3389/fgene.2012.00038

Publication Analysis

Top Keywords

genome assembly: 12
computational grids: 8
scheduling algorithm: 4
algorithm computational: 4
grids minimizes: 4
minimizes centralized: 4
centralized processing: 4
genome: 4
processing genome: 4
assembly: 4

Similar Publications

Anchorage Accurately Assembles Anchor-Flanked Synthetic Long Reads.

Leibniz Int Proc Inform

August 2024

Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA, USA; Department of Computer Science and Engineering, The Pennsylvania State University, University Park, PA, USA.

Modern sequencing technologies allow for the addition of short-sequence tags, known as anchors, to both ends of a captured molecule. Anchors are useful in assembling the full-length sequence of a captured molecule as they can be used to accurately determine the endpoints. One representative of such anchor-enabled technology is LoopSeq Solo, a synthetic long read (SLR) sequencing protocol.

Assembly and comparative analysis of the complete mitogenome of var. , an exceptional berry plant possessing sweet leaves.

Front Plant Sci

December 2024

Zhejiang Provincial Key Laboratory of Plant Evolutionary Ecology and Conservation, College of Life Sciences, Taizhou University, Taizhou, China.

var. is a special berry plant of in the Rosaceae family. Its leaves contain high-sweetness, low-calorie, and non-toxic sweet ingredients, known as rubusoside.

Eastern equine encephalitis virus (EEEV) is an arthropod-borne, positive-sense RNA alphavirus posing a substantial threat to public health. Unlike similar viruses such as SARS-CoV-2, EEEV replicates efficiently in neurons, producing progeny viral particles as soon as 3-4 hours post-infection. EEEV infection, which can cause severe encephalitis with a human mortality rate surpassing 30%, has no licensed, targeted therapies, leaving patients to rely on supportive care.

Structural variants (SVs) drive gene expression in the human brain and are causative of many neurological conditions. However, most existing genetic studies have been based on short-read sequencing methods, which capture fewer than half of the SVs present in any one individual. Long-read sequencing (LRS) enhances our ability to detect disease-associated and functionally relevant SVs; however, its application in large-scale genomic studies has been limited by challenges in sample preparation and high costs.

Somatic mutations in individual cells lead to genomic mosaicism, contributing to the intricate regulatory landscape of genetic disorders and cancers. To evaluate and refine the detection of somatic mosaicism across different technologies with personalized donor-specific assembly (DSA), we obtained tissue from the dorsolateral prefrontal cortex (DLPFC) of a post-mortem neurotypical 31-year-old individual. We sequenced bulk DLPFC tissue using Oxford Nanopore Technologies (∼60X), NovaSeq (∼30X), and linked-read sequencing (∼28X).
