Sequence assembly using next generation sequencing data--challenges and solutions.

Francis Y L Chin, Henry C M Leung, S M Yiu

Sci China Life Sci

Department of Computer Science, The University of Hong Kong, Hong Kong, China.

Published: November 2014

Sequence assembly is an important step in bioinformatics studies. With next generation sequencing (NGS) technology, DNA fragments (reads) can be sampled randomly and at high throughput from a DNA or RNA molecule. However, because the positions from which the reads are sampled are unknown, an assembly process is required to combine overlapping reads and reconstruct the original DNA or RNA sequence. Compared with traditional Sanger sequencing, NGS offers higher throughput, but its reads are shorter and have a higher error rate, which introduces several problems in assembly. Moreover, NGS can sample paired-end reads, which contain more information than single-end reads, yet existing assemblers cannot fully utilize this information and fail to assemble longer contigs. In this article, we revisit the major problems of assembling NGS reads from genomic, transcriptomic, metagenomic and metatranscriptomic data. We also describe our IDBA package for solving these problems. The IDBA package adopts several novel ideas in assembly, including the use of multiple values of k, local assembly and progressive depth removal. Compared with existing assemblers, IDBA performs better on many simulated and real sequencing datasets.
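To make the basic assembly problem concrete, the following is a minimal Python sketch of combining overlapping reads into a contig by greedy suffix-prefix merging. It is a toy illustration only and is not the method used by IDBA. It assumes error-free reads from a single strand, and the names `overlap`, `greedy_assemble` and `min_len` are hypothetical, introduced here for illustration.

```python
from typing import List


def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Return the length of the longest suffix of `a` that equals a prefix
    of `b`, or 0 if no such overlap of at least `min_len` bases exists."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: List[str], min_len: int = 3) -> List[str]:
    """Repeatedly merge the pair of sequences with the longest overlap.

    Toy assumptions (not from the article): reads are error-free,
    all come from the same strand, and repeats are not handled.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_len, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            break  # no remaining overlaps: leave the contigs separate
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)] + [merged]
    return contigs


if __name__ == "__main__":
    # Three overlapping reads sampled from the sequence ATGGCGTGCAATCC
    print(greedy_assemble(["ATGGCGTG", "GCGTGCAA", "GCAATCC"]))
    # ['ATGGCGTGCAATCC']
```

Even in this simplified setting, the difficulties named in the abstract are visible: shorter reads give shorter and more ambiguous overlaps, and a single sequencing error can break an exact suffix-prefix match.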

DOI: http://dx.doi.org/10.1007/s11427-014-4752-9


