Nanopore sequencing is regarded as one of the most promising third-generation sequencing (TGS) technologies. Since 2014, Oxford Nanopore Technologies (ONT) has developed a series of devices based on nanopore sequencing that produce very long reads, with an expected impact on genomics. However, nanopore sequencing reads are susceptible to a fairly high error rate owing to the difficulty of identifying DNA bases from the complex electrical signals. Although several basecalling tools have been developed for nanopore sequencing in recent years, correcting the sequences after basecalling remains challenging. In this study, we developed an open-source DNA basecalling reviser, NanoReviser, based on a deep learning algorithm, to correct the basecalling errors introduced by the default basecallers. In our module, we re-segmented the raw electrical signals based on the basecalled sequences provided by the default basecallers. By employing convolutional neural networks (CNNs) and bidirectional long short-term memory (Bi-LSTM) networks, we took advantage of the information from both the raw electrical signals and the basecalled sequences. Our results showed that NanoReviser, as a post-basecalling reviser, significantly improved the basecalling quality. After being trained on standard ONT sequencing reads from public E. coli and human NA12878 datasets, NanoReviser reduced the sequencing error rate by over 5% for both the E. coli dataset and the human dataset. The performance of NanoReviser was found to be better than that of all current basecalling tools. Furthermore, we analyzed the modified bases of the E. coli dataset and added the methylation information to train our module. With the methylation annotation, NanoReviser reduced the error rate by 7% for the E. coli dataset and, specifically, by over 10% for regions of the sequence rich in methylated bases.
To the best of our knowledge, NanoReviser is the first post-processing tool after basecalling to accurately correct the nanopore sequences without the time-consuming procedure of building the consensus sequence. The NanoReviser package is freely available at https://github.com/pkubioinformatics/NanoReviser.
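To make the notion of a per-read sequencing error rate concrete, the sketch below scores a basecalled read against a reference using edit distance (substitutions, insertions, and deletions) divided by the reference length. This is an illustrative assumption, not the evaluation pipeline used for NanoReviser, which the abstract does not describe; the function names and the short example sequences are hypothetical.

```python
def edit_distance(read: str, reference: str) -> int:
    """Levenshtein distance between a read and a reference sequence."""
    # previous[j] holds the distance between the read prefix processed so far
    # and the first j bases of the reference.
    previous = list(range(len(reference) + 1))
    for i, read_base in enumerate(read, start=1):
        current = [i]
        for j, ref_base in enumerate(reference, start=1):
            substitution_cost = 0 if read_base == ref_base else 1
            current.append(min(
                previous[j] + 1,                      # extra base in the read (insertion)
                current[j - 1] + 1,                   # base missing from the read (deletion)
                previous[j - 1] + substitution_cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def error_rate(read: str, reference: str) -> float:
    """Edit distance normalized by reference length (an assumed definition)."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(read, reference) / len(reference)


# Hypothetical toy sequences, for illustration only.
reference = "ACGTACGTAC"
basecalled = "ACGTTCGAC"  # one substitution and one deletion
revised = "ACGTACGAC"     # only the deletion remains

print(f"before revision: {error_rate(basecalled, reference):.2f}")
print(f"after revision:  {error_rate(revised, reference):.2f}")
```

In practice, long reads are usually aligned to the reference with a dedicated aligner and the error rate is derived from the alignment; the quadratic-time dynamic program here is only practical for short sequences.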
Source | Link |
---|---|
PMC | http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7434944 |
DOI | http://dx.doi.org/10.3389/fgene.2020.00900 |