Background: For single-cell or metagenomic sequencing projects, it is necessary to sequence with a very high mean coverage in order to make sure that all parts of the sample DNA get covered by the reads produced. This leads to huge datasets with lots of redundant data, so filtering the data prior to assembly is advisable. Brown et al. (2012) presented the algorithm Diginorm for this purpose, which filters reads based on the abundance of their k-mers.
Methods: We present Bignorm, a faster and quality-conscious read filtering algorithm. An important new algorithmic feature is the use of phred quality scores together with a detailed analysis of the k-mer counts to decide which reads to keep.
Results: We qualify and recommend parameters for our new read filtering algorithm. Guided by these parameters, we remove a median of 97.15% of the reads while keeping the mean phred score of the filtered dataset high. Using the SPAdes assembler, we produce assemblies of high quality from these filtered datasets in a fraction of the time needed for an assembly from the datasets filtered with Diginorm.
Conclusions: We conclude that read filtering is a practical and efficient method for reducing read data and for speeding up the assembly process. This applies not only to single-cell assembly, as shown in this paper, but also to other projects with high mean coverage datasets like metagenomic sequencing projects. Our Bignorm algorithm allows assemblies of competitive quality in comparison to Diginorm, while being much faster. Bignorm is available for download at https://git.informatik.uni-kiel.de/axw/Bignorm .
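The k-mer abundance filtering described in the abstract can be illustrated with a minimal Python sketch. It assumes a Diginorm-style rule (keep a read only while the median count of its k-mers is still below a cutoff) and adds a hypothetical phred gate, `q_min`, that ignores k-mers containing a low-quality base. The function names, parameters, and the gate itself are assumptions made for illustration; this is not the decision rule Bignorm actually implements.

```python
from collections import Counter
from statistics import median


def kmers(seq, k):
    """Return (start, kmer) pairs for all overlapping k-mers of seq."""
    return [(i, seq[i:i + k]) for i in range(len(seq) - k + 1)]


def filter_reads(reads, k=4, cutoff=2, q_min=0):
    """Keep a read only if the median count of its k-mers is below `cutoff`.

    reads:  iterable of (sequence, phred_scores) pairs.
    q_min:  hypothetical quality gate (an assumption of this sketch);
            k-mers containing any base with phred < q_min are ignored.
    """
    counts = Counter()  # k-mer counts over the reads kept so far
    kept = []
    for seq, quals in reads:
        # Only k-mers whose bases all pass the quality gate are considered.
        good = [km for i, km in kmers(seq, k) if min(quals[i:i + k]) >= q_min]
        if not good:
            continue  # no trustworthy k-mers: drop the read
        if median([counts[km] for km in good]) < cutoff:
            kept.append((seq, quals))
            counts.update(good)  # only kept reads contribute to the counts
    return kept
```

Feeding the function several identical reads shows the intended effect: the first copies are kept until their k-mers reach the cutoff, after which further copies are discarded as redundant, while a read with previously unseen k-mers still passes.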
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5496428 | PMC |
| http://dx.doi.org/10.1186/s12859-017-1724-7 | DOI Listing |
Front Oncol
December 2024
Department of Urology, Second Affiliated Hospital of Nanchang University, Nanchang, China.
Background And Purpose: Distant metastasis in bladder cancer is linked to poor prognosis and significant mortality. Machine learning (ML), a key area of artificial intelligence, has shown promise in the diagnosis, staging, and treatment of bladder cancer. This study aimed to employ various ML techniques to predict distant metastasis in patients with bladder cancer.
Front Neurol
December 2024
School of Medical Information Engineering, Guangzhou University of Chinese Medicine, Guangzhou, Guangdong, China.
Purpose: This study aims to develop an assessment system for evaluating shoulder joint muscle strength in patients with varying degrees of upper limb injuries post-stroke, using surface electromyographic (sEMG) signals and joint motion data.
Methods: The assessment system includes modules for acquiring muscle electromyography (EMG) signals and joint motion data. The EMG signals from the anterior, middle, and posterior deltoid muscles were collected, filtered, and denoised to extract time-domain features.
BMC Genomics
December 2024
School of Information Engineering, Jingdezhen Ceramic University, Jingdezhen, 333403, China.
Background: The subcellular localization of mRNA plays a crucial role in gene expression regulation and various cellular processes. However, existing wet lab techniques like RNA-FISH are usually time-consuming, labor-intensive, and limited to specific tissue types. Researchers have developed several computational methods to predict mRNA subcellular localization to address this.
Sci Rep
December 2024
Department of Electronics, Information and Communication Engineering, Kangwon National University, Samcheok, 25913, Republic of Korea.
Autism spectrum disorder (ASD) is a neurologic disorder considered to cause discrepancies in physical activities, social skills, and cognition. There is no specific medicine for treating this disorder; early intervention is critical to improving brain function. Additionally, the lack of a clinical test for detecting ASD makes diagnosis challenging.
Sci Rep
December 2024
College of Sciences, National University of Defense Technology, 410073, Changsha, China.
Deep Convolutional Neural Networks (DCNNs), due to their high computational and memory requirements, face significant challenges in deployment on resource-constrained devices. Network Pruning, an essential model compression technique, contributes to enabling the efficient deployment of DCNNs on such devices. Compared to traditional rule-based pruning methods, Reinforcement Learning (RL)-based automatic pruning often yields more effective pruning strategies through its ability to learn and adapt.