K-mer-based analysis plays an important role in many bioinformatics applications, such as de novo assembly, sequencing error correction, and genotyping. To take full advantage of such methods, the k-mer content of a read set must be captured as accurately as possible. Long k-mers are often preferred because they can be uniquely associated with a specific genomic region. Unfortunately, standard exact k-mer counting methods cannot reliably extract long k-mers from high-error-rate reads. We propose SAKE, a method to extract long k-mers from high-error-rate reads by utilizing strobemers and consensus k-mer generation through partial order alignment. Our experiments show that on simulated data with up to 6% error rate, SAKE can extract 97-mers with over 90% recall. In contrast, the recall of DSK, an exact k-mer counter, drops to less than 20%. Furthermore, the precision of SAKE remains similar to that of DSK. On real bacterial data, SAKE retrieves 97-mers with a recall of over 90% and slightly lower precision than DSK, while the recall of DSK already drops to 50%. We show that SAKE can extract more k-mers from uncorrected high-error-rate reads than exact k-mer counting. However, exact k-mer counters run on corrected reads can extract slightly more k-mers than SAKE run on uncorrected reads.
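To make the difficulty with long k-mers concrete, the following is a minimal Python sketch of exact k-mer counting together with the recall and precision measures used in the abstract. It is not SAKE or DSK; the function names (`count_kmers`, `error_free_probability`, `recall_precision`) are hypothetical, and the error model (independent per-base errors at a fixed rate) is a simplifying assumption that ignores how real sequencing errors are distributed.

```python
from collections import Counter


def count_kmers(reads, k):
    """Exact k-mer counting: tally every length-k substring of every read."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def error_free_probability(k, error_rate):
    """Chance that all k bases of a k-mer are correct.

    Assumes independent per-base errors (a simplification, not the
    error model of any particular sequencing platform).
    """
    return (1.0 - error_rate) ** k


def recall_precision(found, truth):
    """Recall and precision of a set of extracted k-mers against a truth set."""
    true_positives = len(found & truth)
    recall = true_positives / len(truth) if truth else 0.0
    precision = true_positives / len(found) if found else 0.0
    return recall, precision


if __name__ == "__main__":
    # Toy example: the 4-mers of one short read.
    print(count_kmers(["ACGTACGT"], 4))

    # Under the independence assumption, a 97-mer read at a 6% error rate
    # is error-free only about 0.25% of the time.
    print(f"{error_free_probability(97, 0.06):.4f}")

    # Hypothetical extracted set versus a hypothetical truth set.
    print(recall_precision({"ACGT", "CGTA", "TTTT"},
                           {"ACGT", "CGTA", "GTAC", "TACG"}))
```

Under this simplified model, an exact counter sees very few intact copies of any given 97-mer unless coverage is very high, which is consistent with the recall drop the abstract reports for exact counting. The sketch does not reproduce the reported figures.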
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10686461 | PMC |
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0294415 | PLOS |
Qual Manag Health Care
January 2025
Author Affiliations: Source Healthcare, Santa Monica, California.
Background And Objectives: Retrospective studies examining errors within a surgical scheduling setting do not fully represent the effects of human error involved in transcribing critical patient health information (PHI). These errors can negatively impact patient care and reduce workplace efficiency due to insurance claim denials and potential sentinel events. Previous reports underscore the burden physicians face with prior authorizations, which may lead to serious adverse events or the abandonment of treatment due to the resulting delays.
PLoS One
January 2025
Engineering Research Center of Hydrogen Energy Equipment & Safety Detection, Universities of Shaanxi Province, Xijing University, Xi'an, China.
The traditional method of corn quality detection relies heavily on the subjective judgment of inspectors and suffers from a high error rate. To address these issues, this study employs the Swin Transformer as an enhanced base model, integrating machine vision and deep learning techniques for corn quality assessment. Initially, images of high-quality, moldy, and broken corn were collected.
Stat Med
February 2025
Villanova University, Villanova, Pennsylvania, USA.
We study the problem of testing multiple secondary endpoints conditional on a primary endpoint being significant in a two-stage group sequential procedure, focusing on two secondary endpoints. This extends our previous work with one secondary endpoint. The test for the secondary null hypotheses is a closed procedure.
Nurs Rep
January 2025
Department of Medical Informatics and Management, University Hospital, University of Occupational and Environmental Health, Kitakyusyu 807-8555, Japan.
Background: Medication errors cause adverse events; however, studies have yet to examine medication errors related to nursing hours while considering ward characteristics in Japan. Purpose: This study investigated medication errors caused by nurses to quantitatively assess ward activity as busyness in nursing duties. Methods: This study considered patients hospitalized in the general wards of 10 National Hospital Organization institutions between April 2019 and March 2020.
Metabolites
January 2025
Department of Biostatistics & Informatics, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA.
Background: Due to scientific advancements in high-throughput data production technologies, omics studies, such as genomics and metabolomics, often give rise to numerous measurements per sample/subject containing several noisy variables that potentially cloud the true signals relevant to the desired study outcome(s). Therefore, correcting for multiple testing is critical while performing any statistical test of significance to minimize the chances of false or missed discoveries. Such correction practice is commonplace in genome-wide association studies (GWAS) but is also becoming increasingly relevant to metabolome-wide association studies (MWAS).