A fundamental operation in computational genomics is to reduce the input sequences to their constituent k-mers. For maximum performance of downstream applications, it is important to store the k-mers in small space while keeping the representation easy and efficient to use (i.e., in plain text and without k-mer repetitions). Heuristics were recently presented that compute a near-minimum representation of this kind. We present an algorithm that computes a minimum representation in optimal (linear) time, and we use it to evaluate the existing heuristics. Our algorithm first constructs the de Bruijn graph in linear time and then uses an Eulerian-cycle-based algorithm to compute the minimum representation in time linear in the size of the output.
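To make the idea of a plain-text representation without k-mer repetitions concrete, here is a minimal Python sketch. It is not the algorithm described above: it treats each distinct k-mer as an edge between its (k-1)-mer prefix and suffix and walks the edges greedily, which yields a valid representation that is not necessarily minimum. It also assumes forward-strand k-mers only (no reverse complements), and the function names `kmers` and `greedy_representation` are hypothetical.

```python
from collections import defaultdict


def kmers(seqs, k):
    """Return the set of distinct k-mers in the input sequences."""
    return {s[i:i + k] for s in seqs for i in range(len(s) - k + 1)}


def greedy_representation(seqs, k):
    """Cover every distinct k-mer exactly once with a set of strings.

    Assumption: forward strand only. Each k-mer is an edge from its
    (k-1)-mer prefix to its (k-1)-mer suffix. This is a greedy walk,
    NOT a minimum (Eulerian-cycle-based) decomposition.
    """
    out_edges = defaultdict(list)
    for km in sorted(kmers(seqs, k)):
        out_edges[km[:-1]].append(km)

    strings = []
    # sorted() copies the keys, so the dict may be read freely in the loop.
    for start in sorted(out_edges):
        while out_edges[start]:
            km = out_edges[start].pop()
            text = km
            node = km[1:]
            # Extend the walk while the current node has an unused edge.
            while out_edges.get(node):
                km = out_edges[node].pop()
                text += km[-1]
                node = km[1:]
            strings.append(text)
    return strings


if __name__ == "__main__":
    rep = greedy_representation(["ACGTACGA", "TTACGG"], 3)
    print(rep, sum(len(s) for s in rep))
```

The cumulative length of the returned strings is the quantity a minimum representation would minimize; the greedy walk only guarantees that no k-mer is repeated or lost.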

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9949180
DOI: http://dx.doi.org/10.21203/rs.3.rs-2581995/v1
