Objectives: Genomic signatures such as k-mers have become one of the most prominent ways to describe genomic data. Many real-world applications, such as the construction of de Bruijn graphs in genome assembly, have benefited from recognizing genomic signatures. Efficient genomic signature profiling is therefore essential for handling high-throughput sequencing reads. However, most existing approaches recognize only fixed-size k-mers, although many studies have shown the importance of considering variable-length k-mers.
Methods: In this paper, we present a novel genomic signature profiling approach, TahcoRoll, which extends the Aho-Corasick algorithm (AC) to profile variable-length k-mers. We first group nucleotides into two clusters and represent each cluster with a bit. The rolling hash technique is then used to encode signatures and read patterns for efficient matching.
Results: In extensive experiments, TahcoRoll significantly outperforms state-of-the-art k-mer counters and can process reads from different sequencing platforms on a budget desktop computer.
Conclusions: The single-thread version of TahcoRoll is as efficient as the eight-thread version of the state-of-the-art counter JellyFish, while the eight-thread TahcoRoll outperforms the eight-thread JellyFish by at least a factor of four.
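The two encoding steps named in the Methods (a one-bit-per-nucleotide grouping and a rolling hash over reads) can be illustrated with a minimal sketch. This is not the TahcoRoll implementation: the grouping `A, G -> 0` and `C, T -> 1` is an assumption, since the abstract does not say which nucleotides share a cluster; a per-length dictionary lookup stands in for the Aho-Corasick automaton; and the names `BIT`, `binarize`, `rolling_hashes`, and `profile` are hypothetical. Inputs are assumed to contain only A, C, G, and T.

```python
from collections import defaultdict

# Assumed two-cluster grouping (hypothetical; the abstract does not specify it).
BIT = {"A": 0, "G": 0, "C": 1, "T": 1}


def binarize(seq):
    """Map each nucleotide to the bit of its cluster."""
    return [BIT[base] for base in seq]


def rolling_hashes(bits, k):
    """Yield (start, hash) for every length-k window of bits.

    The hash is simply the k-bit integer spelled by the window, updated in
    O(1) per step by shifting in the new bit and masking off the oldest one.
    """
    mask = (1 << k) - 1
    h = 0
    for i, b in enumerate(bits):
        h = ((h << 1) | b) & mask
        if i >= k - 1:
            yield i - k + 1, h


def profile(signatures, reads):
    """Count occurrences of variable-length signatures in reads.

    A dictionary keyed by (length, hash) replaces the Aho-Corasick automaton
    here; this is a simplification, not the method described in the paper.
    """
    unique = [s for s in dict.fromkeys(signatures) if s]
    index = defaultdict(lambda: defaultdict(list))
    for s in unique:
        h = 0
        for b in binarize(s):
            h = (h << 1) | b
        index[len(s)][h].append(s)

    counts = {s: 0 for s in unique}
    for read in reads:
        bits = binarize(read)
        for k, table in index.items():
            for start, h in rolling_hashes(bits, k):
                # Two letters share each bit, so a hash hit is only a
                # candidate; confirm it against the actual nucleotides.
                for s in table.get(h, ()):
                    if read[start:start + k] == s:
                        counts[s] += 1
    return counts


if __name__ == "__main__":
    print(profile(["ACG", "CGTA", "GG"], ["ACGTACG", "GGG"]))
```

Because each bit stands for two nucleotides, distinct signatures of the same length can share a hash (for example `ACG` and `GTA` under the assumed grouping), which is why the sketch verifies every candidate against the original read before counting it.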
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9027990 | PMC |
http://dx.doi.org/10.1515/mr-2021-0016 | DOI Listing |
Front Pharmacol
January 2025
Human Molecular Genetics Research Unit, Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University, Poznan, Poland.
Cardiovascular diseases (CVDs) include atherosclerosis, which is an inflammatory disease of large and medium vessels that leads to atherosclerotic plaque formation. The key factors contributing to the onset and progression of atherosclerosis include the pro-inflammatory cytokines interferon (IFN)α and IFNγ and the pattern recognition receptor (PRR) Toll-like receptor 4 (TLR4). Together, they trigger the activation of IFN regulatory factors (IRFs) and signal transducer and activator of transcription (STAT)s.
Wellcome Open Res
December 2024
Human Genetics, Wellcome Sanger Institute, Hinxton, England, CB10 1SA, UK.
Birth cohort studies involve repeated surveys of large numbers of individuals from birth and throughout their lives. They collect information useful for a wide range of life course research domains, and biological samples which can be used to derive data from an increasing collection of omic technologies. This rich source of longitudinal data, when combined with genomic data, offers the scientific community valuable insights ranging from population genetics to applications across the social sciences.
World J Gastroenterol
January 2025
Department of Oncology Surgery, Cell Therapy and Organ Transplantation, Institute of Biomedicine of Seville, Virgen del Rocio University Hospital, Seville 41013, Spain.
Background: Hepatocellular carcinoma (HCC) is the most common subtype of primary liver cancer with varied incidence and epidemiology worldwide. Sorafenib is still a recommended treatment for a large proportion of patients with advanced HCC. Different patterns of treatment responsiveness have been identified in differentiated hepatoblastoma HepG2 cells and metastatic HCC SNU449 cells.
Virus Evol
January 2025
MRC-University of Glasgow Centre for Virus Research, The University of Glasgow, Glasgow G61 1QH, United Kingdom.
Anelloviruses are a group of small, circular, single-stranded DNA viruses that are found ubiquitously across mammalian hosts. Here, we explored a large number of publicly available human microbiome datasets and retrieved a total of 829 anellovirus genomes, substantially expanding the known diversity of these viruses. The majority of new genomes fall within the three major human anellovirus genera: , and , while we also present new genomes of the under-sampled , and genera.
Front Med (Lausanne)
January 2025
Department of Thoracic Surgery, State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou, China.
Background: The lysosome plays a crucial role in tumor development and is a major participant in the cell death process, involving aberrant functional and structural changes. However, there are few studies on lysosome-associated genes (LAGs) in lung adenocarcinoma (LUAD).
Methods: Bulk RNA-seq of LUAD was downloaded from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO).