Motivation: Whole-genome alignment of eukaryote species remains an important method for determining sequence and structural variation, and can also be used to ascertain the representative non-redundant core-genome sequence of a population. Many whole-genome alignment tools were first developed for prokaryote species, where such analysis is more mature, and few current tools can process the larger genomes of eukaryotes or the genomes of more divergent species. In addition, running these tools can become computationally prohibitive because of the significant compute resources needed to handle larger genomes.
Results: We present CoreDetector, an easy-to-use, general-purpose program that can align core-genome sequences across a range of genome sizes and divergence levels. To illustrate the flexibility of CoreDetector, we aligned a large set of closely related fungal pathogen and hexaploid wheat cultivar genomes, as well as genomes of more divergent fly and rodent species. In all tested cases, CoreDetector exhibited improved flexibility and efficiency, and competitive accuracy, compared with existing multiple genome alignment tools.
Availability and Implementation: CoreDetector was developed in Java, a cross-platform and easily deployable language. A packaged pipeline is readily executable in a bash terminal without requiring external Perl or Python environments. Installation, example data, and usage instructions for CoreDetector are freely available from https://github.com/mfruzan/CoreDetector.
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10663985
DOI: http://dx.doi.org/10.1093/bioinformatics/btad628