Transcriptomes are among the first sources of high-throughput genomic data to have benefitted from the introduction of next-generation sequencing. As sequencing technology becomes more accessible, transcriptome sequencing can be applied to many organisms for which genome sequences are unavailable. Currently, all methods for de novo assembly are based on matching the overlapping nucleotide context between short fragments (reads). However, even short reads may contain biologically relevant information that can be used as hints to guide the assembly process. We propose a computational workflow for the reconstruction and functional annotation of expressed gene transcripts that does not require a reference genome sequence and can tolerate low coverage, high error rates and other issues that often lead to poor de novo assembly results in studies of non-model organisms. We start with either raw sequences or the output of a context-based de novo transcriptome assembly. Instead of mapping reads to a reference genome or clustering reads in a completely unsupervised way, we assemble the unknown transcriptome using nearest homologs from a public database as seeds. We consider even distant relations, indirectly linking protein-coding fragments to entire gene families in multiple distantly related genomes. The intended application of the proposed method is an additional step of semantic scaffolding (based on relations between protein-coding fragments) following traditional de novo assembly (based on sequence overlap). The method was effective in the analysis of the jellyfish Cyanea capillata transcriptome and may be applicable to other studies of gene expression in species lacking a high-quality reference genome sequence. Our algorithms are implemented in C and designed for parallel computation on a high-performance computer. The software is available free of charge under an open source license.
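The idea of using nearest homologs as seeds can be illustrated with a minimal sketch. It is written in Python for brevity and is not the authors' C implementation. The `Hit` record, the `scaffold_by_homolog` function, the identifiers and the scores are all hypothetical. The sketch assumes that a similarity search against a protein database has already produced, for each fragment, one or more hits with a position on the matched protein. It also simplifies the method by keeping only the single best hit per fragment, whereas the abstract describes linking fragments to entire gene families.

```python
from collections import defaultdict
from typing import NamedTuple


class Hit(NamedTuple):
    """One hypothetical similarity hit of a fragment against a database protein."""
    fragment: str   # read or contig identifier
    homolog: str    # database protein used as a seed
    start: int      # first matched position on the homolog
    end: int        # last matched position on the homolog
    score: float    # similarity score, higher is better


def scaffold_by_homolog(hits):
    """Group fragments by their best-scoring homolog and order them along it."""
    # Keep only the nearest (best-scoring) homolog for each fragment.
    best = {}
    for hit in hits:
        current = best.get(hit.fragment)
        if current is None or hit.score > current.score:
            best[hit.fragment] = hit

    # Collect fragments under their seed homolog.
    groups = defaultdict(list)
    for hit in best.values():
        groups[hit.homolog].append(hit)

    # Order each group by position on the seed, so that fragments with no
    # sequence overlap still end up in a consistent relative order.
    return {
        homolog: [h.fragment for h in sorted(members, key=lambda h: (h.start, h.end))]
        for homolog, members in groups.items()
    }


# Illustrative input only: the identifiers and scores are made up.
hits = [
    Hit("contig_2", "protA", 120, 200, 88.0),
    Hit("contig_1", "protA", 1, 90, 75.0),
    Hit("contig_1", "protB", 10, 60, 40.0),
    Hit("contig_3", "protB", 5, 70, 61.0),
]
scaffolds = scaffold_by_homolog(hits)
print(scaffolds)
```

In this toy input, `contig_1` and `contig_2` share no sequence overlap but are placed in the same group because both match `protA`. That is the sense in which the scaffolding is based on relations between protein-coding fragments and not on read overlap.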
Download full-text PDF |
Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4578894 | PMC |
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0138006 | PLOS |