Background: The evolution of next-generation sequencing (NGS) technologies has led to increased focus on RNA-Seq. Many bioinformatic tools have been developed for RNA-Seq analysis, each with unique performance characteristics and configuration parameters. Users face an increasingly complex task in understanding which bioinformatic tools are best for their specific needs and how they should be configured. To help answer these questions, we investigate the performance of leading bioinformatic tools designed for RNA-Seq analysis and propose a methodology for systematic evaluation and comparison of performance to help users make well-informed choices.
Results: To evaluate RNA-Seq pipelines, we developed a suite of two benchmarking tools. SimCT generates simulated datasets that approximate specific real biological conditions as closely as possible, accompanied by the list of genomic incidents and mutations that have been inserted. BenchCT then compares the output of any bioinformatics pipeline that has been run against a SimCT dataset with the simulated genomic and transcriptional variations it contains, giving an accurate evaluation of the pipeline's performance in addressing a specific biological question. We used these tools to simulate a real-world genomic medicine question involving the comparison of healthy and cancerous cells. Results revealed that performance in addressing a particular biological context varied significantly depending on the choice of tools and settings used. We also found that by combining the output of certain pipelines, substantial performance improvements could be achieved.
Conclusion: Our research emphasizes the importance of selecting and configuring bioinformatic tools for the specific biological question being investigated to obtain optimal results. Pipeline designers, developers and users should include benchmarking in the context of their biological question as part of their design and quality control process. Our SimBA suite of benchmarking tools provides a reliable basis for comparing the performance of RNA-Seq bioinformatics pipelines in addressing a specific biological question. We would like to see the creation of a reference corpus of datasets that would allow accurate comparison between benchmarks performed by different groups, and the publication of more benchmarks based on this public corpus. The SimBA software and dataset are available at http://cractools.gforge.inria.fr/softwares/simba/ .
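The comparison step described in the Results (scoring a pipeline's calls against the list of simulated events, then checking whether combining pipelines helps) can be illustrated with a minimal sketch. This is not BenchCT's implementation: the abstract does not name the metrics it reports, so sensitivity and precision are assumed here, and the `evaluate` function, the variant tuples, and the two example pipelines are all hypothetical.

```python
from typing import Dict, Set, Tuple

# Hypothetical variant representation: (chromosome, position, ref, alt).
Variant = Tuple[str, int, str, str]


def evaluate(truth: Set[Variant], predicted: Set[Variant]) -> Dict[str, float]:
    """Score a set of predicted variants against the simulated truth set."""
    tp = len(truth & predicted)   # simulated events the pipeline found
    fp = len(predicted - truth)   # calls that were never simulated
    fn = len(truth - predicted)   # simulated events the pipeline missed
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "sensitivity": sensitivity, "precision": precision}


# Made-up data for illustration only: four simulated mutations and the
# calls of two imaginary pipelines.
truth = {
    ("chr1", 100, "A", "T"),
    ("chr1", 250, "G", "C"),
    ("chr2", 75, "C", "G"),
    ("chr3", 10, "T", "A"),
}
pipeline_a = {
    ("chr1", 100, "A", "T"),
    ("chr1", 250, "G", "C"),
    ("chr2", 75, "C", "G"),
    ("chr5", 5, "A", "G"),    # false positive
}
pipeline_b = {
    ("chr1", 100, "A", "T"),
    ("chr1", 250, "G", "C"),
    ("chr3", 10, "T", "A"),
    ("chr7", 9, "G", "T"),    # false positive
}

result_a = evaluate(truth, pipeline_a)
result_b = evaluate(truth, pipeline_b)

# Two simple ways to combine pipeline outputs:
# keep only calls both pipelines agree on, or keep every call from either.
result_intersection = evaluate(truth, pipeline_a & pipeline_b)
result_union = evaluate(truth, pipeline_a | pipeline_b)

for name, result in [("A", result_a), ("B", result_b),
                     ("A and B", result_intersection),
                     ("A or B", result_union)]:
    print(name, result)
```

In this toy data, intersecting the two call sets removes both false positives at the cost of missed events, while the union recovers every simulated event but keeps both false positives. Which trade-off is preferable depends on the biological question, which is the point the abstract makes about benchmarking in context.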
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5623974 | PMC |
http://dx.doi.org/10.1186/s12859-017-1831-5 | DOI Listing |
Heliyon
January 2025
Biotechnology Unit, ICAR-Central Research Institute for Jute and Allied Fibres, Barrackpore, Kolkata, West Bengal, 700121, India.
Recent advances in genome editing tools and CRISPR-Cas technologies have enabled plant genome engineering to reach new heights. The current regulatory exemptions for certain categories of genome-edited products, such as those derived from SDN-1 and SDN-2, which are free of any transgene, have significantly accelerated genome editing research in a number of agricultural crop plants in different countries. Although CRISPR-Cas technology is becoming increasingly popular, it is still important to carefully consider a number of factors before planning and conducting CRISPR-Cas studies.
Clin Genet
January 2025
Human Molecular Genetics Group, National Health Commission (NHC), Key Laboratory of Molecular Probes and Targeted Diagnosis and Therapy, Harbin Medical University, Harbin, China.
The pathogenicity of cholestatic liver diseases (CLDs) remains insufficiently characterized, hindering definitive diagnosis and timely treatment. The aim of this study was to improve the pathogenicity prediction of novel bile acid (BA) transporter variants in patients with CLDs. We analyzed the clinical characteristics and genetic profiles of a CLD cohort (n = 57) using multiple in silico tools and in vitro functional assays.
Environ Microbiol
January 2025
Department of Ecology and Evolutionary Biology, University of California, Los Angeles, California, USA.
Ecological assembly, the process of ecological community formation through species introductions, has recently seen exciting theoretical advancements across dynamical, informational, and probabilistic approaches. However, these theories often remain inaccessible to non-theoreticians, and they lack a unifying lens. Here, I introduce the assembly graph as an integrative tool to connect these emerging theories.
BMC Bioinformatics
January 2025
Technology Park of Sardinia, Bioecopest Srl, SP 55 Km 8.400, Tramariglio, Alghero, SS, Italy.
Background: The increasing availability of sequenced genomes has enabled comparative analyses of various organisms. Numerous tools and online platforms have been developed for this purpose, facilitating the identification of unique features within selected organisms. However, choosing the most appropriate tools can be unclear during the initial stages of analysis, often requiring multiple attempts to match the specific characteristics of the data.
Nat Rev Genet
January 2025
Altos Labs, Cambridge, UK.
Over the past decade, epigenetic clocks have emerged as powerful machine learning tools, not only to estimate chronological and biological age but also to assess the efficacy of anti-ageing, cellular rejuvenation and disease-preventive interventions. However, many computational and statistical challenges remain that limit our understanding, interpretation and application of epigenetic clocks. Here, we review these computational challenges, focusing on interpretation, cell-type heterogeneity and emerging single-cell methods, aiming to provide guidelines for the rigorous construction of interpretable epigenetic clocks at cell-type and single-cell resolution.