Genomic data integration tutorial, a plant case study.

Emile Mardoc, Mamadou Dia Sow, Sébastien Déjean, Jérôme Salse

BMC Genomics

UCA-INRAE UMR 1095 Genetics, Diversity and Ecophysiology of Cereals (GDEC), 5 Chemin de Beaulieu, 63000, Clermont-Ferrand, France.

Published: January 2024

Background: The ongoing evolution of Next Generation Sequencing (NGS) technologies has led to the production of genomic data on a massive scale. While tools for genomic data integration and analysis are becoming increasingly available, the conceptual and analytical complexities remain a major challenge in many biological contexts.

Results: To address this issue, we describe a six-step tutorial on best practices in genomic data integration, consisting of (1) designing a data matrix; (2) formulating a specific biological question oriented toward data description, selection or prediction; (3) selecting a tool adapted to the targeted question; (4) preprocessing the data; (5) conducting a preliminary analysis; and finally (6) executing the genomic data integration.
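The six steps can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the feature names, the toy values, the 0.9 threshold and the helper functions `preprocess`, `pearson` and `integrate` are all hypothetical, and a simple cross-block Pearson correlation stands in for a real integration tool. The assumed question (step 2) is descriptive: which expression features co-vary with methylation features across samples?

```python
from statistics import mean, pstdev

# Step 1: data matrix. One block per data type, features measured on the
# same five samples. All values are invented toy numbers, not poplar data.
blocks = {
    "expression": {
        "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
        "geneB": [5.0, 3.0, 4.0, 1.0, 2.0],
        "geneC": [2.0, 2.0, 2.0, 2.0, 2.0],   # constant, carries no signal
    },
    "methylation": {
        "siteX": [0.9, 0.7, 0.6, 0.3, 0.1],
        "siteY": [0.2, None, 0.4, 0.5, 0.3],  # contains a missing value
    },
}

# Steps 2-3 (assumed): a descriptive question, answered here with plain
# pairwise correlation rather than a dedicated multi-block method.

def preprocess(block):
    """Step 4: drop features with missing values or zero variance,
    then centre and scale the remaining ones."""
    cleaned = {}
    for name, values in block.items():
        if any(v is None for v in values):
            continue
        sd = pstdev(values)
        if sd == 0:
            continue
        mu = mean(values)
        cleaned[name] = [(v - mu) / sd for v in values]
    return cleaned

def pearson(x, y):
    """Pearson correlation of two vectors already standardized with the
    population standard deviation: the mean of their products."""
    return sum(a * b for a, b in zip(x, y)) / len(x)

def integrate(block_a, block_b, threshold=0.9):
    """Step 6: keep cross-block feature pairs with |r| >= threshold."""
    pairs = []
    for fa, xa in block_a.items():
        for fb, xb in block_b.items():
            r = pearson(xa, xb)
            if abs(r) >= threshold:
                pairs.append((fa, fb, round(r, 3)))
    return pairs

clean = {name: preprocess(block) for name, block in blocks.items()}

# Step 5: preliminary analysis, here only a check of what survived filtering.
retained = {name: sorted(features) for name, features in clean.items()}
print(retained)

selected = integrate(clean["expression"], clean["methylation"])
print(selected)
```

In this toy setting, filtering removes the constant and incomplete features before any correlation is computed, which is why preprocessing precedes the preliminary analysis and the integration step.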

Conclusion: The tutorial has been tested and demonstrated on publicly available genomic data generated from poplar (Populus L.), a woody plant model. We also developed a new graphical output for the unsupervised multi-block analysis, cimDiablo_v2, available at https://forgemia.inra.fr/umr-gdec/omics-integration-on-poplar, which allows the selection of master drivers in genomic data variation and interplay.

Full text (PMC): http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10792847
DOI: http://dx.doi.org/10.1186/s12864-023-09833-0
