Synthetic data generation with probabilistic Bayesian Networks.

Math Biosci Eng

Department of Computational and Quantitative Medicine, Beckman Research Institute, and Diabetes and Metabolism Research Institute, City of Hope National Medical Center, 1500 East Duarte Road, Duarte, CA 91010 USA.

Published: October 2021

AI Article Synopsis

  • Bayesian Network (BN) modeling is an increasingly popular method in computational systems biology, used to construct network graphs that reflect the biological relationships underlying large heterogeneous datasets.
  • Strategies for assessing BN performance include artificial benchmark datasets, specialized biological benchmark datasets, and simulation studies that generate synthetic data from predefined network models. The last is arguably the most comprehensive, but existing implementations often rest on assumptions that may be unrealistic for typical biological data.
  • This study develops a purely probabilistic simulation framework for unbiased, statistically sound simulation studies, and expands on the theoretical notions of causality, dependence / conditional independence, and Markov Blankets in BNs.
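
As a brief illustration of the Markov Blanket concept mentioned above: in a directed acyclic graph, the Markov blanket of a node is conventionally defined as its parents, its children, and the other parents of its children. The sketch below computes it for a small toy graph. The function name `markov_blanket`, the `toy_parents` structure, and the node names are hypothetical choices made for illustration; they are not taken from the article.

```python
from typing import Dict, List, Set


def markov_blanket(parents: Dict[str, List[str]], node: str) -> Set[str]:
    """Return the parents, children, and co-parents of `node` in a DAG.

    `parents` maps every node to the list of its parent nodes.
    """
    # Children are the nodes that list `node` among their parents.
    children = {child for child, pars in parents.items() if node in pars}
    blanket = set(parents[node]) | children
    # Co-parents ("spouses") are the other parents of each child.
    for child in children:
        blanket |= set(parents[child])
    blanket.discard(node)
    return blanket


# Hypothetical toy DAG: A -> C <- B, and C -> D.
toy_parents = {"A": [], "B": [], "C": ["A", "B"], "D": ["C"]}
```

Under this definition, the blanket of `A` in the toy graph contains `C` (its child) and `B` (the co-parent of `C`), while `D` lies outside it.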

Article Abstract

Bayesian Network (BN) modeling is a prominent and increasingly popular computational systems biology method. It aims to construct network graphs from large heterogeneous biological datasets that reflect the underlying biological relationships. Currently, a variety of strategies exist for evaluating BN methodology performance, ranging from artificial benchmark datasets and models, to specialized biological benchmark datasets, to simulation studies that generate synthetic data from predefined network models. The last is arguably the most comprehensive approach; however, existing implementations often rely on explicit and implicit assumptions that may be unrealistic in a typical biological data analysis scenario, or are poorly equipped for automated arbitrary model generation. In this study, we develop a purely probabilistic simulation framework that addresses the demands of statistically sound simulation studies in an unbiased fashion. Additionally, we expand on our current understanding of the theoretical notions of causality and dependence / conditional independence in BNs and the Markov Blankets within.
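
To make the idea of generating synthetic data from a predefined network model concrete, here is a minimal sketch of standard forward (ancestral) sampling from a small discrete BN. It is not the framework developed in the article, whose details are not given in this abstract. The network `TOY_NETWORK`, its binary variables, its conditional probability values, and the function names are all assumptions made up for illustration.

```python
import random
from typing import Dict, List, Tuple

# Each node maps to (parents, CPT). The CPT maps a tuple of parent values
# to P(node = 1). All variables are binary. Values are hypothetical.
Network = Dict[str, Tuple[List[str], Dict[Tuple[int, ...], float]]]

TOY_NETWORK: Network = {
    "A": ([], {(): 0.3}),
    "B": ([], {(): 0.6}),
    "C": (["A", "B"], {(0, 0): 0.05, (0, 1): 0.4, (1, 0): 0.5, (1, 1): 0.9}),
    "D": (["C"], {(0,): 0.1, (1,): 0.8}),
}


def topological_order(network: Network) -> List[str]:
    """Order nodes so that every parent precedes its children."""
    order: List[str] = []
    placed = set()
    while len(order) < len(network):
        progressed = False
        for node, (parents, _) in network.items():
            if node not in placed and all(p in placed for p in parents):
                order.append(node)
                placed.add(node)
                progressed = True
        if not progressed:
            raise ValueError("network has a cycle or an undefined parent")
    return order


def sample_records(network: Network, n: int, seed: int = 0) -> List[Dict[str, int]]:
    """Draw n synthetic records by sampling each node given its parents."""
    rng = random.Random(seed)  # seeded for reproducibility
    order = topological_order(network)
    records = []
    for _ in range(n):
        row: Dict[str, int] = {}
        for node in order:
            parents, cpt = network[node]
            # Look up P(node = 1) for the already-sampled parent values.
            p_one = cpt[tuple(row[p] for p in parents)]
            row[node] = 1 if rng.random() < p_one else 0
        records.append(row)
    return records
```

Calling `sample_records(TOY_NETWORK, 1000)` would return a list of 1000 dictionaries, one per synthetic record. A structure-learning method can then be scored by how well it recovers the known edges of the generating network, which is the general logic of the simulation studies described above.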

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8848551
DOI: http://dx.doi.org/10.3934/mbe.2021426
