Synthetic data for design and evaluation of binary classifiers in the context of Bayesian transfer learning.

Omar Maddouri, Xiaoning Qian, Francis J Alexander, Edward R Dougherty, Byung-Jun Yoon

Data Brief

Department of Electrical and Computer Engineering, Texas A&M University, College Station TX 77843, USA.

Published: June 2022

Transfer learning (TL) techniques can enable effective learning in data-scarce domains by allowing one to repurpose data or scientific knowledge available in relevant source domains for predictive tasks in a target domain of interest. In this Data in Brief article, we present a synthetic dataset for binary classification in the context of Bayesian transfer learning, which can be used for the design and evaluation of TL-based classifiers. For this purpose, we consider numerous combinations of classification settings, based on which we simulate a diverse set of feature-label distributions with varying learning complexity. For each set of model parameters, we provide a pair of target and source datasets that have been jointly sampled from the underlying feature-label distributions in the target and source domains, respectively. In both the target and source domains, the data in each class are normally distributed, and the distributions across domains are related to each other through a joint prior. To keep the classification complexity consistent across the provided datasets, we have controlled the Bayes error so that it stays within a range of predefined values that mimic realistic classification scenarios across different relatedness levels. The provided datasets may serve as useful resources for designing and benchmarking transfer learning schemes for binary classification as well as for the estimation of classification error.
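To make the setup concrete, the following Python sketch samples one target/source pair with Gaussian class-conditional distributions and reports the Bayes error in each domain. It is only an illustration under strong assumptions and is not the authors' generative model: the joint prior used in the article is not described here, so a hypothetical correlation-style parameter `rho` stands in for domain relatedness, both classes share an identity covariance, and the class priors are equal. Under those assumptions the Bayes error has the closed form Φ(-Δ/2), where Δ is the Mahalanobis distance between the class means. All function and parameter names (`sample_pair`, `rho`, `sep`) are made up for this example.

```python
import math
import numpy as np


def bayes_error_equal_cov(mu0, mu1, cov):
    """Bayes error for two equiprobable Gaussian classes with a shared covariance."""
    diff = np.asarray(mu1, dtype=float) - np.asarray(mu0, dtype=float)
    # Mahalanobis distance between the two class means.
    delta = math.sqrt(diff @ np.linalg.solve(cov, diff))
    # Standard normal CDF evaluated at -delta / 2.
    return 0.5 * (1.0 + math.erf(-delta / (2.0 * math.sqrt(2.0))))


def sample_domain(rng, mu0, mu1, cov, n_per_class):
    """Draw n_per_class points from each class and return features and labels."""
    x0 = rng.multivariate_normal(mu0, cov, size=n_per_class)
    x1 = rng.multivariate_normal(mu1, cov, size=n_per_class)
    return np.vstack([x0, x1]), np.repeat([0, 1], n_per_class)


def sample_pair(seed=0, dim=2, n_target=10, n_source=200, rho=0.8, sep=2.0):
    """Sample a (target, source) dataset pair.

    ASSUMPTION: `rho` in [0, 1] is a stand-in for relatedness; the source
    class means are noisy copies of the target class means. This is not
    the joint prior used in the article.
    """
    rng = np.random.default_rng(seed)
    cov = np.eye(dim)  # shared identity covariance (assumption)

    # Target class means, placed so their Mahalanobis distance equals `sep`.
    mu_t0 = np.zeros(dim)
    mu_t1 = np.full(dim, sep / math.sqrt(dim))

    # Source class means: the higher rho is, the closer they stay to the target.
    noise = rng.standard_normal((2, dim))
    scale = math.sqrt(1.0 - rho ** 2)
    mu_s0 = rho * mu_t0 + scale * noise[0]
    mu_s1 = rho * mu_t1 + scale * noise[1]

    target = sample_domain(rng, mu_t0, mu_t1, cov, n_target)
    source = sample_domain(rng, mu_s0, mu_s1, cov, n_source)
    errors = {
        "target": bayes_error_equal_cov(mu_t0, mu_t1, cov),
        "source": bayes_error_equal_cov(mu_s0, mu_s1, cov),
    }
    return target, source, errors


if __name__ == "__main__":
    (Xt, yt), (Xs, ys), errors = sample_pair()
    print("target shape:", Xt.shape, "source shape:", Xs.shape)
    print("Bayes errors:", errors)
```

In a sketch like this, one way to hold the Bayes error within a predefined range would be to resample or rescale the class means until the value returned by `bayes_error_equal_cov` falls inside the desired interval. How the article itself enforces that range is not stated in this abstract.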

Full text (PMC): http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9011006
DOI: http://dx.doi.org/10.1016/j.dib.2022.108113
