A sampling approach to debiasing the offline evaluation of recommender systems.

J Intell Inf Syst

Insight Centre for Data Analytics, School of Computer Science and Information Technology, University College Cork, Cork, Ireland.

Published: July 2021

Offline evaluation of recommender systems (RSs) mostly relies on historical data, which is often biased. The bias is a result of many confounders that affect the data collection process. In such biased data, user-item interactions are Missing Not At Random (MNAR). Measures of recommender system performance on MNAR test data are unlikely to be reliable indicators of real-world performance unless something is done to mitigate the bias. One widespread way that researchers try to obtain less biased offline evaluation is by designing new, supposedly unbiased performance metrics for use on MNAR test data. We investigate an alternative solution, a sampling approach. The general idea is to use a sampling strategy on MNAR data to generate a test set with less bias - one in which interactions are Missing At Random (MAR) or, at least, one that is more MAR-like. An existing example of this approach is SKEW, a sampling strategy that aims to adjust for the confounding effect that an item's popularity has on its likelihood of being observed. In this paper, after extensively surveying the literature on the bias problem in the offline evaluation of RSs, we propose and formulate a novel sampling approach, which we call WTD; we also propose a more practical variant, which we call WTD_H. We compare our methods to SKEW and to two baselines which perform a random intervention on MNAR data. We empirically validate for the first time the effectiveness of SKEW and we show our approach to be a better estimator of the performance that one would obtain on (unbiased) MAR test data. Our strategy benefits from high generality (e.g. it can also be employed for training a recommender) and low overheads (e.g. it does not require any learning).
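
As a rough illustration of the popularity-adjusting idea behind strategies such as SKEW, the sketch below draws a test set from MNAR interactions with weights inversely proportional to item popularity. It is a minimal sketch under stated assumptions, not the formulation used in the paper and not WTD or WTD_H: the function name `popularity_inverse_sample`, the `(user, item)` tuple format, the `1 / popularity` weighting, and the toy data are all hypothetical choices made for this example.

```python
import random
from collections import Counter


def popularity_inverse_sample(interactions, sample_size, seed=0):
    """Draw `sample_size` distinct (user, item) pairs, weighting each pair
    by 1 / popularity(item).

    Assumption: `interactions` is a list of unique (user, item) tuples and
    popularity is the number of observed interactions for the item.
    """
    rng = random.Random(seed)
    popularity = Counter(item for _, item in interactions)

    # Weighted sampling without replacement (Efraimidis-Spirakis):
    # give each pair the key u ** (1 / weight) and keep the largest keys.
    # With weight = 1 / popularity, the exponent is simply the popularity.
    keyed = []
    for pair in interactions:
        key = rng.random() ** popularity[pair[1]]
        keyed.append((key, pair))

    keyed.sort(key=lambda entry: entry[0], reverse=True)
    return [pair for _, pair in keyed[:sample_size]]


if __name__ == "__main__":
    # Toy MNAR log: item "a" is popular (8 interactions), item "b" is not (2).
    log = [(f"u{i}", "a") for i in range(8)] + [("u0", "b"), ("u1", "b")]
    test_set = popularity_inverse_sample(log, sample_size=4, seed=1)
    print(Counter(item for _, item in test_set))
```

With this weighting, the total weight of each item's interactions is the same, so on any single draw every item is equally likely to be the one selected regardless of how often it appears in the log. Like the approach described above, this sketch involves no learning step; it only counts and samples.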

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9001624
DOI: http://dx.doi.org/10.1007/s10844-021-00651-y
