Despite recent advances in algorithmic fairness, methodologies for achieving fairness with generalized linear models (GLMs) have yet to be explored in general, even though GLMs are widely used in practice. In this paper we introduce two fairness criteria for GLMs based on equalizing expected outcomes or log-likelihoods. We prove that for GLMs both criteria can be achieved via a convex penalty term based solely on the linear components of the GLM, thus permitting efficient optimization.
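To make the idea of a convex penalty on the linear components concrete, here is a minimal sketch. It assumes two groups coded 0 and 1 and uses the squared gap between the group-wise mean linear predictors as the penalty. This specific form is an illustrative assumption, not necessarily the penalty used in the paper, and the names `linear_predictor` and `group_gap_penalty` are hypothetical.

```python
def linear_predictor(x, beta):
    """Linear component x^T beta of a GLM for one observation."""
    return sum(xi * bi for xi, bi in zip(x, beta))


def group_gap_penalty(X, groups, beta):
    """Squared gap between the mean linear predictors of groups 0 and 1.

    Assumed illustrative penalty: the gap is linear in beta, so its
    square is convex in beta.
    """
    sums = {0: 0.0, 1: 0.0}
    counts = {0: 0, 1: 0}
    for x, g in zip(X, groups):
        sums[g] += linear_predictor(x, beta)
        counts[g] += 1
    gap = sums[0] / counts[0] - sums[1] / counts[1]
    return gap ** 2


# Toy data: an intercept column plus one feature, two observations per group.
X = [[1.0, 0.0], [1.0, 2.0], [1.0, 1.0], [1.0, 3.0]]
groups = [0, 0, 1, 1]
penalty = group_gap_penalty(X, groups, [0.5, 1.0])
```

In a full fit, a term like this would be added to the GLM negative log-likelihood with a tuning weight. Because both parts are convex in the coefficients, the combined objective stays convex.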
Proc Mach Learn Res
August 2021
The widespread availability of high-dimensional electronic healthcare record (EHR) datasets has led to significant interest in using such data to derive clinical insights and make risk predictions. More specifically, techniques from machine learning are being increasingly applied to the problem of dynamic survival analysis, where updated time-to-event risk predictions are learned as a function of the full covariate trajectory from EHR datasets. EHR data presents unique challenges in the context of dynamic survival analysis, involving a variety of decisions about data representation, modeling, interpretability, and clinically meaningful evaluation.
Proc Mach Learn Res
March 2021
Dynamic survival analysis is a variant of traditional survival analysis where time-to-event predictions are updated as new information arrives about an individual over time. In this paper we propose a new approach to dynamic survival analysis based on learning a global parametric distribution, followed by individualization via truncating and renormalizing that distribution at different locations over time. We combine this approach with a likelihood-based loss that includes predictions at every time step within an individual's history, rather than just including one term per individual.
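The truncate-and-renormalize step can be sketched as follows, assuming a Weibull distribution as the global parametric family and taking the truncation location to be a given time `s`. Both choices are assumptions for illustration: the abstract does not name the family or say how the locations are chosen. The loss sums one term per time step in an individual's history and, for simplicity, ignores censoring.

```python
import math


def weibull_log_survival(t, shape, scale):
    """log S(t) for a Weibull distribution: -(t / scale) ** shape."""
    return -((t / scale) ** shape)


def weibull_log_density(t, shape, scale):
    """log f(t) for a Weibull distribution (t > 0)."""
    return (math.log(shape / scale)
            + (shape - 1.0) * math.log(t / scale)
            - (t / scale) ** shape)


def conditional_survival(t, s, shape, scale):
    """S(t | T > s): the global distribution truncated at s and renormalized."""
    if t <= s:
        return 1.0
    return math.exp(weibull_log_survival(t, shape, scale)
                    - weibull_log_survival(s, shape, scale))


def per_step_nll(event_time, step_times, shape, scale):
    """Negative log-likelihood with one term per time step in the history.

    Each term is -log f(event_time | T > s) = -(log f(event_time) - log S(s)).
    Assumes the event is observed (no censoring) and every s < event_time.
    """
    log_f = weibull_log_density(event_time, shape, scale)
    return -sum(log_f - weibull_log_survival(s, shape, scale)
                for s in step_times)
```

With `shape=1.0` the Weibull reduces to an exponential distribution, so `conditional_survival(10.0, 5.0, 1.0, 10.0)` equals `math.exp(-0.5)`, the memoryless result. This gives a quick sanity check on the renormalization.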
In data collection for predictive modeling, underrepresentation of certain groups, based on gender, race/ethnicity, or age, may yield less accurate predictions for these groups. Recently, this issue of fairness in predictions has attracted significant attention, as data-driven models are increasingly utilized to perform crucial decision-making tasks. Existing methods to achieve fairness in the machine learning literature typically build a single prediction model in a manner that encourages fair prediction performance for all groups.
In data collection for predictive modeling, under-representation of certain groups, based on gender, race/ethnicity, or age, may yield less-accurate predictions for these groups. Recently, this issue of fairness in predictions has attracted significant attention, as data-driven models are increasingly utilized to perform crucial decision-making tasks. Existing methods to achieve fairness in the machine learning literature typically build a single prediction model in a manner that encourages fair prediction performance for all groups.
High-throughput single-cell cytometry technologies have significantly improved our understanding of cellular phenotypes to support translational research and the clinical diagnosis of hematological and immunological diseases. However, subjective and ad hoc manual gating analysis does not adequately handle the increasing volume and heterogeneity of cytometry data for optimal diagnosis. Prior work has shown that machine learning can be applied to classify cytometry samples effectively.
Individuals create and consume more diverse data about themselves today than at any time in history. Sources of this data include wearable devices, images, social media, geospatial information, and more. A tremendous opportunity rests within cross-modal data analysis that leverages existing domain knowledge methods to understand and guide human health.