Motivation: The confusion of acute inflammation infected by virus and bacteria or noninfectious inflammation will lead to missing the best therapy occasion resulting in poor prognoses. The diagnostic model based on host gene expression has been widely used to diagnose acute infections, but the clinical usage was hindered by the capability across different samples and cohorts due to the small sample size for signature training and discovery.
Results: Here, we construct a large-scale dataset integrating multiple host transcriptomic data and analyze it using a sophisticated strategy which removes batch effect and extracts the common information from different cohorts based on the relative expression alteration of gene pairs.