The emergence of high-throughput DNA sequencing methods provides unprecedented opportunities to further unravel bacterial biodiversity and its worldwide role from human health to ecosystem functioning. However, despite the abundance of sequencing studies, combining data from multiple individual studies to address macroecological questions of bacterial diversity remains methodically challenging and plagued with biases. Here, using a machine-learning approach that accounts for differences among studies and complex interactions among taxa, we merge 30 independent bacterial data sets comprising 1,998 soil samples from 21 countries.
View Article and Find Full Text PDF