Objective: Big Data are increasingly used in obesity and nutrition research to gain new insights and derive personalized guidance; however, these data are often unusable in raw form. Substantial preprocessing, which draws on machine learning (ML), human judgment, and specialized software, is needed to transform Big Data into artificial intelligence (AI)- and ML-ready data. These preprocessing steps are the most complex part of the entire modeling pipeline.
Background: When a lifestyle intervention combines caloric restriction with increased physical activity energy expenditure (PAEE), two components of energy balance, energy intake (EI) and PAEE, are routinely misreported and expensive to measure. Energy balance models have successfully predicted EI when PAEE is known. Estimating EI from an energy balance model when PAEE is unknown remains an open question.
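The energy balance principle behind such models can be illustrated with a minimal sketch. It is not the model from the study: it assumes total energy expenditure (which includes PAEE) is already known, and it uses a constant energy density of weight change (about 7700 kcal/kg, a rough rule of thumb that is not taken from the abstract). The function name and parameters are hypothetical.

```python
def estimate_energy_intake(total_ee_kcal_per_day: float,
                           weight_change_kg: float,
                           days: int,
                           energy_density_kcal_per_kg: float = 7700.0) -> float:
    """Estimate mean daily energy intake (kcal/day) from energy balance.

    Energy balance: rate of change of stored energy = EI - EE, so
    EI = EE + (energy density * weight change) / days.

    ASSUMPTIONS (illustrative, not from the source):
      - total_ee_kcal_per_day is known and already includes PAEE;
      - energy_density_kcal_per_kg is constant (7700 kcal/kg is a
        rough rule-of-thumb value).
    """
    if days <= 0:
        raise ValueError("days must be positive")
    # Average daily change in stored energy; negative during weight loss.
    stored_kcal_per_day = energy_density_kcal_per_kg * weight_change_kg / days
    return total_ee_kcal_per_day + stored_kcal_per_day


# Example: 2 kg lost over 28 days with an expenditure of 2500 kcal/day.
ei = estimate_energy_intake(2500.0, -2.0, 28)
print(ei)  # 1950.0
```

When PAEE is unknown, `total_ee_kcal_per_day` is itself uncertain, so an error in expenditure passes directly into the EI estimate; that is the difficulty the abstract describes.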
In anticipation of the expanding appreciation for air quality models in health outcomes studies, we develop and evaluate a reduced-complexity model for pollution transport that intentionally sacrifices some of the sophistication of full-scale chemical transport models in order to support applicability to a wider range of health studies. Specifically, we introduce the HYSPLIT average dispersion model, HyADS, which combines the HYSPLIT trajectory dispersion model with modern advances in parallel computing to estimate ZIP code level exposure to emissions from individual coal-powered electricity generating units in the United States. Importantly, the method is not designed to reproduce ambient concentrations of any particular air pollutant; rather, the primary goal is to characterize each ZIP code's exposure to these coal power plants specifically.
Fetal trajectories characterizing growth rates have relied primarily on goodness of fit rather than mechanistic properties exhibited. Here, we use a validated fetal-placental allometric scaling law and a first principles differential equations model of placental volume growth to generate biologically meaningful fetal-placental growth curves. The growth curves form the foundation for understanding healthy versus at-risk fetal growth and for identifying the timing of key events.
This paper discusses the influence that decisions about data cleaning and violations of statistical assumptions can have on drawing valid conclusions from research studies. The datasets provided in this paper were collected as part of a National Science Foundation grant to design online games and associated labs for use in undergraduate and graduate statistics courses that can effectively illustrate issues not always addressed in traditional instruction. Students play the role of a researcher by selecting from a wide variety of independent variables to explain why some students complete games faster than others.