The widespread use of devices like mobile phones and wearables allows for automatic monitoring of human daily activities, generating vast datasets that offer insights into long-term human behavior. A structured and controlled data collection process is essential to unlock the full potential of this information. While wearable sensors for physical activity monitoring have gained significant traction in healthcare, sports science, and fitness applications, securing diverse and comprehensive datasets for research and algorithm development remains a notable challenge. In this proof-of-concept study, we underscore the significance of semantic representation in enhancing data interoperability and facilitating advanced analytics for physical activity sensor observations. Our approach focuses on enhancing the usability of physical activity datasets by employing a medical-grade (CE-certified) sensor to generate synthetic datasets. Additionally, we provide insights into ethical considerations related to synthetic datasets. The study conducts a comparative analysis between real and synthetic activity datasets, assessing their effectiveness in mitigating model bias and promoting fairness in predictive analysis. We have created an ontology for semantically representing observations from physical activity sensors and conducted predictive analysis on data collected using MOX2-5 activity sensors. Until now, there has been a lack of publicly available physical activity datasets collected with the MOX2-5, a medical-grade (CE-certified) activity monitoring device. The MOX2-5 captures and transmits high-resolution data, including activity intensity, weight-bearing, sedentary, standing, low, moderate, and vigorous physical activity, as well as steps per minute. Our dataset consists of physical activity data collected from 16 adults (Male: 12; Female: 4) over a period of 30 to 45 days (up to approximately 1.5 months), yielding a relatively small volume of 539 records.
To address this limitation, we employ various synthetic data generation methods, such as Gaussian Copula (GC), Conditional Tabular Generative Adversarial Network (CTGAN), and Tabular Generative Adversarial Network (TABGAN), to augment the dataset with synthetic data. For both the authentic and synthetic datasets, we have developed a Multilayer Perceptron (MLP) classification model for accurately classifying daily physical activity levels. The findings underscore the effectiveness of semantic ontology in semantic search, knowledge representation, data integration, reasoning, and capturing meaningful relationships between data. The analysis supports the hypothesis that the efficiency of predictive models improves as the volume of additional synthetic training data increases. Ontology and Generative AI hold the potential to expedite advancements in behavioral monitoring research. The data presented, encompassing both real MOX2-5 data and its synthetic counterpart, serves as a valuable resource for developing robust methods in activity type classification. Furthermore, it opens avenues for exploration into research directions related to synthetic data, including model efficiency, detection of generated data, and considerations regarding data privacy.
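To make the Gaussian Copula idea mentioned above more concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes empirical (rank-based) marginals rather than fitted parametric ones, and the column meanings and values in `real_rows` (daily steps, sedentary minutes) are hypothetical illustrations, not records from the MOX2-5 dataset. The function names are likewise invented for this example.

```python
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for the copula transform


def _to_normal_scores(column):
    """Map each value to a standard-normal score based on its rank."""
    n = len(column)
    order = sorted(range(n), key=lambda i: column[i])
    scores = [0.0] * n
    for rank, i in enumerate(order):
        scores[i] = _STD_NORMAL.inv_cdf((rank + 0.5) / n)
    return scores


def _cholesky(m):
    """Lower-triangular Cholesky factor of a small symmetric matrix."""
    d = len(m)
    low = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                # Clamp to keep the factor defined for near-singular input.
                low[i][j] = max(m[i][i] - s, 1e-12) ** 0.5
            else:
                low[i][j] = (m[i][j] - s) / low[j][j]
    return low


def sample_gaussian_copula(rows, n_samples, seed=0):
    """Draw synthetic rows that keep the rank correlation of `rows`."""
    rng = random.Random(seed)
    columns = [list(col) for col in zip(*rows)]
    d, n = len(columns), len(rows)

    # 1. Transform every column to normal scores and estimate correlation.
    scores = [_to_normal_scores(c) for c in columns]
    norm = sum(s * s for s in scores[0])
    corr = [[sum(a * b for a, b in zip(scores[i], scores[j])) / norm
             for j in range(d)] for i in range(d)]
    low = _cholesky(corr)

    # 2. Sample correlated normals, then map back through each
    #    column's empirical quantiles (assumption: empirical marginals).
    sorted_cols = [sorted(c) for c in columns]
    synthetic = []
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        correlated = [sum(low[i][k] * z[k] for k in range(i + 1))
                      for i in range(d)]
        row = []
        for i, value in enumerate(correlated):
            u = _STD_NORMAL.cdf(value)
            row.append(sorted_cols[i][min(int(u * n), n - 1)])
        synthetic.append(row)
    return synthetic


# Hypothetical toy records: (steps per day, sedentary minutes per day).
real_rows = [(4200, 610), (8100, 480), (10500, 400),
             (3000, 700), (6500, 530), (12000, 350)]
synthetic_rows = sample_gaussian_copula(real_rows, n_samples=50, seed=42)
```

Because this sketch maps samples back through empirical quantiles, every synthetic value is one that was observed in the corresponding real column; only the combinations across columns are new. Library implementations typically fit smooth marginal distributions instead, which also allows unseen values.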

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10897381
DOI: http://dx.doi.org/10.1038/s41598-024-55183-6
