When seeking to release public use files for confidential data, statistical agencies can generate fully synthetic data. We propose an approach for making fully synthetic data from surveys collected with complex sampling designs. Our approach adheres to the general strategy proposed by Rubin (1993). Specifically, we generate pseudo-populations by applying the weighted finite population Bayesian bootstrap to account for survey weights, take simple random samples from those pseudo-populations, estimate synthesis models using these simple random samples, and release simulated data drawn from the models as public use files. To facilitate variance estimation, we use the framework of multiple imputation with two data generation strategies. In the first, we generate multiple data sets from each simple random sample. In the second, we generate a single synthetic data set from each simple random sample. We present multiple imputation combining rules for each setting. We illustrate the repeated sampling properties of the combining rules via simulation studies, including comparisons with synthetic data generation based on pseudo-likelihood methods. We apply the proposed methods to a subset of data from the American Community Survey.
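The pipeline described in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a single numeric survey variable, weights that are at least 1 and sum to the population size, and a toy normal synthesis model with plug-in estimates (a full Bayesian synthesizer would also draw the parameters from their posterior). The Pólya-urn draw follows the usual description of the weighted finite population Bayesian bootstrap. The function names `wfpbb_pseudo_population` and `synthesize` and the parameters `M` and `n_syn` are hypothetical.

```python
import random
import statistics


def wfpbb_pseudo_population(values, weights, rng):
    """Expand a weighted sample into one pseudo-population of size
    N = round(sum(weights)) using a weighted Polya-urn draw.
    Assumes every weight is >= 1."""
    n = len(values)
    N = round(sum(weights))
    growth = (N - n) / n                 # mass added to a unit each time it is drawn
    counts = [0] * n                     # how often each unit has been drawn so far
    urn = [w - 1.0 for w in weights]     # initial urn mass of unit i is w_i - 1
    for _ in range(N - n):
        i = rng.choices(range(n), weights=urn, k=1)[0]
        counts[i] += 1
        urn[i] += growth
    # Pseudo-population: the original sample plus the N - n urn draws.
    population = []
    for v, c in zip(values, counts):
        population.extend([v] * (1 + c))
    return population


def synthesize(values, weights, M=5, n_syn=None, seed=0):
    """Second strategy from the abstract: one synthetic data set per
    simple random sample, repeated M times."""
    rng = random.Random(seed)
    n_syn = n_syn or len(values)
    synthetic_sets = []
    for _ in range(M):
        pop = wfpbb_pseudo_population(values, weights, rng)
        srs = rng.sample(pop, len(values))   # simple random sample without replacement
        mu = statistics.mean(srs)            # toy synthesis model: normal distribution
        sigma = statistics.stdev(srs)        # with plug-in mean and standard deviation
        synthetic_sets.append([rng.gauss(mu, sigma) for _ in range(n_syn)])
    return synthetic_sets
```

Calling `synthesize(values, weights, M=5)` returns five synthetic data sets. The first strategy in the abstract would instead draw several synthetic sets from each simple random sample. The combining rules for variance estimation are not reproduced here because the abstract does not state them.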
Source: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11759325 (PMC)
Rapid Commun Mass Spectrom
May 2025
Department of Chemistry, The University of North Texas, Denton, Texas, USA.
Rationale: Fentanyl and fentanyl analogs continue to pose a serious threat to public health. The vast number of fentanyl analogs emerging on the black market calls for optimized analytical methods for the detection, analysis, and characterization of these extremely dangerous drugs.
Methods: Atmospheric pressure solids analysis probe (ASAP) mass spectrometry was used for the rapid analysis of 250 synthetic opioid standards, including 211 fentanyl analogs, 32 non-fentanyl related opioids, and 8 fentanyl precursors.
Plant Foods Hum Nutr
January 2025
Facultad de Ciencias Químicas, Dirección de Investigaciones, Universidad Nacional de Asunción, P.O. 1055, San Lorenzo, Paraguay.
Concerns over malnutrition, synthetic additives and post-harvest waste highlight the need for innovation in food technology, turning towards underutilized crops. Plant-based beverages offer sustainable dietary alternatives and the increasing demand for such products makes the exploration of native crops particularly relevant. This study focuses on the development of a beverage derived from the native South American fruit kurugua (Sicana odorifera), combined with chia oil (Salvia hispanica L.
mBio
January 2025
Institute for General Microbiology, Christian-Albrechts-Universität zu Kiel, Kiel, Germany.
Structural maintenance of chromosomes (SMC) are ubiquitously distributed proteins involved in chromosome organization. Deletion of causes severe growth phenotypes in many organisms. Surprisingly, can be deleted in , a member of the phylum, without any apparent growth phenotype.
NPJ Digit Med
January 2025
Biomedical Data Science Center, Centre Hospitalier Universitaire Vaudois, Lausanne, Switzerland.
The use of synthetic data is a promising solution to facilitate the sharing and reuse of health-related data beyond its initial collection while addressing privacy concerns. However, there is still no consensus on a standardized approach for systematically evaluating the privacy and utility of synthetic data, impeding its broader adoption. In this work, we present a comprehensive review and systematization of current methods for evaluating synthetic health-related data, focusing on both privacy and utility aspects.
Nat Commun
January 2025
Laboratory for Information and Decision Systems, Massachusetts Institute of Technology, Cambridge, MA, USA.
Recent barcoding technologies allow reconstructing lineage trees while capturing paired single-cell RNA-sequencing (scRNA-seq) data. Such datasets provide opportunities to compare gene expression memory maintenance through lineage branching and pinpoint critical genes in these processes. Here we develop Permutation, Optimization, and Representation learning based single Cell gene Expression and Lineage ANalysis (PORCELAN) to identify lineage-informative genes or subtrees where lineage and expression are tightly coupled.