Background: The development of sequencing techniques and statistical methods provides great opportunities for identifying the impact of rare genetic variation on complex traits. However, little is known about how sample size, the number of cases, and the balance of cases versus controls affect burden and dispersion-based rare variant association methods. For example, Phenome-Wide Association Studies may have a wide range of case and control sample sizes across hundreds of diagnoses and traits, and when statistical methods are applied to rare variants, it is important to understand the strengths and limitations of the analyses.
Results: We conducted a large-scale simulation of randomly selected low-frequency protein-coding regions using twelve different balanced samples with an equal number of cases and controls as well as twenty-one unbalanced sample scenarios. We further explored statistical performance across different minor allele frequency thresholds and a range of genetic effect sizes. Our simulation results demonstrate that an unbalanced study design has an overall higher type I error rate than a balanced study design for both burden and dispersion tests. Regression has an overall higher type I error with balanced cases and controls, while SKAT has higher type I error for unbalanced case-control scenarios. We also found that, under large control group scenarios, both type I error and power were driven by the number of cases in addition to the case-to-control ratio. Based on our power simulations, we observed that a SKAT analysis with more than 200 cases for unbalanced case-control models yielded over 90% power with relatively well-controlled type I error. To achieve similar power in regression, over 500 cases are needed. Moreover, SKAT showed higher power than regression to detect associations in unbalanced case-control scenarios.
Conclusions: Our results provide important insights into rare variant association study designs by providing a landscape of type I error and statistical power for a wide range of sample sizes. These results can serve as a benchmark for making decisions about study design for rare variant analyses.
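The idea of an empirical type I error rate, as used in the Results, can be illustrated with a minimal sketch. It is not the study's pipeline: it replaces regression and SKAT with a simple collapsing test on carrier status (a pooled two-proportion z-test), and the function names, the carrier probability of 0.02, and the replicate count are all assumptions chosen for illustration. Phenotypes are generated independently of genotype, so every rejection is a false positive.

```python
import math
import random


def carrier_count(n, carrier_prob, rng):
    """Number of individuals (out of n) carrying at least one rare allele."""
    return sum(1 for _ in range(n) if rng.random() < carrier_prob)


def burden_z_pvalue(case_carriers, n_cases, ctrl_carriers, n_ctrls):
    """Two-sided p-value from a pooled two-proportion z-test on carrier status.

    This is a simplified stand-in for a burden-style test, not the
    regression or SKAT procedures evaluated in the abstract.
    """
    pooled = (case_carriers + ctrl_carriers) / (n_cases + n_ctrls)
    var = pooled * (1 - pooled) * (1 / n_cases + 1 / n_ctrls)
    if var == 0:
        return 1.0  # nobody (or everybody) is a carrier: no evidence
    z = (case_carriers / n_cases - ctrl_carriers / n_ctrls) / math.sqrt(var)
    return math.erfc(abs(z) / math.sqrt(2))


def empirical_type1_error(n_cases, n_ctrls, carrier_prob=0.02,
                          alpha=0.05, n_reps=500, seed=1):
    """Fraction of null replicates in which the test rejects at level alpha.

    carrier_prob, alpha, n_reps and seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_reps):
        # Null model: cases and controls share the same carrier probability.
        cases = carrier_count(n_cases, carrier_prob, rng)
        ctrls = carrier_count(n_ctrls, carrier_prob, rng)
        if burden_z_pvalue(cases, n_cases, ctrls, n_ctrls) < alpha:
            rejections += 1
    return rejections / n_reps


if __name__ == "__main__":
    # Compare a balanced and an unbalanced design under the null.
    print("balanced   (500 vs 500): ", empirical_type1_error(500, 500))
    print("unbalanced (50 vs 2000): ", empirical_type1_error(50, 2000))
```

Comparing the returned rate with the nominal alpha shows whether this toy test is calibrated for a given design. Estimating power would follow the same loop with a higher carrier probability in cases than in controls.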
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6343276
DOI: http://dx.doi.org/10.1186/s12859-018-2591-6
Multivariate Behav Res
December 2024
Department of Psychology and Neuroscience, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA.
We present the R package MIIVefa, designed to implement the MIIV-EFA algorithm. This algorithm explores and identifies the underlying factor structure within a set of variables. The resulting model is not a typical exploratory factor analysis (EFA) model because some loadings are fixed to zero and it allows users to include hypothesized correlated errors such as might occur with longitudinal data.
Epidemics
December 2024
California Department of Public Health Center for Infectious Diseases, 850 Marina Bay Parkway, Richmond, CA 94804, United States. Electronic address:
The effective reproduction number serves as a metric of population-wide, time-varying disease spread. During the early years of the COVID-19 pandemic, this metric was primarily derived from case data, which has varied in quality and representativeness due to changes in testing volume, test-seeking behavior, and resource constraints. Deriving nowcasting estimates from alternative data sources such as wastewater provides complementary information that could inform future public health responses.
EBioMedicine
December 2024
CeMM Research Centre for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria; Centre for Physiology and Pharmacology, Medical University of Vienna; Vienna, Austria. Electronic address:
Background: High content imaging-based functional precision medicine approaches have been developed and successfully applied in the field of haemato-oncology. For rheumatoid arthritis (RA), treatment selection is still based on a trial-and-error principle, and biomarkers for patient stratification and drug response prediction are needed.
Methods: A high content, high throughput microscopy-based phenotyping pipeline for peripheral blood mononuclear cells (PBMCs) was developed, allowing for the quantification of cell type frequencies, cell type specific morphology and intercellular interactions from patients with RA (n = 65) and healthy controls (HC, n = 33).
J Hum Evol
December 2024
Department of Anthropology, University at Albany (SUNY), 1400 Washington Avenue, Albany, NY 12222, USA; College of Fellows, Institute of Advanced Study, Durham University, Cosin's Hall, Palace Green, Durham, DH1 3RL, UK; Department of Anthropology, Durham University, Dawson Building, South Road, Durham, DH1 3LE, UK. Electronic address:
The degree of sexual size dimorphism in fossil hominins is important evidence for the evaluation of evolutionary hypotheses, but it is also difficult or impossible to measure directly. Multiple methods have been developed to estimate dimorphism in univariate and multivariate datasets, including when data are missing. This paper introduces 'dimorph', an R package that implements many of these methods and associated resampling-based significance tests and evaluates their performance in terms of Type I error rates and power.
Med Phys
December 2024
Department of Echocardiography, Ultrasound Diagnostic Center, The First Hospital of Jilin University, Changchun, China.
Background: Dialysis Access (DA) stenosis impacts hemodialysis efficiency and patient health, necessitating exams for early lesion detection. Ultrasound is widely used due to its non-invasive, cost-effective nature. Assessing all patients in large hemodialysis facilities strains resources and relies on operator expertise.