Principal components (PCs) are widely used in statistics. They are a relatively small number of uncorrelated variables derived from an initial pool of variables, chosen to explain as much of the total variance as possible. Principal component analysis (PCA) is also a popular technique in statistical genetics. Achieving optimal results requires a thorough understanding of the different implementations of PCA and of their impact on study results, compared with alternative approaches. In this review, we focus on the possibilities, limitations and role of PCs in ancestry prediction, genome-wide association studies, rare variant analyses, imputation strategies, meta-analysis and epistasis detection. We also describe several variations of classic PCA that deserve increased attention in statistical genetics applications.
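
As an illustration of the definition above, the following is a minimal sketch of classic PCA using a singular value decomposition of the centered data matrix. It is not taken from the review: the function name `principal_components` and the simulated toy data are assumptions made purely for illustration, and real genotype data would typically need additional preprocessing (for example variant filtering and scaling) that is not shown here.

```python
import numpy as np

def principal_components(X, n_components):
    """Return PC scores and the fraction of total variance each PC explains.

    Hypothetical helper for illustration: X has one row per sample and
    one column per variable.
    """
    # Center each variable so the SVD describes variance around the mean.
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the PC directions; singular values come sorted in
    # decreasing order, so the first PCs explain the most variance.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    explained = s**2 / np.sum(s**2)
    return scores, explained[:n_components]

# Simulated toy data (an assumption, not real genetic data):
# 200 samples, 10 correlated variables driven by 2 latent factors plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
loadings = rng.normal(size=(2, 10))
X = latent @ loadings + 0.1 * rng.normal(size=(200, 10))

scores, explained = principal_components(X, n_components=2)
print("variance explained by PC1 and PC2:", explained)
print("correlation between PC1 and PC2:", np.corrcoef(scores.T)[0, 1])
```

In this sketch the two retained score columns are uncorrelated by construction, and the explained-variance fractions are non-increasing, which mirrors the two properties named in the definition.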


Source: http://dx.doi.org/10.1093/bib/bby081


