Background: Response to antipsychotic drugs (APD) varies greatly among individuals and is affected by genetic factors. This study aims to demonstrate genome-wide associations between copy number variants (CNVs) and response to APD in patients with schizophrenia.
Methods: A total of 3030 patients of Han Chinese ethnicity randomly received APD (aripiprazole, olanzapine, quetiapine, risperidone, ziprasidone, haloperidol and perphenazine) treatment for six weeks.
It has been well established that cancer cells can evade immune surveillance by mutating themselves. Understanding genetic alterations in cancer cells that contribute to immune regulation could lead to better immunotherapy patient stratification and identification of novel immune-oncology (IO) targets. In this report, we describe our effort of genome-wide association analyses across 22 TCGA cancer types to explore the associations between genetic alterations in cancer cells and 74 immune traits.
The ongoing release of large-scale sequencing data in the UK Biobank allows for the identification of associations between rare variants and complex traits. SAIGE-GENE+ is a valid approach to conducting set-based association tests for quantitative and binary traits. However, for ordinal categorical phenotypes, applying SAIGE-GENE+ by treating the trait as quantitative or by binarizing it can cause inflated type I error rates or power loss.
Background and Hypothesis: Complex schizophrenia symptoms were recently conceptualized as interactive symptoms within a network system. However, it remains unknown how a schizophrenia network changes during acute antipsychotic treatment. The present study aimed to evaluate the interactive change of schizophrenia symptoms under seven antipsychotics from individual time series.
Several biobanks, including UK Biobank (UKBB), are generating large-scale sequencing data. An existing method, SAIGE-GENE, performs well when testing variants with minor allele frequency (MAF) ≤ 1%, but inflation is observed in variance component set-based tests when restricting to variants with MAF ≤ 0.1% or 0.
Optimal methods could effectively improve the accuracy of predicting and identifying candidate driver genes. Various computational methods based on mutational frequency, network and function approaches have been developed to identify mutation driver genes in cancer genomes. However, a comprehensive evaluation of the performance levels of network-, function- and frequency-based methods is lacking.
Nucleic Acids Res
January 2022
Rapid advances in high-throughput sequencing technologies have led to the discovery of thousands of extrachromosomal circular DNAs (eccDNAs) in the human genome. Loss-of-function experiments are difficult to conduct on circular and linear chromosomes, as they usually overlap. Hence, it is challenging to interpret the molecular functions of eccDNAs.
Individuals with monogenic disorders can experience variable phenotypes that are influenced by genetic variation. To investigate this in sickle cell disease (SCD), we performed whole-genome sequencing (WGS) of 722 individuals with hemoglobin HbSS or HbSβ0-thalassemia from Baylor College of Medicine and from the St. Jude Children's Research Hospital Sickle Cell Clinical Research and Intervention Program (SCCRIP) longitudinal cohort study.
With the advances in genotyping technologies and electronic health records (EHRs), large biobanks have been great resources to identify novel genetic associations and gene-environment interactions on a genome-wide and even a phenome-wide scale. To date, several phenome-wide association studies (PheWAS) have been performed on biobank data, which provides comprehensive insights into many aspects of human genetics and biology. Although inspiring, PheWAS on large-scale biobank data encounters new challenges including computational burden, unbalanced phenotypic distribution, and genetic relationship.
The histone mark H3K27me3 and its reader/writer polycomb repressive complex 2 (PRC2) mediate widespread transcriptional repression in stem and progenitor cells. Mechanisms that regulate this activity are critical for hematopoietic development but are poorly understood. Here we show that the E3 ubiquitin ligase F-box only protein 11 (FBXO11) relieves PRC2-mediated repression during erythroid maturation by targeting its newly identified substrate bromo adjacent homology domain-containing 1 (BAHD1), an H3K27me3 reader that recruits transcriptional corepressors.
With increasing biobanking efforts connecting electronic health records and national registries to germline genetics, the time-to-event data analysis has attracted increasing attention in the genetics studies of human diseases. In time-to-event data analysis, the Cox proportional hazards (PH) regression model is one of the most used approaches. However, existing methods and tools are not scalable when analyzing a large biobank with hundreds of thousands of samples and endpoints, and they are not accurate when testing low-frequency and rare variants.
With very large sample sizes, biobanks provide an exciting opportunity to identify genetic components of complex traits. To analyze rare variants, region-based multiple-variant aggregate tests are commonly used to increase power for association tests. However, because of the substantial computational cost, existing region-based tests cannot analyze hundreds of thousands of samples while accounting for confounders such as population stratification and sample relatedness.
Alzheimer's disease (AD) displays a long asymptomatic stage before dementia. We characterize AD stage-associated molecular networks by profiling 14,513 proteins and 34,173 phosphosites in the human brain with mass spectrometry, highlighting 173 protein changes in 17 pathways. The altered proteins are validated in two independent cohorts, showing partial RNA dependency.
In biobank data analysis, most binary phenotypes have unbalanced case-control ratios, and this can cause inflation of type I error rates. Recently, a saddle point approximation (SPA) based single-variant test has been developed to provide an accurate and scalable method to test for associations of such phenotypes. For gene- or region-based multiple-variant tests, a few methods exist that can adjust for unbalanced case-control ratios; however, these methods are either less accurate when case-control ratios are extremely unbalanced or not scalable for large data analyses.
The etiology of most complex diseases involves genetic variants, environmental factors, and gene-environment interaction (G × E) effects. Compared with marginal genetic association studies, G × E analysis requires more samples and detailed measure of environmental exposures, and this limits the possible discoveries. Large-scale population-based biobanks with detailed phenotypic and environmental information, such as UK-Biobank, can be ideal resources for identifying G × E effects.
Chronic blood transfusions in patients with sickle cell anemia (SCA) cause iron overload, which occurs with a degree of interpatient variability in serum ferritin and liver iron content (LIC). Reasons for this variability are unclear and may be influenced by genes that regulate iron metabolism. We evaluated the association of the copy number of the glutathione S-transferase M1 gene and degree of iron overload among patients with SCA.
In epidemiology cohort studies, exposure data are collected in sub-studies based on a primary outcome (PO) of interest, as with the extreme-value sampling design (EVSD), to investigate their correlation. Secondary outcomes (SOs) data are also readily available, enabling researchers to assess the correlations between the exposure and the SOs. However, when the EVSD is used, the data for SOs are not representative samples of a general population; thus, many commonly used statistical methods, such as the generalized linear model (GLM), are not valid.
Although many recent studies describe the emergence and prevalence of "clonal hematopoiesis of indeterminate potential" in aged human populations, a systematic analysis of the numbers of clones supporting steady-state hematopoiesis throughout mammalian life is lacking. Previous efforts relied on transplantation of "barcoded" hematopoietic stem cells (HSCs) to track the contribution of HSC clones to reconstituted blood. However, ex vivo manipulation and transplantation alter HSC function and thus may not reflect the biology of steady-state hematopoiesis.
The embryonic site of definitive hematopoietic stem cell (dHSC) origination has been debated for decades. Although an intra-embryonic origin is well supported, the yolk sac (YS) contribution to adult hematopoiesis remains controversial. The same developmental origin makes it difficult to identify specific markers that discern between an intraembryonic versus YS-origin using a lineage trace approach.
It has been well acknowledged that methods for secondary trait (ST) association analyses under a case-control design (ST$_{\text{CC}}$) should carefully consider the sampling process to avoid biased risk estimates. A similar situation also exists in extreme phenotype sequencing (EPS) designs, which select subjects with extreme values of a continuous primary phenotype for sequencing. EPS designs are commonly used in modern epidemiological and clinical studies such as the well-known National Heart, Lung, and Blood Institute Exome Sequencing Project.
Genome-wide association studies have discovered many biologically important associations of genes with phenotypes. Typically, genome-wide association analyses formally test the association of each genetic feature (SNP, CNV, etc) with the phenotype of interest and summarize the results with multiplicity-adjusted p-values. However, very small p-values only provide evidence against the null hypothesis of no association without indicating which biological model best explains the observed data.