One of the two chromosomal breakage events in recurring translocations in B cell neoplasms is often due to the recombination-activating gene complex (RAG complex) releasing DNA ends before end joining. The other break occurs in a fragile zone of 20-600 bp in a non-antigen receptor gene locus, with a more complex and intriguing set of mechanistic factors underlying such narrow fragile zones. These factors include activation-induced deaminase (AID), which acts only at regions of single-stranded DNA (ssDNA).
Objective: Comprehensive and reliable genome-wide variant analysis of a small number of cells has been challenging due to genome coverage bias, PCR over-cycling, and the requirement of expensive technologies. To comprehensively identify genome alterations in single colon crypts that reflect genome heterogeneity of stem cells, we developed a method to construct whole-genome sequencing libraries from single colon crypts without DNA extraction, whole-genome amplification, or increased PCR enrichment cycles.
Results: We present post-alignment statistics of 81 single-crypts (each contains four- to eight-fold less DNA than the requirement of conventional methods) and 16 bulk-tissue libraries to demonstrate the consistent success in obtaining reliable coverage, both in depth (≥ 30X) and breadth (≥ 92% of the genome covered at ≥ 10X depth), of the human genome.
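The depth and breadth figures quoted above can be illustrated with a minimal sketch. It assumes per-base read depths are already available as a list of integers (for example, exported by an alignment tool); the function name `coverage_summary` and the toy depths are hypothetical and are not taken from the study.

```python
from typing import Dict, Sequence


def coverage_summary(depths: Sequence[int], min_depth: int = 10) -> Dict[str, float]:
    """Return mean depth and breadth (fraction of positions at >= min_depth)."""
    if not depths:
        raise ValueError("depths must be non-empty")
    n = len(depths)
    mean_depth = sum(depths) / n
    # Breadth: share of positions whose depth meets the threshold.
    breadth = sum(1 for d in depths if d >= min_depth) / n
    return {"mean_depth": mean_depth, "breadth": breadth}


# Toy per-base depths for ten positions (illustrative values only).
toy_depths = [0, 5, 12, 30, 45, 33, 28, 10, 9, 40]
summary = coverage_summary(toy_depths, min_depth=10)
print(summary)
```

Under this definition, a library meets the criteria stated above when the mean depth is at least 30 and the breadth at a 10X threshold is at least 0.92.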
Activation-induced deaminase (AID) deaminates cytosine only within single-stranded DNA. Transcription is known to increase AID deamination on duplex DNA substrates. Using a purified T7 RNA polymerase transcription system, we recently found that AID deamination of a duplex DNA substrate is reduced if RNase A is added during transcription.
The PI3K/Akt pathway, and in particular PI3Kδ, is known for its role in drug-resistant B-cell acute lymphoblastic leukemia (B-ALL), and it is often upregulated in refractory or relapsed B-ALL. Myc proteins are transcription factors responsible for transcribing pro-proliferative genes, and c-Myc is often overexpressed in cancers. The chromatin regulator BRD4 is required for expression of c-Myc in hematologic malignancies including B-ALL.
Analysis of translocation breakpoints in human B cell malignancies reveals that DNA double-strand breaks at oncogenes most frequently occur at CpG sites located within 20-600 bp fragile zones and depend on activation-induced deaminase (AID). AID requires single-stranded DNA (ssDNA) to act, but it has been unclear why or how this region transiently acquires a ssDNA state. Here, we demonstrate the ssDNA state in the 23 bp E2A fragile zone using several methods, including native bisulfite DNA structural analysis in live human pre-B cells.
In vertebrate antigen receptor gene rearrangement, V(D)J recombination events can occur by deletion or by inversion. For deletional events, the signal joint is deleted from the genome. Nearly half of the immunoglobulin light chain genes undergo V(D)J recombination in an inversional manner, and both signal and coding joint formation must occur to retain chromosomal integrity.
Background: Family history of prostate cancer (PCa) is a well-known risk factor, and both common and rare genetic variants are associated with the disease.
Objective: To detect new genetic variants associated with PCa, capitalizing on the role of family history and more aggressive PCa.
Design, Setting, and Participants: A two-stage design was used.
Spontaneous DNA-PKcs deficiencies in animals result in a severe combined immunodeficiency (SCID) phenotype because DNA-PKcs is required to activate Artemis for V(D)J recombination coding end hairpin opening. The impact on signal joint formation in these spontaneous mutant mammals is variable. Genetically engineered DNA-PKcs null mice and cells from them show a >1,000-fold reduction in coding joint formation and minimal reduction in signal joint formation during V(D)J recombination.
Am J Physiol Lung Cell Mol Physiol
April 2020
Genome-wide sequencing enables evaluation of associations between traits and combinations of variants in genes and pathways. But such evaluation requires multi-locus association tests with good power, regardless of the variant and trait characteristics. And since analyzing families may yield more power than analyzing unrelated individuals, we need multi-locus tests applicable to both related and unrelated individuals.
The mitochondrial genome, which consists of 16,569 bp of DNA with a cytosine-rich light (L) strand and a heavy (H) strand, exists as a multicopy closed circular genome within the mitochondrial matrix. The machinery for replication of the mammalian mitochondrial genome is distinct from that for replication of the nuclear genome. Three models have been proposed for mitochondrial DNA (mtDNA) replication, and one of the key differences among them is whether extensive single-stranded regions exist on the H strand.
Next-generation sequencing technologies have afforded unprecedented characterization of low-frequency and rare genetic variation. Due to low power for single-variant testing, aggregative methods are commonly used to combine observed rare variation within a single gene. Causal variation may also aggregate across multiple genes within relevant biomolecular pathways.
The vast majority of coding variants are rare, and assessment of the contribution of rare variants to complex traits is hampered by low statistical power and limited functional data. Improved methods for predicting the pathogenicity of rare coding variants are needed to facilitate the discovery of disease variants from exome sequencing studies. We developed REVEL (rare exome variant ensemble learner), an ensemble method for predicting the pathogenicity of missense variants on the basis of individual tools: MutPred, FATHMM, VEST, PolyPhen, SIFT, PROVEAN, MutationAssessor, MutationTaster, LRT, GERP, SiPhy, phyloP, and phastCons.
Genet Epidemiol
September 2016
Rare variants (RVs) have been shown to be significant contributors to complex disease risk. By definition, these variants have very low minor allele frequencies and traditional single-marker methods for statistical analysis are underpowered for typical sequencing study sample sizes. Multimarker burden-type approaches attempt to identify aggregation of RVs across case-control status by analyzing relatively small partitions of the genome, such as genes.
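The burden-type idea described above can be sketched in a few lines. This is a generic illustration, not the method of the paper: it assumes genotypes are coded as minor-allele counts (0, 1, or 2) in a list of per-individual rows for one gene, and the names `minor_allele_freq`, `burden_scores`, and `maf_cutoff` are hypothetical. The cutoff of 0.2 is chosen only so that the ten-person toy data contains "rare" variants.

```python
from typing import List, Sequence


def minor_allele_freq(genotypes: Sequence[Sequence[int]], j: int) -> float:
    """Minor allele frequency of variant j (allele counts over 2N chromosomes)."""
    return sum(row[j] for row in genotypes) / (2 * len(genotypes))


def burden_scores(genotypes: Sequence[Sequence[int]], maf_cutoff: float) -> List[int]:
    """Per-individual count of minor alleles across the rare variants of a gene."""
    n_variants = len(genotypes[0])
    rare = [j for j in range(n_variants)
            if 0 < minor_allele_freq(genotypes, j) < maf_cutoff]
    return [sum(row[j] for j in rare) for row in genotypes]


# Toy gene with three variants; rows are individuals (first five are cases).
toy_genotypes = [
    [1, 0, 1], [1, 0, 1], [0, 1, 2], [0, 0, 1], [0, 0, 0],   # cases
    [0, 0, 1], [0, 0, 1], [0, 0, 1], [0, 0, 1], [0, 0, 1],   # controls
]
is_case = [True] * 5 + [False] * 5

scores = burden_scores(toy_genotypes, maf_cutoff=0.2)
case_mean = sum(s for s, c in zip(scores, is_case) if c) / is_case.count(True)
control_mean = sum(s for s, c in zip(scores, is_case) if not c) / is_case.count(False)
print(scores, case_mean - control_mean)
```

In practice the difference in burden between cases and controls would be assessed with a formal test (for example, by permutation or regression), which this sketch leaves out.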
Immunoglobulin (Ig) heavy chains undergo class switch recombination (CSR) to change the heavy chain isotype from IgM to IgG, A or E. The switch regions are several kilobases long, repetitive, and G-rich on the nontemplate strand. They are also relatively depleted of CpG (also called CG) sites for unknown reasons.
Background: Long genomic R-loops in eukaryotes were first described at the immunoglobulin heavy chain locus switch regions using bisulfite sequencing and functional studies. A mouse monoclonal antibody called S9.6 has been used for immunoprecipitation (IP) to identify R-loops, based on the assumption that it is specific for RNA:DNA over other nucleic acid duplexes.
Background: Attempts to enrich or identify DNA with cytosine methylation have been commonly carried out using anti-5-methylcytosine or anti-MBD2 (methyl-CpG binding domain protein 2) antibody in immunoprecipitation (IP) assays. However, a careful and systematic control experiment to examine the sensitivity and specificity of this approach has not been reported. It is of critical importance to understand the potential pitfalls of this approach and to avoid potential misinterpretation of findings.
Chromosomal translocations are a hallmark of hematopoietic malignancies. CG motifs within translocation fragile zones (typically 20 to 600 bp in size) are prone to chromosomal translocation in lymphomas. Here we demonstrate that the CG motifs in human translocation fragile zones are hypomethylated relative to the adjacent DNA.
The boundaries of R-loops are well-documented at immunoglobulin heavy chain loci in mammalian B cells. Within primary B cells or B cell lines, the upstream boundaries of R-loops typically begin early in the repetitive portion of the switch regions. Most R-loops terminate within the switch repetitive zone, but the remainder can extend a few hundred base pairs further, where G-density on the non-template DNA strand gradually drops to the genome average.
R loops exist at the murine IgH switch regions and possibly other locations, but their functional importance is unclear. In biochemical systems, R loop initiation requires DNA sequence regions containing clusters of G nucleotides, but cellular studies have not been done. Here, we vary the G-clustering, total switch region length, and the number of target sites (WGCW sites for the activation-induced deaminase) at synthetic switch regions in a murine B cell line to determine the effect on class switch recombination (CSR).
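As an illustration of what counting WGCW target sites involves, the sketch below scans a sequence for the motif, where W is the standard IUPAC code for A or T. The function name `find_wgcw_sites` and the toy sequence are hypothetical and are not the synthetic switch regions used in the study. A lookahead pattern is used so that overlapping sites are all reported.

```python
import re
from typing import List

# W = A or T (IUPAC); the lookahead makes matches zero-width so sites may overlap.
WGCW_PATTERN = re.compile(r"(?=[AT]GC[AT])")


def find_wgcw_sites(seq: str) -> List[int]:
    """Return 0-based start positions of every WGCW site in seq."""
    return [m.start() for m in WGCW_PATTERN.finditer(seq.upper())]


# Toy sequence (illustrative only).
toy_seq = "AGCTGGGGAGCAGCTTGCA"
sites = find_wgcw_sites(toy_seq)
print(sites, len(sites))
```

Because the reverse complement of any WGCW sequence is itself a WGCW sequence, scanning one strand locates the sites on both strands.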
Of the three DNA ligases present in all vertebrates, DNA ligase I (Lig1) has been considered essential for ligating Okazaki fragments during DNA replication and thereby essential for cell viability. Here, we report the striking finding that a Lig1-null murine B cell line is viable. Surprisingly, the Lig1-null cells exhibit normal proliferation and normal immunoglobulin heavy chain class switch recombination and are not hypersensitive to a wide variety of DNA damaging agents.
Background: Illumina SNP arrays have been routinely used for genome-wide association studies to identify potential biomarkers for various diseases. The recommended 200 ng of DNA for high-quality results is a roadblock to utilizing this assay when such quantities of DNA are not available. The goal of this study is to determine the reproducibility and reliability of the assay when reduced amounts of DNA are used for the SNP arrays.
Hum Genet
March 2014
Previous GWAS studies have reported significant associations between various common SNPs and prostate cancer risk using cases unselected for family history. How these variants influence risk in familial prostate cancer is not well studied. Here, we analyzed 25 previously reported SNPs across 14 loci from prior prostate cancer GWAS.
Although CpG methylation clearly distributes genome-wide in vertebrate nuclear DNA, the state of methylation in the vertebrate mitochondrial genome has been unclear. Several recent reports using immunoprecipitation, mass spectrometry, and enzyme-linked immunosorbent assay methods concluded that human mitochondrial DNA (mtDNA) has much more than the 2 to 5% CpG methylation previously estimated. However, these methods do not provide information as to the sites or frequency of methylation at each CpG site.