Publications by authors named "William Majoros"

Variation in the non-coding genome represents an understudied mechanism of disease, and it remains challenging to predict whether single nucleotide variants, small insertions and deletions, or structural variants in non-coding genomic regions will be detrimental. Our approach using complementary RNA-seq and targeted long-read DNA sequencing can prioritize identification of non-coding variants that lead to disease via alteration of gene splicing or expression. We have identified a patient with primary ciliary dyskinesia with a pathogenic coding variant on one allele of the SPAG1 gene, while the second allele appears normal by whole exome sequencing despite an autosomal recessive inheritance pattern.

Article Synopsis
  • Recent advances in single-cell RNA sequencing (scRNA-seq) and CRISPR technology allow researchers to explore how genetic changes affect gene expression in a high-throughput manner.
  • A key challenge in these experiments is managing "ambient gRNAs," which are extraneous genetic markers from other cells that can skew results if not filtered out properly.
  • The study introduces a tool called CLEANSER, which effectively distinguishes and removes ambient gRNA noise, improving the accuracy of gRNA-cell assignments and enhancing the quality of differential gene expression analyses.

Motivation: Allele-specific expression (ASE) analyses aim to detect imbalanced expression of maternal versus paternal copies of an autosomal gene. Such allelic imbalance can result from a variety of cis-acting causes, including disruptive mutations within one copy of a gene that impact the stability of transcripts, as well as regulatory variants outside the gene that impact transcription initiation. Current methods for ASE estimation suffer from a number of shortcomings, such as relying on only one variant within a gene, assuming perfect phasing information across multiple variants within a gene, or failing to account for alignment biases and possible genotyping errors.

Targeted gene-editing strategies have emerged as promising therapeutic approaches for the permanent treatment of inherited genetic diseases. However, precise gene correction and insertion approaches using homology-directed repair are still limited by low efficiencies. Consequently, many gene-editing strategies have focused on removal or disruption, rather than repair, of genomic DNA.

Despite widespread clinical genetic testing, many individuals with suspected genetic conditions lack a precise diagnosis, limiting their opportunity to take advantage of state-of-the-art treatments. In some cases, testing reveals difficult-to-evaluate structural differences, candidate variants that do not fully explain the phenotype, single pathogenic variants in recessive disorders, or no variants in genes of interest. Thus, there is a need for better tools to identify a precise genetic diagnosis in individuals when conventional testing approaches have been exhausted.

High-throughput reporter assays such as self-transcribing active regulatory region sequencing (STARR-seq) have made it possible to measure regulatory element activity across the entire human genome at once. The resulting data, however, present substantial analytical challenges. Here, we identify technical biases that explain most of the variance in STARR-seq data.

Changes in transcriptional regulation are thought to be a major contributor to the evolution of phenotypic traits, but the contribution of changes in chromatin accessibility to the evolution of gene expression remains almost entirely unknown. To address this important gap in knowledge, we developed a new method to identify DNase I Hypersensitive (DHS) sites with differential chromatin accessibility between species using a joint modeling approach. Our method overcomes several limitations inherent to conventional threshold-based pairwise comparisons that become increasingly apparent as the number of species analyzed rises.

Motivation: High-throughput reporter assays dramatically improve our ability to assign function to noncoding genetic variants, by measuring allelic effects on gene expression in the controlled setting of a reporter gene. Unlike genetic association tests, such assays are not confounded by linkage disequilibrium when loci are independently assayed. These methods can thus improve the identification of causal disease mutations.

Environmental stimuli commonly act via changes in gene regulation. Human-genome-scale assays to measure such responses are indirect or require knowledge of the transcription factors (TFs) involved. Here, we present the use of human genome-wide high-throughput reporter assays to measure environmentally-responsive regulatory element activity.

Glucocorticoids are potent steroid hormones that regulate immunity and metabolism by activating the transcription factor (TF) activity of glucocorticoid receptor (GR). Previous models have proposed that DNA binding motifs and sites of chromatin accessibility predetermine GR binding and activity. However, there are vast excesses of both features relative to the number of GR binding sites.

The glucocorticoid receptor (GR) is a hormone-inducible transcription factor involved in metabolic and anti-inflammatory gene expression responses. To investigate what controls interactions between GR binding sites and their target genes, we used in situ Hi-C to generate high-resolution, genome-wide maps of chromatin interactions before and after glucocorticoid treatment. We found that GR binding to the genome typically does not cause new chromatin interactions to target genes but instead acts through chromatin interactions that already exist prior to hormone treatment.

Motivation: Genetic variation that disrupts gene function by altering gene splicing between individuals can substantially influence traits and disease. In those cases, accurately predicting the effects of genetic variation on splicing can be highly valuable for investigating the mechanisms underlying those traits and diseases. While methods have been developed to generate high quality computational predictions of gene structures in reference genomes, the same methods perform poorly when used to predict the potentially deleterious effects of genetic changes that alter gene splicing between individuals.

There is broad agreement that genetic mutations occurring outside of the protein-coding regions play a key role in human disease. Despite this consensus, we are not yet capable of discerning which portions of non-coding sequence are important in the context of human disease. Here, we present Orion, an approach that detects regions of the non-coding genome that are depleted of variation, suggesting that the regions are intolerant of mutations and subject to purifying selection in the human lineage.

Motivation: The accurate interpretation of genetic variants is critical for characterizing genotype-phenotype associations. Because the effects of genetic variants can depend strongly on their local genomic context, accurate genome annotations are essential. Furthermore, as some variants have the potential to disrupt or alter gene structure, variant interpretation efforts stand to gain from the use of individualized annotations that account for differences in gene structure between individuals or strains.

The glucocorticoid receptor (GR) binds the human genome at >10,000 sites but only regulates the expression of hundreds of genes. To determine the functional effect of each site, we measured the glucocorticoid (GC) responsive activity of nearly all GR binding sites (GBSs) captured using chromatin immunoprecipitation (ChIP) in A549 cells. 13% of GBSs assayed had GC-induced activity.

Research on the genetics of natural populations was revolutionized in the 1990s by methods for genotyping noninvasively collected samples. However, these methods have remained largely unchanged for the past 20 years and lag far behind the genomics era. To close this gap, here we report an optimized laboratory protocol for genome-wide capture of endogenous DNA from noninvasively collected samples, coupled with a novel computational approach to reconstruct pedigree links from the resulting low-coverage data.

We report a novel high-throughput method to empirically quantify individual-specific regulatory element activity at the population scale. The approach combines targeted DNA capture with a high-throughput reporter gene expression assay. As demonstration, we measured the activity of more than 100 putative regulatory elements from 95 individuals in a single experiment.

The CRISPR/Cas9 genome-editing platform is a promising technology to correct the genetic basis of hereditary diseases. The versatility, efficiency and multiplexing capabilities of the CRISPR/Cas9 system enable a variety of otherwise challenging gene correction strategies. Here, we use the CRISPR/Cas9 system to restore the expression of the dystrophin gene in cells carrying dystrophin mutations that cause Duchenne muscular dystrophy (DMD).

Duchenne muscular dystrophy (DMD) is caused by genetic mutations that result in the absence of dystrophin protein expression. Oligonucleotide-induced exon skipping can restore the dystrophin reading frame and protein production. However, this requires continuous drug administration and may not generate complete skipping of the targeted exon.

Motivation: High-throughput sequencing of RNA in vivo facilitates many applications, not the least of which is the cataloging of variant splice isoforms of protein-coding messenger RNAs. Although many solutions have been proposed for reconstructing putative isoforms from deep sequencing data, these generally take as their substrate the collective alignment structure of RNA-seq reads and ignore the biological signals present in the actual nucleotide sequence. The majority of these solutions are graph-theoretic, relying on a splice graph representing the splicing patterns and exon expression levels indicated by the spliced-alignment process.

Motivation: Computational approaches for the annotation of phenotypes from image data have shown promising results across many applications, and provide rich and valuable information for studying gene function and interactions. While data are often available both at high spatial resolution and across multiple time points, phenotypes are frequently annotated independently, for individual time points only. In particular, for the analysis of developmental gene expression patterns, it is biologically sensible when images across multiple time points are jointly accounted for, such that spatial and temporal dependencies are captured simultaneously.

High-throughput sequencing has opened numerous possibilities for the identification of regulatory RNA-binding events. Cross-linking and immunoprecipitation of Argonaute proteins can pinpoint a microRNA (miRNA) target site within tens of bases but leaves the identity of the miRNA unresolved. A flexible computational framework, microMUMMIE, integrates sequence with cross-linking features and reliably identifies the miRNA family involved in each binding event.

Erythrocytes carrying a variant hemoglobin allele (HbS), which causes sickle cell disease, resist infection by the malaria parasite Plasmodium falciparum. The molecular basis of this resistance, which has long been recognized as multifactorial, remains incompletely understood. Here we show that the dysregulated microRNA (miRNA) composition of either heterozygous HbAS or homozygous HbSS erythrocytes contributes to resistance against P.

The computational detection of regulatory elements in DNA is a difficult but important problem impacting our progress in understanding the complex nature of eukaryotic gene regulation. Attempts to utilize cross-species conservation for this task have been hampered both by evolutionary changes of functional sites and poor performance of general-purpose alignment programs when applied to non-coding sequence. We describe a new and flexible framework for modeling binding site evolution in multiple related genomes, based on phylogenetic pair hidden Markov models which explicitly model the gain and loss of binding sites along a phylogeny.
