Publications by authors named "Emily M LeProust"

Integral membrane proteins (MPs) are key engineering targets due to their critical roles in regulating cell function. In engineering MPs, it can be extremely challenging to retain membrane localization capability while changing other desired properties. We have used structure-guided SCHEMA recombination to create a large set of functionally diverse chimeras from three sequence-diverse channelrhodopsins (ChRs).

Ethnic-specific differences in minor allele frequency impact variant categorization for genetic screening of nonsyndromic hearing loss (NSHL) and other genetic disorders. We sought to evaluate all previously reported pathogenic NSHL variants in the context of a large number of controls from ethnically distinct populations sequenced with orthogonal massively parallel sequencing methods. We used HGMD, ClinVar, and dbSNP to generate a comprehensive list of reported pathogenic NSHL variants and re-evaluated these variants in the context of 8,595 individuals from 12 populations and 6 ethnically distinct major human evolutionary phylogenetic groups from three sources (Exome Variant Server, 1000 Genomes project, and a control set of individuals created for this study, the OtoDB).

Reprogramming of somatic cells to induced pluripotent stem cells (iPSCs) holds enormous promise for regenerative medicine. To elucidate endogenous barriers limiting this process, we systematically dissected human cellular reprogramming by combining a genome-wide RNAi screen, innovative computational methods, extensive single-hit validation, and mechanistic investigation of relevant pathways and networks. We identify reprogramming barriers, including genes involved in transcription, chromatin regulation, ubiquitination, dephosphorylation, vesicular transport, and cell adhesion.

Misfolded ER proteins are retrotranslocated into the cytosol for degradation via the ubiquitin-proteasome system. The human cytomegalovirus protein US11 exploits this ER-associated protein degradation (ERAD) pathway to downregulate HLA class I molecules in virus-infected cells, thereby evading elimination by cytotoxic T-lymphocytes. US11-mediated degradation of HLA class I has been instrumental in the identification of key components of mammalian ERAD, including Derlin-1, p97, VIMP and SEL1L.

Background: MicroRNAs (miRNAs) are established regulators of development, cell identity and disease. Although nearly two thousand human miRNA genes are known and new ones are continuously discovered, no attempt has been made to gauge the total miRNA content of the human genome.

Results: Employing an innovative computational method on massively pooled small RNA sequencing data, we report 2,469 novel human miRNA candidates of which 1,098 are validated by in-house and published experiments.

Genomes contain both a genetic code specifying amino acids and a regulatory code specifying transcription factor (TF) recognition sequences. We used genomic deoxyribonuclease I footprinting to map nucleotide resolution TF occupancy across the human exome in 81 diverse cell types. We found that ~15% of human codons are dual-use codons ("duons") that simultaneously specify both amino acids and TF recognition sites.

Background: Non-syndromic hearing loss (NSHL) is the most common sensory impairment in humans. Until recently its extreme genetic heterogeneity precluded comprehensive genetic testing. Using a platform that couples targeted genomic enrichment (TGE) and massively parallel sequencing (MPS) to sequence all exons of all genes implicated in NSHL, we tested 100 persons with presumed genetic NSHL and in so doing established sequencing requirements for maximum sensitivity and defined MPS quality score metrics that obviate Sanger validation of variants.

Genetic interaction (GI) maps, comprising pairwise measures of how strongly the function of one gene depends on the presence of a second, have enabled the systematic exploration of gene function in microorganisms. Here, we present a two-stage strategy to construct high-density GI maps in mammalian cells. First, we use ultracomplex pooled shRNA libraries (25 shRNAs/gene) to identify high-confidence hit genes for a given phenotype and effective shRNAs.

Digital production, transmission and storage have revolutionized how we access and use information but have also made archiving an increasingly complex task that requires active, continuing maintenance of digital media. This challenge has focused some interest on DNA as an attractive target for information storage because of its capacity for high-density information encoding, longevity under easily achieved conditions and proven track record as an information bearer. Previous DNA-based information storage approaches have encoded only trivial amounts of information or were not amenable to scaling up; they also used no robust error correction and did not examine cost-efficiency for large-scale information archival.

Background: Targeted genomic enrichment (TGE) is a widely used method for isolating and enriching specific genomic regions prior to massively parallel sequencing. To make effective use of sequencer output, barcoding and sample pooling (multiplexing) after TGE and prior to sequencing (post-capture multiplexing) has become routine. While previous reports have indicated that multiplexing prior to capture (pre-capture multiplexing) is feasible, no thorough examination of the effect of this method has been completed on a large number of samples.

Immune responses targeting self-proteins (autoantigens) can lead to a variety of autoimmune diseases. Identification of these antigens is important for both diagnostic and therapeutic reasons. However, current approaches to characterize autoantigens have, in most cases, met only with limited success.

High-throughput sequencing of targeted genomic loci in large populations is an effective approach for evaluating the contribution of rare variants to disease risk. We evaluated the feasibility of using in-solution hybridization-based target capture on pooled DNA samples to enable cost-efficient population sequencing studies. For this, we performed pooled sequencing of 100 HapMap samples across ~600 kb of DNA sequence using the Illumina GAIIx.

Sequencing the coding regions, the exome, of the human genome is one of the major current strategies to identify low frequency and rare variants associated with human disease traits. So far, the most widely used commercial exome capture reagents have mainly targeted the consensus coding sequence (CCDS) database. We report the design of an extended set of targets for capturing the complete human exome, based on annotation from the GENCODE consortium.

Development of cheap, high-throughput and reliable gene synthesis methods will broadly stimulate progress in biology and biotechnology. Currently, the reliance on column-synthesized oligonucleotides as a source of DNA limits further cost reductions in gene synthesis. Oligonucleotides from DNA microchips can reduce costs by at least an order of magnitude, yet efforts to scale their use have been largely unsuccessful owing to the high error rates and complexity of the oligonucleotide mixtures.

Background and Purpose: Diffusion-weighted magnetic resonance imaging of the brain is a promising technique to help predict functional outcome in comatose survivors of cardiac arrest. We aimed to evaluate prospectively the temporal-spatial profile of brain apparent diffusion coefficient changes in comatose survivors during the first 8 days after cardiac arrest.

Methods: Apparent diffusion coefficient values were measured by 2 independent and blinded investigators in predefined brain regions in 18 good- and 15 poor-outcome patients with 38 brain magnetic resonance imaging scans and were compared with those of 14 normal controls.

We have achieved the ability to synthesize thousands of unique, long oligonucleotides (150mers) in fmol amounts using parallel synthesis of DNA on microarrays. The sequence accuracy of the oligonucleotides in such large-scale syntheses has been limited by the yields and side reactions of the DNA synthesis process used. While there has been significant demand for libraries of long oligos (150mer and more), the yields in conventional DNA synthesis and the associated side reactions have previously limited the availability of oligonucleotide pools to lengths <100 nt.

To exploit fully the potential of current sequencing technologies for population-based studies, one must enrich for loci from the human genome. Here we evaluate the hybridization-based approach by using oligonucleotide capture probes in solution to enrich for approximately 3.9 Mb of sequence target.

We developed a digital RNA allelotyping method for quantitatively interrogating allele-specific gene expression. This method involves ultra-deep sequencing of padlock-captured single-nucleotide polymorphisms (SNPs) from the transcriptome. We characterized four cell lines established from two human subjects in the Personal Genome Project.

Short hairpin RNA libraries are limited by low efficacy of many shRNAs and by off-target effects, which give rise to false negatives and false positives, respectively. Here we present a strategy for rapidly creating expanded shRNA pools (approximately 30 shRNAs per gene) that are analyzed by deep sequencing (EXPAND). This approach enables identification of multiple effective target-specific shRNAs from a complex pool, allowing a rigorous statistical evaluation of true hits.

Current DNA methylation assays are limited in the flexibility and efficiency of characterizing a large number of genomic targets. We report a method to specifically capture an arbitrary subset of genomic targets for single-molecule bisulfite sequencing for digital quantification of DNA methylation at single-nucleotide resolution. A set of ~30,000 padlock probes was designed to assess methylation of ~66,000 CpG sites within 2,020 CpG islands on human chromosome 12, chromosome 20, and 34 selected regions.

Studies of epigenetic modifications would benefit from improved methods for high-throughput methylation profiling. We introduce two complementary approaches that use next-generation sequencing technology to detect cytosine methylation. In the first method, we designed approximately 10,000 bisulfite padlock probes to profile approximately 7,000 CpG locations distributed over the ENCODE pilot project regions and applied them to human B-lymphocytes, fibroblasts and induced pluripotent stem cells.

Targeting genomic loci by massively parallel sequencing requires new methods to enrich templates to be sequenced. We developed a capture method that uses biotinylated RNA 'baits' to fish targets out of a 'pond' of DNA fragments. The RNA is transcribed from PCR-amplified oligodeoxynucleotides originally synthesized on a microarray, generating sufficient bait for multiple captures at concentrations high enough to drive the hybridization.

Nucleosome organization is critical for gene regulation. In living cells this organization is determined by multiple factors, including the action of chromatin remodellers, competition with site-specific DNA-binding proteins, and the DNA sequence preferences of the nucleosomes themselves. However, it has been difficult to estimate the relative importance of each of these mechanisms in vivo, because in vivo nucleosome maps reflect the combined action of all influencing factors.

A new generation of technologies is poised to reduce DNA sequencing costs by several orders of magnitude. But our ability to fully leverage the power of these technologies is crippled by the absence of suitable 'front-end' methods for isolating complex subsets of a mammalian genome at a scale that matches the throughput at which these platforms will routinely operate. We show that targeting oligonucleotides released from programmable microarrays can be used to capture and amplify approximately 10,000 human exons in a single multiplex reaction.
