Publications by authors named "Jiankui He"

The longest possible haplotype is the chromosome haplotype, the set of co-inherited alleles on a single chromosome inherited from one parent. Standard whole-genome shotgun sequencing technologies cannot independently resolve the haplotypes of homologous chromosomes, because of the short-read sequencing strategy and interference from the homologous chromosome. Here, we investigated several types of chromosomal abnormalities using a dilution-based method to separate an intact copy of a homologous chromosome from human metaphase cells; single chromosomes were then independently amplified by whole-genome amplification methods, converted into barcoded sequencing libraries, and sequenced in multiplexed pools on Illumina sequencers.


Current approaches to integrated single-cell DNA-RNA sequencing make SNP calling difficult, because a large amount of DNA and RNA is lost during DNA-RNA separation. Here, we performed simultaneous single-cell exome and transcriptome sequencing on individual mouse oocytes. Using microinjection, we kept the nuclei intact to avoid DNA loss, while retaining the cytoplasm inside the cell membrane, to maximize the amount of DNA and RNA captured from the single cell.


Next generation sequencing (NGS) has revolutionized life sciences research. However, GC bias and costly, time-intensive library preparation make NGS an ill fit for increasing sequencing demands in the clinic. A new class of third-generation sequencing platforms has arrived to meet this need, capable of directly measuring DNA and RNA sequences at the single-molecule level without amplification.


Cell-free DNA (cfDNA) in plasma has emerged as a potentially important biomarker in clinical diagnostics, particularly in cancer. However, somatic mutations are also commonly found in healthy individuals, which interferes with the effectiveness of cfDNA for cancer diagnostics. This study examined the background somatic mutations in white blood cells (WBC) and cfDNA in healthy controls, based on sequencing data from 821 non-cancer individuals and several cancer samples, with the aim of understanding the patterns of mutations detected in cfDNA.


Type 1 diabetes mellitus (T1D) is an immune-mediated disease. The autoreactive T cells in T1D patients attack and destroy their own pancreatic cells. In order to systematically investigate the potential autoreactive T cell receptors (TCRs), we used a high-throughput immune repertoire sequencing technique to profile the spectrum of TCRs in individual T1D patients and controls.


With the rapid decline in the cost of sequencing, it is now affordable to examine multiple genes in a single disease-targeted clinical test using next generation sequencing. Current targeted sequencing methods require a separate step of targeted capture enrichment during sample preparation before sequencing. Although fast sample preparation methods are available on the market, the library preparation process is still relatively complicated for physicians to use routinely.


We present Virtual Pharmacist, a web-based platform that takes common types of high-throughput data, namely microarray SNP genotyping data, FASTQ and Variant Call Format (VCF) files, as inputs and reports potential drug responses in terms of efficacy, dosage and toxicity at a glance. Batch submission facilitates multivariate analysis or data mining of targeted groups. Individual analysis consists of a report that is readily comprehensible to patients and practitioners who have basic knowledge of pharmacology, a table that summarizes variants and potentially affected drug responses according to the US Food and Drug Administration pharmacogenomic biomarker labeled drug list and PharmGKB, and a visualization of a gene-drug-target network.
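
As a rough illustration of the variant-to-drug lookup step described above, the sketch below parses the fixed columns of VCF data lines and matches rsIDs against a small table. It is not Virtual Pharmacist's code or data: `DRUG_TABLE`, the function names, the sample records, the placeholder positions, and the ID `rs0000001` are all hypothetical. The three table entries are the commonly cited CYP2C9*2, CYP2C9*3 and VKORC1 -1639G>A variants associated with warfarin dosing.

```python
# Hypothetical lookup table: rsID -> (gene, drug, aspect of response affected).
# A real tool would draw this from curated sources such as PharmGKB.
DRUG_TABLE = {
    "rs1799853": ("CYP2C9", "warfarin", "dosage"),
    "rs1057910": ("CYP2C9", "warfarin", "dosage"),
    "rs9923231": ("VKORC1", "warfarin", "dosage"),
}


def parse_vcf_records(lines):
    """Yield (rsid, genotype) pairs from VCF lines, skipping header lines."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        rsid = fields[2]  # third fixed column is ID
        # Column 9 is FORMAT and column 10 the first sample; GT is listed first.
        genotype = fields[9].split(":")[0] if len(fields) > 9 else "."
        yield rsid, genotype


def carries_alt(genotype):
    """Return True if any called allele is non-reference."""
    alleles = genotype.replace("|", "/").split("/")
    return any(allele not in ("0", ".") for allele in alleles)


def report(lines):
    """List table hits for variants where the sample carries an ALT allele."""
    hits = []
    for rsid, genotype in parse_vcf_records(lines):
        if rsid in DRUG_TABLE and carries_alt(genotype):
            gene, drug, effect = DRUG_TABLE[rsid]
            hits.append({"rsid": rsid, "gene": gene, "drug": drug,
                         "effect": effect, "genotype": genotype})
    return hits


# Made-up records; positions are placeholders, not real coordinates.
EXAMPLE_VCF = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE1",
    "10\t100\trs1799853\tC\tT\t50\tPASS\t.\tGT\t0/1",
    "10\t200\trs1057910\tA\tC\t50\tPASS\t.\tGT\t0/0",
    "16\t300\trs9923231\tC\tT\t50\tPASS\t.\tGT\t1|1",
    "1\t400\trs0000001\tG\tA\t50\tPASS\t.\tGT\t0/1",
]

for hit in report(EXAMPLE_VCF):
    print(hit)
```

Only the first sample column is read, and homozygous-reference calls are dropped, so the made-up input yields hits for `rs1799853` and `rs9923231` only.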


Warfarin is a drug commonly used to prevent thrombosis and the formation of blood clots. The dosage of warfarin is strongly affected by genetic variants of the CYP2C9 and VKORC1 genes. Current technologies for detecting variants of these genes are mainly based on real-time PCR.


Single-cell genomic analysis has grown rapidly in recent years and finds widespread applications in various fields of biology, including cancer biology, development, immunology, pre-implantation genetic diagnosis, and neurobiology. To date, the amplification bias, amplification uniformity and reproducibility of the three major single cell whole genome amplification methods (GenomePlex WGA4, MDA and MALBAC) have not been systematically investigated using mammalian cells. In this study, we amplified genomic DNA from individual hippocampal neurons using three single-cell DNA amplification methods, and sequenced them at shallow depth.
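
One common way to quantify amplification uniformity from shallow sequencing is the coefficient of variation of read counts across fixed-size genomic bins. The sketch below shows that calculation; the excerpt does not say which metrics the study used, so the function names, bin size, and toy read positions are assumptions made for illustration.

```python
from statistics import mean, pstdev


def bin_counts(read_positions, genome_length, bin_size):
    """Count reads per fixed-size bin; positions are 0-based and < genome_length."""
    n_bins = -(-genome_length // bin_size)  # ceiling division
    counts = [0] * n_bins
    for pos in read_positions:
        counts[pos // bin_size] += 1
    return counts


def coefficient_of_variation(counts):
    """Population standard deviation divided by the mean; 0 means perfectly even."""
    mu = mean(counts)
    if mu == 0:
        raise ValueError("no reads in any bin")
    return pstdev(counts) / mu


# Toy data: an evenly covered sample and a sample with one over-amplified bin.
even = bin_counts([0, 5, 10, 15, 20, 25], genome_length=30, bin_size=10)
skewed = bin_counts([0, 1, 2, 3, 25], genome_length=30, bin_size=10)
print(even, coefficient_of_variation(even))
print(skewed, coefficient_of_variation(skewed))
```

A lower value indicates more even coverage. Real analyses would typically also correct for GC content and mappability before comparing methods.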


Profiling immune repertoires by high throughput sequencing enhances our understanding of immune system complexity and immune-related diseases in humans. Previously, cloning and Sanger sequencing identified limited numbers of T cell receptor (TCR) nucleotide sequences in rhesus monkeys, thus their full immune repertoire is unknown. We applied multiplex PCR and Illumina high throughput sequencing to study the TCRβ of rhesus monkeys.



Single cell genomics is a rapidly growing field with many new techniques emerging in the past few years. However, few bioinformatics tools specific for single cell genomics analysis are available. Single cell DNA/RNA sequencing data usually have low genome coverage and high amplification bias, which makes bioinformatics analysis challenging.


The human antibody repertoire is one of the most important defenses against infectious disease, and the development of vaccines has enabled the conferral of targeted protection against specific pathogens. However, there are many challenges to measuring and analyzing the immunoglobulin sequence repertoire: each B cell's genome encodes a distinct antibody sequence, the antibody repertoire changes over time, and antibody sequences are highly similar to one another. We have addressed these challenges by using high-throughput long read sequencing to perform immunogenomic characterization of expressed human antibody repertoires in the context of influenza vaccination.


We examine how the structure of the world trade network has been shaped by globalization and recessions over the last 40 years. We show that by treating the world trade network as an evolving system, theory predicts the trade network is more sensitive to recessionary shocks and recovers more slowly from them now than it did 40 years ago, due to structural changes in the world trade network induced by globalization. We also show that recession-induced change to the world trade network leads to an increased hierarchical structure of the global trade network for a few years after the recession.


Influenza has been circulating in the human population and caused three pandemics in the last century (1918 H1N1, 1957 H2N2 and 1968 H3N2). The 2009 A(H1N1) outbreak was classified by the World Health Organization as the fourth pandemic. Influenza has a high evolution rate, which makes vaccine design challenging.


Clustered regularly interspaced short palindromic repeats (CRISPR) in bacterial and archaeal DNA have recently been shown to be a new type of antiviral immune system in these organisms. Here, we study the diversity of spacers in CRISPR under selective pressure. We propose a population dynamics model that explains the biological observation that the leader-proximal end of CRISPR is more diversified and the leader-distal end of CRISPR is more conserved.


An open question in animal evolution is why the phylum- and superphylum-level body plans have changed so little, while the class- and family-level body plans have changed so greatly since the early Cambrian. Davidson and Erwin (Davidson and Erwin, 2006; Erwin and Davidson, 2009) proposed that the hierarchical structure of gene regulatory networks leads to different observed evolutionary rates for terminal properties of the body plan versus major aspects of body plan morphology. Here, we calculated the speed of evolution of genes in these gene regulatory networks.


We investigate the selective forces that promote the emergence of modularity in nature. We demonstrate the spontaneous emergence of modularity in a population of individuals that evolve in a changing environment. We show that the level of modularity correlates with the rapidity and severity of environmental change.
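
To make "level of modularity" concrete, the sketch below computes Newman's modularity Q for a given partition of an undirected network, one widely used definition. The excerpt does not state which measure the authors used, so this is an illustration of the general idea rather than their method; the function name and toy network are assumptions.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity Q = sum over communities of L_c/m - (d_c / 2m)^2.

    edges: list of (u, v) pairs for an undirected, unweighted graph.
    community: dict mapping each node to its community label.
    L_c is the number of edges inside community c, d_c the summed degree
    of its nodes, and m the total number of edges.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")
    internal = defaultdict(int)
    degree_sum = defaultdict(int)
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Toy network: two triangles joined by a single bridge edge (2, 3).
EDGES = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
TWO_MODULES = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
ONE_MODULE = {node: "a" for node in range(6)}

print(modularity(EDGES, TWO_MODULES))  # 5/14, about 0.357
print(modularity(EDGES, ONE_MODULE))   # 0.0
```

Splitting the toy network into its two triangles gives Q = 5/14, while lumping all nodes into one group gives Q = 0, so a higher Q reflects a more modular arrangement.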
