Publications by authors named "Hugo Y K Lam"

Motivation: While many quantum computing (QC) methods promise theoretical advantages over classical counterparts, quantum hardware remains limited. Exploiting near-term QC in computer-aided drug design (CADD) thus requires judicious partitioning between classical and quantum calculations.

Results: We present HypaCADD, a hybrid classical-quantum workflow for finding ligands binding to proteins, while accounting for genetic mutations.

Objectives: We examined differences in pre-left ventricular assist device (LVAD) implantation myocardial transcriptome signatures among patients with different degrees of mitral regurgitation (MR).

Methods: Between January 2018 and October 2019, we collected left ventricular (LV) cores during durable LVAD implantation (n = 72). A retrospective chart review was performed.

The lack of samples for generating standardized DNA datasets for setting up a sequencing pipeline or benchmarking the performance of different algorithms limits the implementation and uptake of cancer genomics. Here, we describe reference call sets obtained from paired tumor-normal genomic DNA (gDNA) samples derived from a breast cancer cell line (which is highly heterogeneous, with an aneuploid genome, and enriched in somatic alterations) and a matched lymphoblastoid cell line. We partially validated both somatic mutations and germline variants in these call sets via whole-exome sequencing (WES) with different sequencing platforms and targeted sequencing with >2,000-fold coverage, spanning 82% of genomic regions with high confidence.

Background: Dysfunction and inflammation of hearts subjected to cold ischemic preservation may differ between left and right ventricles, suggesting distinct strategies for amelioration.

Methods and Results: Explanted murine hearts subjected to cold ischemia for 0, 4, or 8 h in preservation solution were assessed for function during 60 min of warm perfusion and then analyzed for cell death and inflammation by immunohistochemistry, western blotting, and total RNA sequencing. Increased cold ischemic times led to greater left ventricle (LV) dysfunction compared to right ventricle (RV).

Background: Ischemic tolerance of donor hearts has a major impact on utilization efficiency and clinical outcomes. Molecular events during storage may influence the severity of ischemic injury.

Methods: RNA sequencing was used to study the transcriptional profile of the human left ventricle (LV, n=4) and right ventricle (RV, n=4) after 0, 4, and 8 hours of cold storage in histidine-tryptophan-ketoglutarate preservation solution.

Tumor Mutational Burden (TMB) is a measure of the abundance of somatic mutations in a tumor, which has been shown to be an emerging biomarker for both anti-PD-(L)1 treatment and prognosis; however, multiple challenges still hinder the adoption of TMB as a biomarker. The key challenges are the inconsistency of tumor mutational burden measurement among assays and the lack of a meaningful threshold for TMB classification. Here we describe a new method, ecTMB (Estimation and Classification of TMB), which uses an explicit background mutation model to predict TMB robustly and to classify samples into biologically meaningful subtypes defined by tumor mutational burden.
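
For orientation, the conventional count-based form of TMB (somatic mutations per megabase of covered sequence) can be sketched as below. This is not the ecTMB method, which the abstract says relies on an explicit background mutation model; it is only the naive baseline that such methods aim to improve on. The function names and the default threshold of 10 mutations/Mb are illustrative assumptions, not values taken from the article.

```python
def tumor_mutational_burden(num_somatic_mutations: int, covered_bases: int) -> float:
    """Return somatic mutations per megabase of covered sequence (naive count-based TMB)."""
    if covered_bases <= 0:
        raise ValueError("covered_bases must be positive")
    return num_somatic_mutations / (covered_bases / 1_000_000)


def classify_tmb(tmb: float, threshold: float = 10.0) -> str:
    """Label a sample by comparing TMB to a cutoff.

    The default threshold is a hypothetical placeholder; a meaningful cutoff
    depends on the assay and cohort.
    """
    return "TMB-high" if tmb >= threshold else "TMB-low"


if __name__ == "__main__":
    # 45 somatic mutations observed over a 1.5 Mb panel
    tmb = tumor_mutational_burden(45, 1_500_000)
    print(tmb, classify_tmb(tmb))
```

Because the denominator is the assay's covered region, the same tumor can yield different naive TMB values on different panels, which is one form of the inconsistency the abstract mentions.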

Accurate detection of somatic mutations is still a challenge in cancer analysis. Here we present NeuSomatic, the first convolutional neural network approach for somatic mutation detection, which significantly outperforms previous methods on different sequencing platforms, sequencing strategies, and tumor purities. NeuSomatic summarizes sequence alignments into small matrices and incorporates more than a hundred features to capture mutation signals effectively.
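
The idea of summarizing an alignment into a small matrix can be illustrated with a toy per-column symbol count. This is a simplified sketch under stated assumptions, not NeuSomatic's actual encoding: the reads are assumed to be already aligned and gap-padded to equal width, and the row order `BASES` and the function name `summarize_alignment` are hypothetical.

```python
from __future__ import annotations

# Row order of the summary matrix; "-" marks an alignment gap (assumed convention).
BASES = "ACGT-"


def summarize_alignment(reads: list[str]) -> list[list[int]]:
    """Count each symbol per alignment column; rows follow BASES order."""
    if not reads:
        return [[] for _ in BASES]
    width = len(reads[0])
    if any(len(read) != width for read in reads):
        raise ValueError("all reads must be padded to the same width")
    matrix = [[0] * width for _ in BASES]
    for read in reads:
        for col, symbol in enumerate(read.upper()):
            row = BASES.find(symbol)
            if row >= 0:  # symbols outside BASES (for example N) are ignored
                matrix[row][col] += 1
    return matrix


if __name__ == "__main__":
    for base, counts in zip(BASES, summarize_alignment(["ACGT", "ACGA", "A-GT"])):
        print(base, counts)
```

A fixed-size matrix of this kind is what makes a pileup of variable depth usable as input to a convolutional network; the additional features the abstract mentions would be stacked as further channels.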

Article Synopsis
  • The Atacama humanoid skeleton, known as Ata, was found in Chile over a decade ago and exhibited unusual physical traits, leading to speculations about its origin as a nonhuman primate, a mutated human fetus, or even an extraterrestrial being.
  • DNA analysis confirmed that Ata was human, specifically a female of likely Chilean ancestry, with a bone age estimated to be between 6 and 8 years at the time of death.
  • Whole-genome sequencing revealed numerous genetic mutations linked to conditions affecting small stature and skeletal development, suggesting that Ata's unique physical characteristics arise from a combination of known and novel gene mutations.

The human genome is generally organized into stable chromosomes, and only tumor cells are known to accumulate kilobase (kb)-sized extrachromosomal circular DNA elements (eccDNAs). However, it must be expected that kb eccDNAs exist in normal cells as a result of mutations. Here, we purify and sequence eccDNAs from muscle and blood samples from 16 healthy men, detecting ~100,000 unique eccDNA types from 16 million nuclei.

RNA-sequencing (RNA-seq) is an essential technique for transcriptome studies, and hundreds of analysis tools have been developed since its debut. Although recent efforts have attempted to assess the latest available tools, they have not evaluated the analysis workflows comprehensively to unleash the power within RNA-seq. Here we conduct an extensive study analysing a broad spectrum of RNA-seq workflows.

The CAGI-4 Hopkins clinical panel challenge was an attempt to assess state-of-the-art methods for clinical phenotype prediction from DNA sequence. Participants were provided with exonic sequences of 83 genes for 106 patients from the Johns Hopkins DNA Diagnostic Laboratory. Five groups participated in the challenge, predicting both the probability that each patient had each of the 14 possible disease classes and one or more causal variants.

LongISLND is a software package designed to simulate sequencing data according to the characteristics of third generation, single-molecule sequencing technologies. The general software architecture is easily extendable, as demonstrated by the emulation of Pacific Biosciences (PacBio) multi-pass sequencing with P5 and P6 chemistries, producing data in FASTQ, H5, and the latest PacBio BAM format. We demonstrate its utility by downstream processing with consensus building and variant calling.

Background: The human genome contains variants ranging in size from small single nucleotide polymorphisms (SNPs) to large structural variants (SVs). High-quality benchmark small variant calls for the pilot National Institute of Standards and Technology (NIST) Reference Material (NA12878) have been developed by the Genome in a Bottle Consortium, but no similar high-quality benchmark SV calls exist for this genome. Since SV callers output highly discordant results, we developed methods to combine multiple forms of evidence from multiple sequencing technologies to classify candidate SVs into likely true or false positives.

Structural variants are implicated in numerous diseases and make up the majority of varying nucleotides among human genomes. Here we describe an integrated set of eight structural variant classes comprising both balanced and unbalanced variants, which we constructed using short-read DNA sequencing data and statistically phased onto haplotype blocks in 26 human populations. Analysing this set, we identify numerous gene-intersecting structural variants exhibiting population stratification and describe naturally occurring homozygous gene knockouts that suggest the dispensability of a variety of human genes.

A high-confidence, comprehensive human variant set is critical in assessing accuracy of sequencing algorithms, which are crucial in precision medicine based on high-throughput sequencing. Although recent works have attempted to provide such a resource, they still do not encompass all major types of variants including structural variants (SVs). Thus, we leveraged the massive high-quality Sanger sequences from the HuRef genome to construct by far the most comprehensive gold set of a single individual, which was cross validated with deep Illumina sequencing, population datasets, and well-established algorithms.

SomaticSeq is an accurate somatic mutation detection pipeline implementing a stochastic boosting algorithm to produce highly accurate somatic mutation calls for both single nucleotide variants and small insertions and deletions. The workflow currently incorporates five state-of-the-art somatic mutation callers, and extracts over 70 individual genomic and sequencing features for each candidate site. A training set is provided to an adaptively boosted decision tree learner to create a classifier for predicting mutation statuses.

Investigating genomic structural variants at basepair resolution is crucial for understanding their formation mechanisms. We identify and analyse 8,943 deletion breakpoints in 1,092 samples from the 1000 Genomes Project. We find breakpoints have more nearby SNPs and indels than the genomic average, likely a consequence of relaxed selection.
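
The comparison behind the last sentence, variant density near breakpoints versus the genome-wide average, can be sketched as follows. This is a minimal illustration assuming a single chromosome with integer coordinates; the function names, the flank size, and the toy positions are hypothetical and do not come from the study.

```python
from bisect import bisect_left, bisect_right


def count_in_window(sorted_positions, center, flank):
    """Count positions within [center - flank, center + flank], bounds inclusive."""
    lo = bisect_left(sorted_positions, center - flank)
    hi = bisect_right(sorted_positions, center + flank)
    return hi - lo


def density_near_breakpoints(snp_positions, breakpoints, flank):
    """Mean SNPs per base inside the +/- flank window around each breakpoint."""
    if not breakpoints:
        raise ValueError("at least one breakpoint is required")
    snps = sorted(snp_positions)
    window = 2 * flank + 1
    total = sum(count_in_window(snps, bp, flank) for bp in breakpoints)
    return total / (window * len(breakpoints))


def genomic_density(snp_positions, genome_length):
    """SNPs per base across the whole sequence (the baseline for comparison)."""
    return len(snp_positions) / genome_length


if __name__ == "__main__":
    snps = [100, 105, 110, 500, 900]  # toy SNP coordinates
    near = density_near_breakpoints(snps, breakpoints=[104], flank=10)
    print(near > genomic_density(snps, genome_length=1000))
```

Sorting once and using binary search keeps each window query logarithmic in the number of variants, which matters when thousands of breakpoints are tested against a genome-wide call set.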

Structural variations (SVs) are large genomic rearrangements that vary significantly in size, making them challenging to detect with the relatively short reads from next-generation sequencing (NGS). Different SV detection methods have been developed; however, each is limited to specific kinds of SVs with varying accuracy and resolution. Previous works have attempted to combine different methods, but they still suffer from poor accuracy, particularly for insertions.

Summary: VarSim is a framework for assessing alignment and variant calling accuracy in high-throughput genome sequencing through simulation or real data. In contrast to simulating a random mutation spectrum, it synthesizes diploid genomes with germline and somatic mutations based on a realistic model. This model leverages information such as previously reported mutations to make the synthetic genomes biologically relevant.

Personalized medicine is expected to benefit from combining genomic information with regular monitoring of physiological states by multiple high-throughput methods. Here, we present an integrative personal omics profile (iPOP), an analysis that combines genomic, transcriptomic, proteomic, metabolomic, and autoantibody profiles from a single individual over a 14 month period. Our iPOP analysis revealed various medical risks, including type 2 diabetes.

Whole-genome sequencing is becoming commonplace, but the accuracy and completeness of variant calling by the most widely used platforms from Illumina and Complete Genomics have not been reported. Here we sequenced the genome of an individual with both technologies to a high average coverage of ∼76×, and compared their performance with respect to sequence coverage and calling of single-nucleotide variants (SNVs), insertions and deletions (indels). Although 88.

Whole exome sequencing by high-throughput sequencing of target-enriched genomic DNA (exome-seq) has become common in basic and translational research as a means of interrogating the interpretable part of the human genome at relatively low cost. We present a comparison of three major commercial exome sequencing platforms from Agilent, Illumina and Nimblegen applied to the same human blood sample. Our results suggest that the Nimblegen platform, which is the only one to use high-density overlapping baits, covers fewer genomic regions than the other platforms but requires the least amount of sequencing to sensitively detect small variants.

As a consequence of the accumulation of insertion events over evolutionary time, mobile elements now comprise nearly half of the human genome. The Alu, L1, and SVA mobile element families are still duplicating, generating variation between individual genomes. Mobile element insertions (MEI) have been identified as causes of genetic diseases, including hemophilia, neurofibromatosis, and various cancers.
