Introduction: Accurate genotyping of Killer cell Immunoglobulin-like Receptor (KIR) genes plays a pivotal role in enhancing our understanding of innate immune responses, disease correlations, and the advancement of personalized medicine. However, due to the high variability of the KIR region and high level of sequence similarity among different KIR genes, the generic genotyping workflows are unable to accurately infer copy numbers and complete genotypes of individual KIR genes from next-generation sequencing data. Thus, specialized genotyping tools are needed to genotype this complex region.
View Article and Find Full Text PDFWe found 18 patients with immunophenotype consistent with classic hairy cell leukemia (HCL) and BRAF mutations other than just V600E. Twelve had 1 non-V600E BRAF mutation and 6 had V600E with 1 (n=5) or 2 (n=1) non-V600E BRAF co-mutations.
View Article and Find Full Text PDFCancer progression is an evolutionary process driven by the selection of cells adapted to gain growth advantage. We present a formal study on the adaptation of gene expression in subclonal evolution. We model evolutionary changes in gene expression as stochastic Ornstein-Uhlenbeck processes, jointly leveraging the evolutionary history of subclones and single-cell expression data.
View Article and Find Full Text PDFSummary: Natural killer (NK) cells are essential components of the innate immune system, with their activity significantly regulated by Killer cell Immunoglobulin-like Receptors (KIRs). The diversity and structural complexity of KIR genes present significant challenges for accurate genotyping, essential for understanding NK cell functions and their implications in health and disease. Traditional genotyping methods struggle with the variable nature of KIR genes, leading to inaccuracies that can impede immunogenetic research.
View Article and Find Full Text PDFNatural killer (NK) cells are essential components of the innate immune system, with their activity significantly regulated by Killer cell Immunoglobulin-like Receptors (KIRs). The diversity and structural complexity of KIR genes present significant challenges for accurate genotyping, essential for understanding NK cell functions and their implications in health and disease. Traditional genotyping methods struggle with the variable nature of KIR genes, leading to inaccuracies that can impede immunogenetic research.
View Article and Find Full Text PDFThe introduction of trusted execution environments (TEEs), such as secure enclaves provided by the Intel SGX technology has enabled secure and privacy-preserving computation on the cloud. The stringent resource limitations, such as memory constraints, required by some TEEs necessitates the development of computational approaches with reduced memory usage, such as sketching. One example is the SkSES method for GWAS on a cohort of case and control samples from multiple institutions, which identifies the most significant SNPs in a privacy-preserving manner without disclosing sensitive genotype information to other institutions or the cloud service provider.
View Article and Find Full Text PDFThe study of the tumor microbiome has been garnering increased attention. We developed a computational pipeline (CSI-Microbes) for identifying microbial reads from single-cell RNA sequencing (scRNA-seq) data and for analyzing differential abundance of taxa. Using a series of controlled experiments and analyses, we performed the first systematic evaluation of the efficacy of recovering microbial unique molecular identifiers by multiple scRNA-seq technologies, which identified the newer 10x chemistries (3' v3 and 5') as the best suited approach.
View Article and Find Full Text PDFThis Voices piece will highlight the impact of artificial intelligence on algorithm development among computational biologists. How has worldwide focus on AI changed the path of research in computational biology? What is the impact on the algorithmic biology research community?
View Article and Find Full Text PDFMost current studies rely on short-read sequencing to detect somatic structural variation (SV) in cancer genomes. Long-read sequencing offers the advantage of better mappability and long-range phasing, which results in substantial improvements in germline SV detection. However, current long-read SV detection methods do not generalize well to the analysis of somatic SVs in tumor genomes with complex rearrangements, heterogeneity, and aneuploidy.
View Article and Find Full Text PDFAccurate genotyping of Killer cell Immunoglobulin-like Receptor (KIR) genes plays a pivotal role in enhancing our understanding of innate immune responses, disease correlations, and the advancement of personalized medicine. However, due to the high variability of the KIR region and high level of sequence similarity among different KIR genes, the currently available genotyping methods are unable to accurately infer copy numbers, genotypes and haplotypes of individual KIR genes from next-generation sequencing data. Here we introduce Geny, a new computational tool for precise genotyping of KIR genes.
View Article and Find Full Text PDFBackground: Next-generation sequencing (NGS), including whole genome sequencing (WGS) and whole exome sequencing (WES), is increasingly being used for clinic care. While NGS data have the potential to be repurposed to support clinical pharmacogenomics (PGx), current computational approaches have not been widely validated using clinical data. In this study, we assessed the accuracy of the Aldy computational method to extract PGx genotypes from WGS and WES data for 14 and 13 major pharmacogenes, respectively.
View Article and Find Full Text PDFIntratumoral heterogeneity (ITH) can promote cancer progression and treatment failure, but the complexity of the regulatory programs and contextual factors involved complicates its study. To understand the specific contribution of ITH to immune checkpoint blockade (ICB) response, we generated single cell-derived clonal sublines from an ICB-sensitive and genetically and phenotypically heterogeneous mouse melanoma model, M4. Genomic and single cell transcriptomic analyses uncovered the diversity of the sublines and evidenced their plasticity.
View Article and Find Full Text PDFRecent studies exploring the impact of methylation in tumor evolution suggest that although the methylation status of many of the CpG sites are preserved across distinct lineages, others are altered as the cancer progresses. Because changes in methylation status of a CpG site may be retained in mitosis, they could be used to infer the progression history of a tumor via single-cell lineage tree reconstruction. In this work, we introduce the first principled distance-based computational method, Sgootr, for inferring a tumor's single-cell methylation lineage tree and for jointly identifying lineage-informative CpG sites that harbor changes in methylation status that are retained along the lineage.
View Article and Find Full Text PDFHigh-throughput sequencing provides sufficient means for determining genotypes of clinically important pharmacogenes that can be used to tailor medical decisions to individual patients. However, pharmacogene genotyping, also known as star-allele calling, is a challenging problem that requires accurate copy number calling, structural variation identification, variant calling, and phasing within each pharmacogene copy present in the sample. Here we introduce Aldy 4, a fast and efficient tool for genotyping pharmacogenes that uses combinatorial optimization for accurate star-allele calling across different sequencing technologies.
View Article and Find Full Text PDFComputational identification and quantification of distinct microbes from high throughput sequencing data is crucial for our understanding of human health. Existing methods either use accurate but computationally expensive alignment-based approaches or less accurate but computationally fast alignment-free approaches, which often fail to correctly assign reads to genomes. Here we introduce CAMMiQ, a combinatorial optimization framework to identify and quantify distinct genomes (specified by a database) in a metagenomic dataset.
View Article and Find Full Text PDFHuman immunoglobulin heavy chain (IGH) locus on chromosome 14 includes more than 40 functional copies of the variable gene (IGHV), which are critical for the structure of antibodies that identify and neutralize pathogenic invaders as a part of the adaptive immune system. Because of its highly repetitive sequence composition, the IGH locus has been particularly difficult to assemble or genotype when using standard short-read sequencing technologies. Here, we introduce ImmunoTyper-SR, an algorithmic tool for the genotyping and CNV analysis of the germline IGHV genes on Illumina whole-genome sequencing (WGS) data using a combinatorial optimization formulation that resolves ambiguous read mappings.
View Article and Find Full Text PDFHistorically, the primary focus of cancer research has been molecular and clinical studies of a few essential pathways and genes. Recent years have seen the rapid accumulation of large-scale cancer omics data catalysed by breakthroughs in high-throughput technologies. This fast data growth has given rise to an evolving concept of 'big data' in cancer, whose analysis demands large computational resources and can potentially bring novel insights into essential questions.
View Article and Find Full Text PDFGermline whole exome sequencing from molecular tumor boards has the potential to be repurposed to support clinical pharmacogenomics. However, accurately calling pharmacogenomics-relevant genotypes from exome sequencing data remains challenging. Accordingly, this study assessed the analytical validity of the computational tool, Aldy, in calling pharmacogenomics-relevant genotypes from exome sequencing data for 13 major pharmacogenes.
View Article and Find Full Text PDFPharmacogenetic tests typically target selected sequence variants to identify haplotypes that are often defined by star (∗) allele nomenclature. Due to their design, these targeted genotyping assays are unable to detect novel variants that may change the function of the gene product and thereby affect phenotype prediction and patient care. In the current study, 137 DNA samples that were previously characterized by the Genetic Testing Reference Material (GeT-RM) program using a variety of targeted genotyping methods were recharacterized using targeted and whole genome sequencing analysis.
View Article and Find Full Text PDFHuman immunoglobulin heavy chain (IGH) locus on chromosome 14 includes more than 40 functional copies of the variable gene (IGHV), which, together with the joining genes (IGHJ), diversity genes (IGHD), constant genes (IGHC) and immunoglobulin light chains, code for antibodies that identify and neutralize pathogenic invaders as a part of the adaptive immune system. Because of its highly repetitive sequence composition, the IGH locus has been particularly difficult to assemble or genotype through the use of standard short read sequencing technologies. Here we introduce ImmunoTyper-SR, an algorithmic method for genotype and CNV analysis of the germline IGHV genes using Illumina whole genome sequencing (WGS) data.
View Article and Find Full Text PDFCancer results from an evolutionary process that yields a heterogeneous tumor with distinct subpopulations and varying sets of somatic mutations. This perspective discusses computational methods to infer models of evolutionary processes in cancer that aim to improve our understanding of tumorigenesis and ultimately enhance current clinical practice.
View Article and Find Full Text PDF