Publications by authors named "Bahar Alipanahi"

Circulating cell-free DNA (cfDNA) assays for monitoring individuals with cancer typically rely on prior identification of tumor-specific mutations. Here, we develop a tumor-independent and mutation-independent approach (DELFI-tumor fraction, DELFI-TF) using low-coverage whole-genome sequencing to determine the cfDNA tumor fraction and validate the method in two independent cohorts of patients with colorectal or lung cancer. DELFI-TF scores strongly correlate with circulating tumor DNA (ctDNA) levels (r = 0.

Motivation: The de Bruijn graph has become a ubiquitous graph model for biological data ever since its initial introduction in the late 1990s. It has been used for a variety of purposes including genome assembly (Zerbino and Birney, 2008; Bankevich et al., 2012; Peng et al.

Motivation: The de Bruijn graph is one of the fundamental data structures for analysis of high-throughput sequencing data. To be applicable to population-scale studies, the graph must be built and stored in a space- and time-efficient manner. In addition, due to the ever-changing nature of population studies, it has become essential to update the graph after construction, e.

Motivation: Metagenomics refers to the study of complex samples containing the genetic content of multiple individual organisms and, thus, has been used to elucidate the microbiome and resistome of a complex sample. The microbiome refers to all microbial organisms in a sample, and the resistome refers to all of the antimicrobial resistance (AMR) genes in pathogenic and non-pathogenic bacteria. Single-nucleotide polymorphisms (SNPs) can be effectively used to 'fingerprint' specific organisms and genes within the microbiome and resistome and trace their movement across various samples.

Motivation: There exist several large genomic and metagenomic data collection efforts, including GenomeTrakr and MetaSub, which are routinely updated with new data. To analyze such datasets, memory-efficient methods to construct and store the colored de Bruijn graph were developed. Yet, a problem that has not been considered is constructing the colored de Bruijn graph in a scalable manner that allows new data to be added without reconstruction.
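
As a rough illustration of the idea in this abstract, the sketch below builds a naive colored de Bruijn graph in which each k-mer (an edge between two (k-1)-mers) is labeled with the set of samples that contain it, so that a new sample can be added without rebuilding the graph. This is not the method of the paper: the class name `ColoredDeBruijnGraph`, its methods, and the hash-table representation are all assumptions made for illustration, and a plain dictionary is far from the memory-efficient structures the abstract refers to.

```python
from collections import defaultdict


def kmers(sequence, k):
    """Yield every length-k substring of sequence, left to right."""
    for i in range(len(sequence) - k + 1):
        yield sequence[i:i + k]


class ColoredDeBruijnGraph:
    """Naive colored de Bruijn graph (hypothetical, for illustration only).

    Each k-mer is an edge from its (k-1)-mer prefix to its (k-1)-mer suffix,
    labeled with the names ("colors") of the samples that contain it.
    """

    def __init__(self, k):
        self.k = k
        self.colors = defaultdict(set)  # k-mer -> set of sample names

    def add_sample(self, name, sequences):
        """Add a sample in place; existing edges are updated, not rebuilt."""
        for seq in sequences:
            for kmer in kmers(seq, self.k):
                self.colors[kmer].add(name)

    def successors(self, node):
        """Return (next_node, colors) for each edge leaving a (k-1)-mer node."""
        out = []
        for base in "ACGT":
            kmer = node + base
            if kmer in self.colors:  # membership test does not insert a key
                out.append((kmer[1:], self.colors[kmer]))
        return out
```

For example, after `graph = ColoredDeBruijnGraph(3)`, calling `graph.add_sample("s1", ["ACGT"])` and later `graph.add_sample("s2", ["ACGA"])` leaves the shared k-mer `ACG` labeled with both samples, while the edges out of node `CG` carry one color each.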

The characterization of antimicrobial resistance genes from high-throughput sequencing data has become foundational in public health research and regulation. This requires mapping sequence reads to databases of known antimicrobial resistance genes to determine the genes present in the sample. Mapping sequence reads to known genes is traditionally accomplished using alignment.

Motivation: Optical maps are high-resolution restriction maps (Rmaps) that give a unique numeric representation to a genome. Used in concert with sequence reads, they provide a useful tool for genome assembly and for discovering structural variations and rearrangements. Although they have been a regular feature of modern genome assembly projects, optical maps have mainly been used in a post-processing step and not in the genome assembly process itself.
