Publications by authors named "Lorin Crawford"

High-throughput phenotypic screens using biochemical perturbations and high-content readouts are constrained by limitations of scale. To address this, we establish a method of pooling exogenous perturbations followed by computational deconvolution to reduce required sample size, labor and cost. We demonstrate the increased efficiency of compressed experimental designs compared to conventional approaches through benchmarking with a bioactive small-molecule library and a high-content imaging readout.
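
The pooling-then-deconvolution idea can be illustrated with a toy sketch. The abstract does not specify the pooling design or the deconvolution model, so everything here is an assumption: pool effects are taken to be additive and noise-free, the design places 12 hypothetical perturbations into 8 pools using the edges of a cube, and "deconvolution" is a simple marginal contrast rather than the authors' method.

```python
import numpy as np

N_POOLS = 8  # hypothetical: pools are the 8 vertices of a cube

# Each perturbation is an edge of the cube, so it lands in exactly 2 pools
# and no two perturbations share both pools (assumed design, not the paper's).
edges = [(v, v ^ (1 << b)) for v in range(N_POOLS) for b in range(3)
         if v < (v ^ (1 << b))]

# design[p, j] = 1 when perturbation j is present in pool p
design = np.zeros((N_POOLS, len(edges)))
for j, (a, b) in enumerate(edges):
    design[a, j] = design[b, j] = 1.0

def marginal_scores(design, readout):
    """Mean readout of pools containing a perturbation minus the mean of the rest."""
    present = design.astype(bool)
    return np.array([readout[present[:, j]].mean() - readout[~present[:, j]].mean()
                     for j in range(design.shape[1])])

# One active perturbation (index 4) with an assumed additive effect of 5.0
true_effects = np.zeros(len(edges))
true_effects[4] = 5.0
pooled_readout = design @ true_effects  # 8 measurements instead of 12

scores = marginal_scores(design, pooled_readout)
top_hit = int(np.argmax(scores))
```

In this toy setting 8 pooled measurements stand in for 12 individual ones, and the active perturbation receives the highest score. Real screens are noisy and effects need not be additive, so a regularized regression or similar model would likely be needed in practice.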

Article Synopsis
  • Researchers studied 9,902 SARS-CoV-2 infections over two years to understand how genetic variations in the virus and factors like host age and vaccination status affect viral copies.
  • They used a genome-wide association study (GWAS) to find specific genetic changes (SNPs) in the virus correlated with higher or lower viral copies, particularly noting interactions between these SNPs.
  • The study revealed that SNPs linked to higher viral loads were mainly seen in Delta and Omicron variants, while those linked to lower loads were found in Omicron BA.1, suggesting the potential for GWAS to analyze other pathogens and their variants.

In this paper, we aim to build a platform that will help bridge the gap between high-dimensional computation and wet-lab experimentation by allowing users to interrogate genomic signatures at multiple molecular levels and identify best next actionable steps for downstream decision making. We introduce Multioviz: a publicly accessible R package and web application platform to easily perform in silico hypothesis testing of generated gene regulatory networks. We demonstrate the utility of Multioviz by conducting an end-to-end analysis in a statistical genetics application focused on measuring the effect of in silico perturbations of complex trait architecture.


Efforts to cure BCR::ABL1 B cell acute lymphoblastic leukemia (Ph+ ALL) solely through inhibition of ABL1 kinase activity have thus far been insufficient despite the availability of tyrosine kinase inhibitors (TKIs) with broad activity against resistance mutants. The mechanisms that drive persistence within minimal residual disease (MRD) remain poorly understood and therefore untargeted. Utilizing 13 patient-derived xenograft (PDX) models and clinical trial specimens of Ph+ ALL, we examined how genetic and transcriptional features co-evolve to drive progression during prolonged TKI response.


LD score regression (LDSC) is a method to estimate narrow-sense heritability from genome-wide association study (GWAS) summary statistics alone, making it a fast and popular approach. In this work, we present interaction-LD score (i-LDSC) regression: an extension of the original LDSC framework that accounts for interactions between genetic variants. By studying a wide range of generative models in simulations, and by re-analyzing 25 well-studied quantitative phenotypes from 349,468 individuals in the UK Biobank and up to 159,095 individuals in BioBank Japan, we show that the inclusion of a cis-interaction score (i.


Clustering is commonly used in single-cell RNA-sequencing (scRNA-seq) pipelines to characterize cellular heterogeneity. However, current methods face two main limitations. First, they require user-specified heuristics which add time and complexity to bioinformatic workflows; second, they rely on post-selective differential expression analyses to identify marker genes driving cluster differences, which has been shown to be subject to inflated false discovery rates.


Understanding morphological variation is an important task in many areas of computational biology. Recent studies have focused on developing computational tools for the task of sub-image selection which aims at identifying structural features that best describe the variation between classes of shapes. A major part in assessing the utility of these approaches is to demonstrate their performance on both simulated and real datasets.


Epistasis, commonly defined as the interaction between genetic loci, is known to play an important role in the phenotypic variation of complex traits. As a result, many statistical methods have been developed to identify genetic variants that are involved in epistasis, and nearly all of these approaches carry out this task by focusing on analyzing one trait at a time. Previous studies have shown that jointly modeling multiple phenotypes can often dramatically increase statistical power for association mapping.


Natural products are chemical compounds that form the basis of many therapeutics used in the pharmaceutical industry. In microbes, natural products are synthesized by groups of colocalized genes called biosynthetic gene clusters (BGCs). With advances in high-throughput sequencing, there has been an increase of complete microbial isolate genomes and metagenomes, from which a vast number of BGCs are undiscovered.


In this paper, we propose a new approach for variable selection using a collection of Bayesian neural networks with a focus on quantifying uncertainty over which variables are selected. Motivated by fine-mapping applications in statistical genetics, we refer to our framework as an "ensemble of single-effect neural networks" (ESNN) which generalizes the "sum of single effects" regression framework by both accounting for nonlinear structure in genotypic data (e.g.


Identifying structural differences among proteins can be a non-trivial task. When contrasting ensembles of protein structures obtained from molecular dynamics simulations, biologically-relevant features can be easily overshadowed by spurious fluctuations. Here, we present SINATRA Pro, a computational pipeline designed to robustly identify topological differences between two sets of protein structures.


Since 2005, genome-wide association (GWA) datasets have been largely biased toward sampling European ancestry individuals, and recent studies have shown that GWA results estimated from self-identified European individuals are not transferable to non-European individuals because of various confounding challenges. Here, we demonstrate that enrichment analyses that aggregate SNP-level association statistics at multiple genomic scales (from genes to genomic regions and pathways) have been underutilized in the GWA era and can generate biologically interpretable hypotheses regarding the genetic basis of complex trait architecture. We illustrate examples of the robust associations generated by enrichment analyses while studying 25 continuous traits assayed in 566,786 individuals from seven diverse self-identified human ancestries in the UK Biobank and the Biobank Japan as well as 44,348 admixed individuals from the PAGE consortium including cohorts of African American, Hispanic and Latin American, Native Hawaiian, and American Indian/Alaska Native individuals.


Prognostically relevant RNA expression states exist in pancreatic ductal adenocarcinoma (PDAC), but our understanding of their drivers, stability, and relationship to therapeutic response is limited. To examine these attributes systematically, we profiled metastatic biopsies and matched organoid models at single-cell resolution. In vivo, we identify a new intermediate PDAC transcriptional cell state and uncover distinct site- and state-specific tumor microenvironments (TMEs).


In this article, we present Biologically Annotated Neural Networks (BANNs), a nonlinear probabilistic framework for association mapping in genome-wide association (GWA) studies. BANNs are feedforward models with partially connected architectures that are based on biological annotations. This setup yields a fully interpretable neural network where the input layer encodes SNP-level effects, and the hidden layer models the aggregated effects among SNP-sets.
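
The partially connected architecture described above can be sketched as a masked forward pass. This is only an illustration under stated assumptions: the SNP-to-set annotation, the weights, and the ReLU activation are all hypothetical, and the sketch is deterministic, whereas BANNs are a probabilistic framework whose inference procedure is not shown here.

```python
import numpy as np

def masked_forward(genotypes, weights, mask, hidden_bias):
    """Hidden-layer activations where only annotated SNP-to-set links are active."""
    pre_activation = genotypes @ (weights * mask) + hidden_bias
    return np.maximum(pre_activation, 0.0)  # ReLU is an assumed activation

# Hypothetical annotation: 5 SNPs grouped into 2 SNP-sets (e.g. genes)
snp_sets = {0: [0, 1, 2], 1: [3, 4]}
mask = np.zeros((5, 2))
for set_index, snp_indices in snp_sets.items():
    mask[snp_indices, set_index] = 1.0  # connect each SNP only to its own set

weights = np.ones((5, 2))  # placeholder SNP-level effects
genotypes = np.array([[0, 1, 2, 0, 1],
                      [2, 0, 0, 1, 1]], dtype=float)  # 2 individuals, 5 SNPs

hidden = masked_forward(genotypes, weights, mask, np.zeros(2))
```

Because the mask zeroes every unannotated connection, each hidden unit aggregates only the SNPs assigned to its SNP-set, which is what makes the hidden layer readable as a set-level effect.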


Large-scale phenotype data can enhance the power of genomic prediction in plant and animal breeding, as well as human genetics. However, the statistical foundation of multi-trait genomic prediction is based on the multivariate linear mixed effect model, a tool notorious for its fragility when applied to more than a handful of traits. We present MegaLMM, a statistical framework and associated software package for mixed model analyses of a virtually unlimited number of traits.


The winged insects of the order Diptera are colloquially named for their most recognizable phenotype: flight. These insects rely on flight for a number of important life history traits, such as dispersal, foraging, and courtship. Despite the importance of flight, relatively little is known about the genetic architecture of flight performance.


Glycogen synthase kinase-3β (GSK-3β), a serine/threonine kinase, has been implicated in the pathogenesis of many cancers, with involvement in cell-cycle regulation, apoptosis, and immune response. Small-molecule GSK-3β inhibitors are currently undergoing clinical investigation. Tumor sequencing has revealed genomic alterations in GSK3B, yet an assessment of the genomic landscape in malignancies is lacking.


The central aim in this paper is to address variable selection questions in nonlinear and nonparametric regression. Motivated by statistical genetics, where nonlinear interactions are of particular interest, we introduce a novel and interpretable way to summarize the relative importance of predictor variables. Methodologically, we develop the "RelATive cEntrality" (RATE) measure to prioritize candidate genetic variants that are not just marginally important, but whose associations also stem from significant covarying relationships with other variants in the data.


Traditional univariate genome-wide association studies generate false positives and negatives due to difficulties distinguishing associated variants from variants with spurious nonzero effects that do not directly influence the trait. Recent efforts have been directed at identifying genes or signaling pathways enriched for mutations in quantitative traits or case-control studies, but these can be computationally costly and hampered by strict model assumptions. Here, we present gene-ε, a new approach for identifying statistical associations between sets of variants and quantitative traits.


Here we demonstrate a technique to generate proteomic signatures of specific cell types within heterogeneous populations. While our method is broadly applicable across biological systems, we have limited the current work to study neural cell types isolated from human, post-mortem Alzheimer's disease (AD) and age-matched non-symptomatic (NS) brains. Motivating the need for this tool, we conducted an initial meta-analysis of current, human AD proteomics studies.

Article Synopsis
  • Local adaptation helps organisms survive in specific environments by developing useful traits, but fast changes in the environment can sometimes create problems instead of solutions.
  • Researchers studied how cancer cells, specifically acute myeloid leukemia (AML), change when exposed to different treatments, revealing a complicated relationship between drug resistance and sensitivity.
  • They discovered that certain genetic pathways in these cells can make them resistant to one type of treatment while making them more vulnerable to another, which could help develop better cancer therapies that target this weakness.

Many different subpopulations of subcellular extracellular vesicles (EVs) have been described. EVs are released from all cell types and have been shown to regulate normal physiological homeostasis, as well as pathological states by influencing cell proliferation, differentiation, organ homing, injury and recovery, as well as disease progression. In this review, we focus on the bidirectional actions of vesicles from normal and diseased cells on normal or leukemic target cells; and on the leukemic microenvironment as a whole.


Nonlinear kernel regression models are often used in statistics and machine learning because they are more accurate than linear models. Variable selection for kernel regression models is a challenge partly because, unlike the linear regression setting, there is no clear concept of an effect size for regression coefficients. In this paper, we propose a novel framework that provides an effect size analog for each explanatory variable in Bayesian kernel regression models when the kernel is shift-invariant (for example, the Gaussian kernel).
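
As a brief illustration of what shift-invariance means, the sketch below evaluates a Gaussian kernel before and after translating both inputs by the same vector. The bandwidth parameterization and the example vectors are assumptions chosen for the illustration and may differ from the paper's notation.

```python
import numpy as np

def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel exp(-||x - y||^2 / (2 * bandwidth^2)); depends only on x - y."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / (2.0 * bandwidth ** 2)))

# Hypothetical inputs and an arbitrary common shift
x = np.array([0.5, -1.0])
y = np.array([2.0, 0.0])
shift = np.array([10.0, -3.0])

k_original = gaussian_kernel(x, y)
k_shifted = gaussian_kernel(x + shift, y + shift)  # same difference x - y
```

Both evaluations agree because the kernel value is a function of the difference x - y alone, which is the property the proposed framework relies on.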


Linear mixed effect models are powerful tools used to account for population structure in genome-wide association studies (GWASs) and estimate the genetic architecture of complex traits. However, fully-specified models are computationally demanding and common simplifications often lead to reduced power or biased inference. We describe Grid-LMM (https://github.
