Small cell lung cancer (SCLC) is an aggressive neuroendocrine tumor characterized by a high metastatic potential with an overall survival rate of ~5%. The transcription factor signal transducer and activator of transcription 3 (STAT3) is overexpressed by >50% of tumors, including SCLC, but its role in SCLC development and metastasis is unclear. Here, we show that, while STAT3 deletion restricts primary tumor growth, it paradoxically enhances metastatic spread by promoting immune evasion.
Three billion years of evolution has produced a tremendous diversity of protein molecules, but the full potential of proteins is likely to be much greater. Accessing this potential has been challenging for both computation and experiments because the space of possible protein molecules is much larger than the space of those likely to have functions. Here we introduce Chroma, a generative model for proteins and protein complexes that can directly sample novel protein structures and sequences, and that can be conditioned to steer the generative process towards desired properties and functions.
Proc Natl Acad Sci U S A
October 2018
Many applications in protein engineering require optimizing multiple protein properties simultaneously, such as binding one target but not others or binding a target while maintaining stability. Such multistate design problems require navigating a high-dimensional space to find proteins with desired characteristics. A model that relates protein sequence to functional attributes can guide design to solutions that would be hard to discover via screening.
Rapid accumulation and availability of gene expression datasets in public repositories have enabled large-scale meta-analyses of combined data. The richness of cross-experiment data has provided new biological insights, including identification of new cancer genes. In this study, we compiled a human gene expression dataset from ∼40,000 publicly available Affymetrix HG-U133Plus2 arrays.
Background: It is a great challenge of modern biology to determine the functional roles of non-synonymous Single Nucleotide Polymorphisms (nsSNPs) on complex phenotypes. Statistical and machine learning techniques establish correlations between genotype and phenotype, but may fail to infer the biologically relevant mechanisms. The emerging paradigm of Network-based Association Studies aims to address this problem of statistical analysis.
Motivation: Meta-analysis of large gene expression datasets obtained from public repositories requires consistently annotated data. Curation of such experiments, however, is an expert activity which involves repetitive manipulation of text. Existing tools for automated curation are few, which bottlenecks the analysis pipeline.
How genomic diversity within bacterial populations originates and is maintained in the presence of frequent recombination is a central problem in understanding bacterial evolution. Natural populations of Borrelia burgdorferi, the bacterial agent of Lyme disease, consist of diverse genomic groups co-infecting single individual vertebrate hosts and tick vectors. To understand mechanisms of sympatric genome differentiation in B.