Publications by authors named "Matthew Ruffalo"

Identifying cell types in highly multiplexed images is essential for understanding tissue spatial organization. Current cell type annotation methods often rely on extensive reference images and manual adjustments. In this work, we present a tool, Robust Image-Based Cell Annotator (RIBCA), that enables accurate, automated, unbiased, and fine-grained cell type annotation for images with a wide range of antibody panels, without requiring additional model training or human intervention.

View Article and Find Full Text PDF

The Human BioMolecular Atlas Program (HuBMAP) aims to construct a reference 3D structural, cellular, and molecular atlas of the healthy adult human body. The HuBMAP Data Portal (https://portal.hubmapconsortium.

View Article and Find Full Text PDF

Motivation: Spatial proteomics data have been used to map cell states and improve our understanding of tissue organization. More recently, these methods have been extended to study the impact of such organization on disease progression and patient survival. However, to date, the majority of supervised learning methods utilizing these data types did not take full advantage of the spatial information, impacting their performance and utilization.

View Article and Find Full Text PDF

Inference of global gene regulatory networks from omics data is a long-term goal of systems biology. Most methods developed for inferring transcription factor (TF)-gene interactions either relied on a small dataset or used snapshot data which is not suitable for inferring a process that is inherently temporal. Here, we developed a new computational method that combines neural networks and multi-task learning to predict RNA velocity rather than gene expression values.

View Article and Find Full Text PDF

Studies comparing single cell RNA-Seq (scRNA-Seq) data between conditions mainly focus on differences in the proportion of cell types or on differentially expressed genes. In many cases these differences are driven by changes in cell interactions which are challenging to infer without spatial information. To determine cell-cell interactions that differ between conditions we developed the Cell Interaction Network Inference (CINS) pipeline.

View Article and Find Full Text PDF

Background: Most methods that integrate network and mutation data to study cancer focus on the effects of genes/proteins, quantifying the effect of mutations or differential expression of a gene and its neighbors, or identifying groups of genes that are significantly up- or down-regulated. However, several mutations are known to disrupt specific protein-protein interactions, and network dynamics are often ignored by such methods. Here we introduce a method that allows for predicting the disruption of specific interactions in cancer patients using somatic mutation data and protein interaction networks.

View Article and Find Full Text PDF

Prediction of response to specific cancer treatments is complicated by significant heterogeneity between tumors in terms of mutational profiles, gene expression, and clinical measures. Here we focus on the response of Estrogen Receptor (ER)+ post-menopausal breast cancer tumors to aromatase inhibitors (AI). We use a network smoothing algorithm to learn novel features that integrate several types of high throughput data and new cell line experiments.

View Article and Find Full Text PDF

Single cell RNA-Seq (scRNA-seq) studies profile thousands of cells in heterogeneous environments. Current methods for characterizing cells perform unsupervised analysis followed by assignment using a small set of known marker genes. Such approaches are limited to a few, well characterized cell types.

View Article and Find Full Text PDF

In the tumor microenvironment, immune cells have emerged as key regulators of cancer progression. While much work has focused on characterizing tumor-related immune cells through gene expression profiling, microRNAs (miRNAs) have also been reported to regulate immune cells in the tumor microenvironment. Using regression-based computational methods, we have constructed for the first time, immune cell signatures based on miRNA expression from The Cancer Genome Atlas breast and ovarian cancer datasets.

View Article and Find Full Text PDF

Background: Translating in vitro results to clinical tests is a major challenge in systems biology. Here we present a new Multi-Task learning framework which integrates thousands of cell line expression experiments to reconstruct drug specific response networks in cancer.

Results: The reconstructed networks correctly identify several shared key proteins and pathways while simultaneously highlighting many cell type specific proteins.

View Article and Find Full Text PDF

Motivation: Reconstructing regulatory networks from expression and interaction data is a major goal of systems biology. While much work has focused on trying to experimentally and computationally determine the set of transcription-factors (TFs) and microRNAs (miRNAs) that regulate genes in these networks, relatively little work has focused on inferring the regulation of miRNAs by TFs. Such regulation can play an important role in several biological processes including development and disease.

View Article and Find Full Text PDF

Development of high-throughput monitoring technologies enables interrogation of cancer samples at various levels of cellular activity. Capitalizing on these developments, various public efforts such as The Cancer Genome Atlas (TCGA) generate disparate omic data for large patient cohorts. As demonstrated by recent studies, these heterogeneous data sources provide the opportunity to gain insights into the molecular changes that drive cancer pathogenesis and progression.

View Article and Find Full Text PDF

Purpose: To date the standard nosology and prognostic schemes for myeloid neoplasms have been based on morphologic and cytogenetic criteria. We sought to test the hypothesis that a comprehensive, unbiased analysis of somatic mutations may allow for an improved classification of these diseases to predict outcome (overall survival).

Experimental Design: We performed whole-exome sequencing (WES) of 274 myeloid neoplasms, including myelodysplastic syndrome (MDS, N=75), myelodysplastic/myeloproliferative neoplasia (MDS/MPN, N=33), and acute myeloid leukemia (AML, N=22), augmenting the resulting mutational data with public WES results from AML (N=144).

View Article and Find Full Text PDF

Motivation: Several software tools specialize in the alignment of short next-generation sequencing reads to a reference sequence. Some of these tools report a mapping quality score for each alignment-in principle, this quality score tells researchers the likelihood that the alignment is correct. However, the reported mapping quality often correlates weakly with actual accuracy and the qualities of many mappings are underestimated, encouraging the researchers to discard correct mappings.

View Article and Find Full Text PDF

Motivation: The advent of next-generation sequencing (NGS) techniques presents many novel opportunities for many applications in life sciences. The vast number of short reads produced by these techniques, however, pose significant computational challenges. The first step in many types of genomic analysis is the mapping of short reads to a reference genome, and several groups have developed dedicated algorithms and software packages to perform this function.

View Article and Find Full Text PDF