Publications by authors named "John Arevalo"

Image-based cell profiling is a powerful tool that compares perturbed cell populations by measuring thousands of single-cell features and summarizing them into profiles. Typically a sample is represented by averaging across cells, but this fails to capture the heterogeneity within cell populations. We introduce CytoSummaryNet: a Deep Sets-based approach that improves mechanism of action prediction by 30-68% in mean average precision compared to average profiling on a public dataset.
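The limitation of average profiling described above can be illustrated with a minimal sketch. The two wells and their single feature are hypothetical toy data, and `average_profile` is an illustrative helper, not part of CytoSummaryNet. The sketch shows only that two populations with very different heterogeneity can collapse to the same mean profile.

```python
import numpy as np

def average_profile(cells: np.ndarray) -> np.ndarray:
    """Collapse a (n_cells, n_features) matrix into one mean profile."""
    return cells.mean(axis=0)

# Hypothetical toy wells with a single feature per cell (assumed data).
homogeneous = np.array([[1.0], [1.0], [1.0], [1.0]])  # all cells alike
mixed = np.array([[0.0], [0.0], [2.0], [2.0]])        # two subpopulations

profile_a = average_profile(homogeneous)
profile_b = average_profile(mixed)

# The mean profiles are identical even though the populations differ.
same_mean = bool(np.allclose(profile_a, profile_b))

# The per-well standard deviation exposes the heterogeneity that the mean hides.
spread_a = float(homogeneous.std())
spread_b = float(mixed.std())
```

A learned, set-based aggregation such as the Deep Sets approach named in the abstract is one way to retain this within-population information; the sketch does not implement it.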

High-throughput image-based profiling platforms are powerful technologies capable of collecting data from billions of cells exposed to thousands of perturbations in a time- and cost-effective manner. Therefore, image-based profiling data has been increasingly used for diverse biological applications, such as predicting drug mechanism of action or gene function. However, batch effects severely limit community-wide efforts to integrate and interpret image-based profiling data collected across different laboratories and equipment.

Predicting drug efficacy and safety requires information on biological responses (e.g., cell morphology and gene expression) to small molecule perturbations.

In profiling assays, thousands of biological properties are measured in a single test, yielding biological discoveries by capturing the state of a cell population, often at the single-cell level. However, for profiling datasets, it has been challenging to evaluate the phenotypic activity of a sample and the phenotypic consistency among samples, due to profiles' high dimensionality, heterogeneous nature, and non-linear properties. Existing methods leave researchers uncertain where to draw boundaries between meaningful biological response and technical noise.

The identification of genetic and chemical perturbations with similar impacts on cell morphology can elucidate compounds' mechanisms of action or novel regulators of genetic pathways. Research on methods for identifying such similarities has lagged due to a lack of carefully designed and well-annotated image sets of cells treated with chemical and genetic perturbations. Here we create such a Resource dataset, CPJUMP1, in which each perturbed gene's product is a known target of at least two chemical compounds in the dataset.

Advances in high-throughput microscopy have enabled the rapid acquisition of large numbers of high-content microscopy images. Whether by deep learning or classical algorithms, image analysis pipelines then produce single-cell features. To process these single cells for downstream applications, we present Pycytominer, a user-friendly, open-source Python package that implements the bioinformatics steps known as "image-based profiling".

High-throughput image-based profiling platforms are powerful technologies capable of collecting data from billions of cells exposed to thousands of perturbations in a time- and cost-effective manner. Therefore, image-based profiling data has been increasingly used for diverse biological applications, such as predicting drug mechanism of action or gene function. However, batch effects pose severe limitations to community-wide efforts to integrate and interpret image-based profiling data collected across different laboratories and equipment.

Multi-parametric magnetic resonance imaging (MP-MRI) has played an important role in prostate cancer diagnosis. Nevertheless, in the clinical routine, these sequences are principally analyzed through expert observation, which introduces an intrinsic variability in the diagnosis. Even worse, the isolated study of these MRI sequences tends to produce false-positive detections due to other diseases that share similar radiological findings.

Clinically significant regions (CSR), captured over multi-parametric MRI (mp-MRI) images, have emerged as a potential screening test for early prostate cancer detection and characterization. These sequences are able to quantify morphology, micro-circulation, and cellular density patterns that might be related to cancer disease. Nonetheless, this evaluation is mainly carried out by expert radiologists, introducing inter-reader variability in the diagnosis.

Most variants in most genes across most organisms have an unknown impact on the function of the corresponding gene. This gap in knowledge is especially acute in cancer, where clinical sequencing of tumors now routinely reveals patient-specific variants whose functional impact on the corresponding genes is unknown, impeding clinical utility. Transcriptional profiling was able to systematically distinguish these variants of unknown significance as impactful vs.

Background And Objective: The automatic classification of breast imaging lesions is currently an unsolved problem. This paper describes an innovative representation learning framework for breast cancer diagnosis in mammography that integrates deep learning techniques to automatically learn discriminative features, avoiding the design of specific hand-crafted image-based feature detectors.

Methods: A new biopsy-proven benchmarking dataset was built from 344 breast cancer patients' cases containing a total of 736 film mammography (mediolateral oblique and craniocaudal) views, representative of manually segmented lesions associated with masses: 426 benign lesions and 310 malignant lesions.

Feature extraction is a fundamental step when mammography image analysis is addressed using learning-based approaches. Traditionally, problem-dependent handcrafted features are used to represent the content of images. An alternative approach successfully applied in other domains is the use of neural networks to automatically discover good features.

Objective: The paper addresses the problem of automatic detection of basal cell carcinoma (BCC) in histopathology images. In particular, it proposes a framework to both learn the image representation in an unsupervised way and visualize discriminative features supported by the learned model.

Materials And Methods: This paper presents an integrated unsupervised feature learning (UFL) framework for histopathology image analysis that comprises three main stages: (1) local (patch) representation learning using different strategies (sparse autoencoders, reconstruction independent component analysis, and topographic independent component analysis (TICA)), (2) global (image) representation learning using a bag-of-features representation or a convolutional neural network, and (3) a visual interpretation layer to highlight the most discriminant regions detected by the model.
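As a rough illustration of stage (2), the bag-of-features step can be sketched as follows. This is a minimal sketch under stated assumptions, not the paper's implementation: the codebook and patch descriptors are hypothetical toy values, `bag_of_features` is an illustrative name, and a fixed codebook stands in for the learned local representation of stage (1).

```python
import numpy as np

def bag_of_features(patches: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    """Return a normalized histogram of nearest-codeword assignments.

    patches:  (n_patches, n_dims) local descriptors from one image
    codebook: (n_words, n_dims) codewords (assumed to be given)
    """
    # Squared Euclidean distance between every patch and every codeword.
    distances = ((patches[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    nearest = distances.argmin(axis=1)
    # Count assignments per codeword, then normalize to sum to 1.
    counts = np.bincount(nearest, minlength=len(codebook))
    return counts / counts.sum()

# Hypothetical two-word codebook and four patch descriptors (assumed data).
codebook = np.array([[0.0, 0.0], [1.0, 1.0]])
patches = np.array([[0.1, 0.0], [0.9, 1.0], [1.0, 0.8], [0.0, 0.2]])

histogram = bag_of_features(patches, codebook)
```

The resulting histogram is a fixed-length image descriptor regardless of how many patches the image contains, which is what lets a standard classifier consume it.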

Post-translational modifications play a key role in tau protein aggregation and related neurodegeneration. Because hyperphosphorylation alone does not necessarily cause tau aggregation, other post-translational modifications have been recently explored. Tau acetylation promotes aggregation and inhibits tau's ability to stabilize microtubules.
