Publications by Jose-Luis Ambite

Comparison of Explainable AI Models for MRI-based Alzheimer's Disease Classification.

Tamoghna Chattopadhyay Neha Ann Joshy Chirag Jagad Emma J Gleave Sophia I Thomopoulos Jose Luis Ambite

bioRxiv

September 2024

Article Synopsis

Deep learning models using convolutional neural networks (CNNs) have been applied to classify Alzheimer's disease and assess dementia severity through 3D brain MRI scans, with a focus on improving interpretability using occlusion sensitivity analysis (OSA) and gradient-weighted class activation mapping (Grad-CAM).
The study investigates the effectiveness of these models, primarily trained on North American datasets, when applied to a different population in India (NIMHANS cohort), and assesses the advantages of utilizing a combined dataset for training.
Results demonstrate that feature localization aligns with established knowledge of Alzheimer's disease, indicating that OSA and Grad-CAM enhance interpretability by resolving diagnostic features at various scales.

Expert-Level Detection of Referable Glaucoma from Fundus Photographs in a Safety Net Population: The AI and Teleophthalmology in Los Angeles Initiative.

Van Nguyen Sreenidhi Iyengar Haroon Rasheed Galo Apolo Zhiwei Li Jose-Luis Ambite

medRxiv

August 2024

Purpose: To develop and test a deep learning (DL) algorithm for detecting referable glaucoma in the Los Angeles County (LAC) Department of Health Services (DHS) teleretinal screening program.

Methods: Fundus photographs and patient-level labels of referable glaucoma (defined as cup-to-disc ratio [CDR] ≥ 0.6) provided by 21 trained optometrist graders were obtained from the LAC DHS teleretinal screening program.

A federated learning architecture for secure and private neuroimaging analysis.

Dimitris Stripelis Umang Gupta Hamza Saleem Nikhil Dhinagar Tanmay Ghai José Luis Ambite

Patterns (N Y)

August 2024

The amount of biomedical data continues to grow rapidly. However, collecting data from multiple sites for joint analysis remains challenging due to security, privacy, and regulatory concerns. To overcome this challenge, we use federated learning, which enables distributed training of neural network models over multiple data sources without sharing data.

Comparison of deep learning architectures for predicting amyloid positivity in Alzheimer's disease, mild cognitive impairment, and healthy aging, from T1-weighted brain structural MRI.

Tamoghna Chattopadhyay Saket S Ozarkar Ketaki Buwa Neha Ann Joshy Dheeraj Komandur Jose Luis Ambite

Front Neurosci

July 2024

Abnormal β-amyloid (Aβ) accumulation in the brain is an early indicator of Alzheimer's disease (AD) and is typically assessed through invasive procedures such as PET (positron emission tomography) or CSF (cerebrospinal fluid) assays. As new anti-Alzheimer's treatments can now successfully target amyloid pathology, there is a growing interest in predicting Aβ positivity (Aβ+) from less invasive, more widely available types of brain scans, such as T1-weighted (T1w) MRI. Here we compare multiple approaches to infer Aβ+ from standard anatomical MRI: (1) classical machine learning algorithms, including logistic regression, XGBoost, and shallow artificial neural networks, (2) deep learning models based on 2D and 3D convolutional neural networks (CNNs), (3) a hybrid ANN-CNN, combining the strengths of shallow and deep neural networks, (4) transfer learning models based on CNNs, and (5) 3D Vision Transformers.

q-Diffusion leverages the full dimensionality of gene coexpression in single-cell transcriptomics.

Myrl G Marmarelis Russell Littman Francesca Battaglin Donna Niedzwiecki Alan Venook Jose-Luis Ambite

Commun Biol

April 2024

Unlocking the full dimensionality of single-cell RNA sequencing data (scRNAseq) is the next frontier to a richer, fuller understanding of cell biology. We introduce q-diffusion, a framework for capturing the coexpression structure of an entire library of genes, improving on state-of-the-art analysis tools. The method is demonstrated via three case studies.

NeuroBridge: a prototype platform for discovery of the long-tail neuroimaging data.

Lei Wang José Luis Ambite Abhishek Appaji Janine Bijsterbosch Jerome Dockes

Front Neuroinform

August 2023

Introduction: Open science initiatives have enabled sharing of large amounts of already collected data. However, significant gaps remain regarding how to find appropriate data, including underutilized data that exist in the long tail of science. We demonstrate the NeuroBridge prototype and its ability to search PubMed Central full-text papers for information relevant to neuroimaging data collected from schizophrenia and addiction studies.

NeuroBridge ontology: computable provenance metadata to give the long tail of neuroimaging data a FAIR chance for secondary use.

Satya S Sahoo Matthew D Turner Lei Wang Jose Luis Ambite Abhishek Appaji

Front Neuroinform

July 2023

Background: Despite the efforts of the neuroscience community, there are many published neuroimaging studies with data that are still not findable or accessible. Users face significant challenges in reusing neuroimaging data due to the lack of provenance metadata, such as experimental protocols, study instruments, and details about the study participants, which is also required for interoperability. To implement the FAIR guidelines for neuroimaging data, we have developed an iterative ontology engineering process and used it to create the NeuroBridge ontology. The NeuroBridge ontology is a computable model of provenance terms to implement FAIR principles, and together with an international effort to annotate full text articles with ontology terms, the ontology enables users to locate relevant neuroimaging datasets.

Enabling Scientific Reproducibility through FAIR Data Management: An ontology-driven deep learning approach in the NeuroBridge Project.

Xiaochen Wang Yue Wang José-Luis Ambite Abhishek Appaji Howard Lander

AMIA Annu Symp Proc

May 2023

Scientific reproducibility that effectively leverages existing study data is critical to the advancement of research in many disciplines including neuroscience, which uses imaging and electrophysiology modalities as primary endpoints or key dependency in studies. We are developing an integrated search platform called NeuroBridge to enable researchers to search for relevant study datasets that can be used to test a hypothesis or replicate a published finding without having to perform a difficult search from scratch, including contacting individual study authors and locating the site to download the data. In this paper, we describe the development of a metadata ontology based on the World Wide Web Consortium (W3C) PROV specifications to create a corpus of semantically annotated published papers.

Selecting Clustering Algorithms for Identity-By-Descent Mapping.

Ruhollah Shemirani Gillian M Belbin Keith Burghardt Kristina Lerman Christy L Avery José Luis Ambite

Pac Symp Biocomput

December 2022

Groups of distantly related individuals who share a short segment of their genome identical-by-descent (IBD) can provide insights about rare traits and diseases in massive biobanks using IBD mapping. Clustering algorithms play an important role in finding these groups accurately and at scale. We set out to analyze the fitness of commonly used, fast and scalable clustering algorithms for IBD mapping applications.

Predicted gene expression in ancestrally diverse populations leads to discovery of susceptibility loci for lifestyle and cardiometabolic traits.

Heather M Highland Genevieve L Wojcik Mariaelisa Graff Katherine K Nishimura Chani J Hodonsky Jose Luis Ambite

Am J Hum Genet

April 2022

One mechanism by which genetic factors influence complex traits and diseases is altering gene expression. Direct measurement of gene expression in relevant tissues is rarely tenable; however, genetically regulated gene expression (GReX) can be estimated using prediction models derived from large multi-omic datasets. These approaches have led to the discovery of many gene-trait associations, but whether models derived from predominantly European ancestry (EA) reference panels can map novel associations in ancestrally diverse populations remains unclear.

Using dynamic time warping self-organizing maps to characterize diurnal patterns in environmental exposures.

Kenan Li Katherine Sward Huiyu Deng John Morrison Rima Habre Jose Luis Ambite

Sci Rep

December 2021

Advances in measurement technology are producing increasingly time-resolved environmental exposure data. We aim to gain new insights into exposures and their potential health impacts by moving beyond simple summary statistics (e.g.

NERO: a biomedical named-entity (recognition) ontology with a large, annotated corpus reveals meaningful associations through text embedding.

Kanix Wang Robert Stevens Halima Alachram Yu Li Larisa Soldatova José Luis Ambite

NPJ Syst Biol Appl

October 2021

Machine reading (MR) is essential for unlocking valuable knowledge contained in millions of existing biomedical documents. Over the last two decades, the most dramatic advances in MR have followed in the wake of critical corpus development. Large, well-annotated corpora have been associated with punctuated advances in MR methodology and automated knowledge extraction systems in the same way that ImageNet was fundamental for developing machine vision techniques.

W-TSS: A Wavelet-Based Algorithm for Discovering Time Series Shapelets.

Kenan Li Huiyu Deng John Morrison Rima Habre Meredith Franklin José Luis Ambite

Sensors (Basel)

August 2021

Many approaches to time series classification rely on machine learning methods. However, there is growing interest in going beyond black box prediction models to understand discriminatory features of the time series and their associations with outcomes. One promising method is time-series shapelets (TSS), which identifies maximally discriminative subsequences of time series.

Rapid detection of identity-by-descent tracts for mega-scale datasets.

Ruhollah Shemirani Gillian M Belbin Christy L Avery Eimear E Kenny Christopher R Gignoux José Luis Ambite

Nat Commun

June 2021

The ability to identify segments of genomes identical-by-descent (IBD) is a part of standard workflows in both statistical and population genetics. However, traditional methods for finding local IBD across all pairs of individuals scale poorly, leading to a lack of adoption in very large-scale datasets. Here, we present iLASH, an algorithm based on similarity detection techniques that shows equal or improved accuracy in simulations compared to current leading methods and speeds up analysis by several orders of magnitude on genomic datasets, making IBD estimation tractable for millions of individuals.

Toward a fine-scale population health monitoring system.

Gillian M Belbin Sinead Cullina Stephane Wenric Emily R Soper Benjamin S Glicksberg Jose-Luis Ambite

Cell

April 2021

Article Synopsis

Understanding health disparities is crucial for creating fair precision health initiatives, as traditional race and ethnicity definitions may not accurately reflect disease burdens in specific communities.
The study proposes using electronic health records (EHRs) and genomic data to analyze links between genetic ancestry and health outcomes, identifying 17 communities in NYC with shared genetic backgrounds.
Findings reveal significant health outcome variations linked to these communities, highlighting the importance of integrating genomic data with EHRs for better monitoring and prediction of health risks across different populations.

EPS: automated feature selection in case-control studies using extreme pseudo-sampling.

Ruhollah Shemirani Stephane Wenric Eimear Kenny José Luis Ambite

Bioinformatics

October 2021

Summary: Finding informative predictive features in high-dimensional biological case-control datasets is challenging. The Extreme Pseudo-Sampling (EPS) algorithm offers a solution to the challenge of feature selection via a combination of deep learning and linear regression models. First, using a variational autoencoder, it generates complex latent representations for the samples.

BD2K Training Coordinating Center's ERuDIte: the Educational Resource Discovery Index for Data Science.

José Luis Ambite Lily Fierro Jonathan Gordon Gully A Burns Florian Geigl

IEEE Trans Emerg Top Comput

March 2019

Data science is a field that has developed to enable efficient integration and analysis of increasingly large data sets in many domains. In particular, big data in genetics, neuroimaging, mobile health, and other subfields of biomedical science, promises new insights, but also poses challenges. To address these challenges, the National Institutes of Health launched the Big Data to Knowledge (BD2K) initiative, including a Training Coordinating Center (TCC) tasked with developing a resource for personalized data science training for biomedical researchers.

Minority-centric meta-analyses of blood lipid levels identify novel loci in the Population Architecture using Genomics and Epidemiology (PAGE) study.

Yao Hu Mariaelisa Graff Jeffrey Haessler Steven Buyske Stephanie A Bien Jose Luis Ambite

PLoS Genet

March 2020

Lipid levels are important markers for the development of cardio-metabolic diseases. Although hundreds of associated loci have been identified through genetic association studies, the contribution of genetic factors to variation in lipids is not fully understood, particularly in U.S.

A phenome-wide association study (PheWAS) in the Population Architecture using Genomics and Epidemiology (PAGE) study reveals potential pleiotropy in African Americans.

Sarah A Pendergrass Steven Buyske Janina M Jeff Alex Frase Scott Dudek Jose-Luis Ambite

PLoS One

April 2020

We performed a hypothesis-generating phenome-wide association study (PheWAS) to identify and characterize cross-phenotype associations, where one SNP is associated with two or more phenotypes, between thousands of genetic variants assayed on the Metabochip and hundreds of phenotypes in 5,897 African Americans as part of the Population Architecture using Genomics and Epidemiology (PAGE) I study. The PAGE I study was a National Human Genome Research Institute-funded collaboration of four study sites accessing diverse epidemiologic studies genotyped on the Metabochip, a custom genotyping chip that has dense coverage of regions in the genome previously associated with cardio-metabolic traits and outcomes in mostly European-descent populations. Here we focus on identifying novel phenome-genome relationships, where SNPs are associated with more than one phenotype.

Genetic analyses of diverse populations improves discovery for complex traits.

Genevieve L Wojcik Mariaelisa Graff Katherine K Nishimura Ran Tao Jeffrey Haessler Jose Luis Ambite

Nature

June 2019

Genome-wide association studies (GWAS) have laid the foundation for investigations into the biology of complex traits, drug development and clinical guidelines. However, the majority of discovery efforts are based on data from populations of European ancestry. In light of the differential genetic architecture that is known to exist between populations, bias in representation can exacerbate existing disease and healthcare disparities.

Genetics of Chronic Kidney Disease Stages Across Ancestries: The PAGE Study.

Bridget M Lin Girish N Nadkarni Ran Tao Mariaelisa Graff Myriam Fornage Jose Luis Ambite

Front Genet

May 2019

Background: Chronic kidney disease (CKD) is common and disproportionately burdens United States ethnic minorities. Its genetic determinants may differ by disease severity and clinical stages. To uncover genetic factors associated with CKD severity among high-risk ethnic groups, we performed genome-wide association studies (GWAS) in diverse populations within the Population Architecture using Genomics and Epidemiology (PAGE) study.

Applying Multivariate Segmentation Methods to Human Activity Recognition From Wearable Sensors' Data.

Kenan Li Rima Habre Huiyu Deng Robert Urman John Morrison José Luis Ambite

JMIR Mhealth Uhealth

February 2019

Background: Time-resolved quantification of physical activity can contribute to both personalized medicine and epidemiological research studies, for example, managing and identifying triggers of asthma exacerbations. A growing number of reportedly accurate machine learning algorithms for human activity recognition (HAR) have been developed using data from wearable devices (eg, smartwatch and smartphone). However, many HAR algorithms depend on fixed-size sampling windows that may poorly adapt to real-world conditions in which activity bouts are of unequal duration.

The genetic underpinnings of variation in ages at menarche and natural menopause among women from the multi-ethnic Population Architecture using Genomics and Epidemiology (PAGE) Study: A trans-ethnic meta-analysis.

Lindsay Fernández-Rhodes Jennifer R Malinowski Yujie Wang Ran Tao Nathan Pankratz Jose Luis Ambite

PLoS One

January 2019

Current knowledge of the genetic architecture of key reproductive events across the female life course is largely based on association studies of European descent women. The relevance of known loci for age at menarche (AAM) and age at natural menopause (ANM) in diverse populations remains unclear. We investigated 32 AAM and 14 ANM previously-identified loci and sought to identify novel loci in a trans-ethnic array-wide study of 196,483 SNPs on the MetaboChip (Illumina, Inc.

Discovery, fine-mapping, and conditional analyses of genetic variants associated with C-reactive protein in multiethnic populations using the Metabochip in the Population Architecture using Genomics and Epidemiology (PAGE) study.

Jonathan M Kocarnik Melissa Richard Misa Graff Jeffrey Haessler Stephanie Bien José Luis Ambite

Hum Mol Genet

August 2018

C-reactive protein (CRP) is a circulating biomarker indicative of systemic inflammation. We aimed to evaluate genetic associations with CRP levels among non-European-ancestry populations through discovery, fine-mapping and conditional analyses. A total of 30 503 non-European-ancestry participants from 6 studies participating in the Population Architecture using Genomics and Epidemiology study had serum high-sensitivity CRP measurements and ∼200 000 single nucleotide polymorphisms (SNPs) genotyped on the Metabochip.

A Scalable Data Integration and Analysis Architecture for Sensor Data of Pediatric Asthma.

Dimitris Stripelis José Luis Ambite Yao-Yi Chiang Sandrah P Eckel Rima Habre

Proc Int Conf Data Eng

April 2017

According to the Centers for Disease Control, in the United States there are 6.8 million children living with asthma. Despite the importance of the disease, the available prognostic tools are not sufficient for biomedical researchers to thoroughly investigate the potential risks of the disease at scale.
