Publications by authors named "James Zou"

Sleep is a fundamental biological process with profound implications for physical and mental health, yet our understanding of its complex patterns and their relationships to a broad spectrum of diseases remains limited. While polysomnography (PSG), the gold standard for sleep analysis, captures rich multimodal physiological data, analyzing these measurements has been challenging due to limited flexibility across recording environments, poor generalizability across cohorts, and difficulty in leveraging information from multiple signals simultaneously. To address this gap, we curated over 585,000 hours of high-quality sleep recordings from approximately 65,000 participants across multiple cohorts and developed SleepFM, a multimodal sleep foundation model trained with a novel contrastive learning approach, designed to accommodate any PSG montage.

Background: Historically, the readability of consent forms in medicine has been above the average reading level of patients. This can create challenges in obtaining truly informed consent, but the implications for clinical trial participant retention have not been fully explored. To address this gap, we seek to analyze clinical trial consent forms by determining their readability and its relationship with the associated trial's participant dropout rate.

Evaluating the effectiveness of cancer treatments in relation to specific tumor mutations is essential for improving patient outcomes and advancing the field of precision medicine. Here we present a comprehensive analysis of 78,287 U.S.

Old age is associated with a decline in cognitive function and an increase in neurodegenerative disease risk. Brain ageing is complex and is accompanied by many cellular changes. Furthermore, the influence that aged cells have on neighbouring cells and how this contributes to tissue decline is unknown.

Background: Cardiac amyloidosis (CA) is an underdiagnosed, progressive and lethal disease. Machine learning applied to common measurements derived from routine echocardiogram studies can inform suspicion of CA.

Objectives: Our objectives were to test a random forest (RF) model in detecting CA.

Large-scale gene-expression data are being leveraged to pretrain models that implicitly learn gene and cellular functions. However, such models require extensive data curation and training. Here we explore a much simpler alternative: leveraging ChatGPT embeddings of genes based on the literature.

Hematoxylin and eosin (H&E) is a common and inexpensive histopathology assay. Though widely used and information-rich, it cannot directly inform about specific molecular markers, which require additional experiments to assess. To address this gap, we present a deep-learning framework that computationally imputes the expression and localization of dozens of proteins from H&E images.

Article Synopsis
  • The study utilizes a generative AI model called Orion to analyze blood samples from 1,050 individuals with non-small cell lung cancer (NSCLC) and matched controls, focusing on orphan non-coding RNAs.
  • Orion significantly outperforms traditional methods, achieving 94% sensitivity and 87% specificity in cancer detection, and shows over 30% higher sensitivity on validation datasets compared to other approaches.

Background: Hyperkalemia, characterized by elevated serum potassium levels, heightens the risk of sudden cardiac death, particularly in individuals with chronic kidney disease and end-stage renal disease (ESRD). Traditional laboratory test monitoring is resource-heavy, invasive, and unable to provide continuous tracking. Wearable technologies like smartwatches with electrocardiogram (ECG) capabilities are emerging as valuable tools for remote monitoring, potentially allowing for personalized monitoring with artificial intelligence (AI)-ECG interpretation.

Hepatocellular carcinoma (HCC) frequently recurs from minimal residual disease (MRD), which persists after therapy. Here, we identified mechanisms of persistence of residual tumor cells using post-chemoembolization human HCC (n = 108 patients, 1.07 million cells) and a transgenic mouse model of MRD.

In the high-stakes arena of drug discovery, the journey from bench to bedside is hindered by a daunting 92% failure rate, primarily due to unpredicted toxicities and inadequate therapeutic efficacy in clinical trials. The FDA Modernization Act 2.0 heralds a transformative approach, advocating for the integration of alternative methods to conventional animal testing, including cell-based assays that employ human induced pluripotent stem cell (iPSC)-derived organoids, and organ-on-a-chip technologies, in conjunction with sophisticated artificial intelligence (AI) methodologies.

Article Synopsis
  • Tissues are made up of units that can be studied at various scales, and new technologies help researchers analyze their structure and function in depth.
  • The article introduces a method called spatial cellular graph partitioning (SCGP) for automatically annotating tissue structures without manual input, making it more efficient.
  • SCGP, along with its reference-query extension, shows strong accuracy in identifying tissue structures and offers valuable insights into diseases like diabetic kidney disease and skin disorders.

Language models are playing an increasingly important role in many areas of artificial intelligence (AI) and computational biology. In this primer, we discuss the ways in which language models, both those based on natural language and those based on biological sequences, can be applied to biological research. This primer is primarily intended for biologists interested in using these cutting-edge AI technologies in their applications.

Article Synopsis
  • Traditional biomedical AI models are limited in flexibility and can't easily use comprehensive information for real-world applications.
  • BiomedGPT is introduced as an open-source, lightweight generalist AI model capable of performing various biomedical tasks, achieving top results in many experiments.
  • It shows strong performance in tasks like radiology question answering, report generation, and summarization, indicating that training with diverse data can enhance the utility of biomedical AI in diagnosis and workflow efficiency.
Article Synopsis
  • This study investigates whether simpler models using standard ECG measurements can effectively detect left ventricular systolic dysfunction (LVSD) compared to complex deep learning methods.
  • Analyzing a dataset of nearly 41,000 ECGs, researchers found that a random forest model and a logistic regression model both achieved high accuracy in detecting LVSD, with performance comparable to deep learning models.
  • The findings suggest that simpler ECG models are not only effective but also easier to implement and interpret in clinical settings, making them potentially more suitable for widespread use.

Old age is associated with a decline in cognitive function and an increase in neurodegenerative disease risk. Brain aging is complex and accompanied by many cellular changes. However, the influence that aged cells have on neighboring cells and how this contributes to tissue decline is unknown.

Article Synopsis
  • Segmenting CT is essential for clinical practices like personalized cardiac ablation, but traditional machine learning methods often require large labeled datasets which are difficult to gather.
  • The article introduces the DOKEN algorithm, which uses domain knowledge to automatically label a small training set, enabling high-performance ML segmentation without the need for extensive data.
  • In tests, the DOKEN-enhanced nnU-Net model showed impressive segmentation results, achieving a high Dice score of 96.7% and demonstrating performance comparable to expert manual segmentation, thus validating its efficacy in real-world applications.

Radiotherapy treatment planning is a time-consuming and potentially subjective process that requires the iterative adjustment of model parameters to balance multiple conflicting objectives. Recent advancements in large foundation models offer promising avenues for addressing the challenges in planning and clinical decision-making. This study introduces GPT-RadPlan, a fully automated treatment planning framework that harnesses prior radiation oncology knowledge encoded in multi-modal large language models, such as GPT-4Vision (GPT-4V) from OpenAI.

Feature attribution, the ability to localize regions of the input data that are relevant for classification, is an important capability for ML models in scientific and biomedical domains. Current methods for feature attribution, which rely on "explaining" the predictions of end-to-end classifiers, suffer from imprecise feature localization and are inadequate for use with small sample sizes and high-dimensional datasets due to computational challenges. We introduce prospector heads, an efficient and interpretable alternative to explanation-based attribution methods that can be applied to any encoder and any data modality.

Motivation: Spatially resolved single-cell transcriptomics have provided unprecedented insights into gene expression in situ, particularly in the context of cell interactions or organization of tissues. However, current technologies for profiling spatial gene expression at single-cell resolution are generally limited to the measurement of a small number of genes. To address this limitation, several algorithms have been developed to impute or predict the expression of additional genes that were not present in the measured gene panel.

Motivation: The emergence of large chemical repositories and combinatorial chemical spaces, coupled with high-throughput docking and generative AI, has greatly expanded the chemical diversity of small molecules for drug discovery. Selecting compounds for experimental validation requires filtering these molecules based on favourable druglike properties, such as Absorption, Distribution, Metabolism, Excretion, and Toxicity (ADMET).

Results: We developed ADMET-AI, a machine learning platform that provides fast and accurate ADMET predictions both as a website and as a Python package.

In pathology, the deployment of artificial intelligence (AI) in clinical settings is constrained by limitations in data collection and in model transparency and interpretability. Here we describe a digital pathology framework, nuclei.io, that incorporates active learning and human-in-the-loop real-time feedback for the rapid creation of diverse datasets and models.
