Publications by authors named "Su-In Lee"

Clinical and biological information in large datasets of gene expression across cancers could be tapped with unsupervised deep learning. However, difficulties associated with biological interpretability and methodological robustness have made this impractical. Here we describe an unsupervised deep-learning framework for the generation of low-dimensional latent spaces for gene-expression data from 50,211 transcriptomes across 18 human cancers.

Article Synopsis
  • The study investigates how trastuzumab deruxtecan (T-DXd) works against HER2-positive gastric cancers, revealing its ability to damage DNA and induce cell death across different HER2 levels.
  • T-DXd increases PD-L1 expression and activates immune pathways, notably the cGAS-STING pathway, leading to an enhanced immune response and inflammation in gastric cancer cells.
  • The findings suggest that combining T-DXd with immune checkpoint blockade therapies could improve treatment outcomes for HER2-positive gastric cancer patients due to its effects on immune activation.

Aim: Fire blight, caused by the bacterium Erwinia amylovora, significantly damages economically important crops, such as apples and pears. Conventional methods for managing fire blight involve the application of chemical pesticides, such as streptomycin and oxytetracycline. However, concerns are growing about the development of antibiotic- and pesticide-resistant strains, compounded by documented instances of plant toxicity.

Building trustworthy and transparent image-based medical artificial intelligence (AI) systems requires the ability to interrogate data and models at all stages of the development pipeline, from training models to post-deployment monitoring. Ideally, the data and associated AI systems could be described using terms already familiar to physicians, but this requires medical datasets densely annotated with semantically meaningful concepts. In the present study, we present a foundation model approach, named MONET (medical concept retriever), which learns how to connect medical images with text and densely scores images on concept presence to enable important tasks in medical AI development and deployment such as data auditing, model auditing and model interpretation.

Brown rot disease, caused by Monilinia spp., poses a significant threat to pome and stone fruit crops globally, resulting in substantial economic losses during pre- and post-harvest stages. Monilinia fructigena, M.

Clinically and biologically valuable information may reside untapped in large cancer gene expression data sets. Deep unsupervised learning has the potential to extract this information with unprecedented efficacy but has thus far been hampered by a lack of biological interpretability and robustness. Here, we present DeepProfile, a comprehensive framework that addresses current challenges in applying unsupervised deep learning to gene expression profiles.

Human CD300c is expressed on various immune or cancer cells and is a novel B7 family member, functioning as an activity modulator on immune cells. To elucidate the function of CD300c, we developed CL7, a human CD300c-specific monoclonal antibody, and assessed its biological activity. The specific binding of CL7 monoclonal antibody against recombinant CD300c antigen was confirmed using enzyme-linked immunosorbent assay and surface plasmon resonance analysis.

The inferences of most machine-learning models powering medical artificial intelligence are difficult to interpret. Here we report a general framework for model auditing that combines insights from medical experts with a highly expressive form of explainable artificial intelligence. Specifically, we leveraged the expertise of dermatologists for the clinical task of differentiating melanomas from melanoma 'lookalikes' on the basis of dermoscopic and clinical images of the skin, and the power of generative models to render 'counterfactual' images to understand the 'reasoning' processes of five medical-image classifiers.

The aim of this study was to investigate the regulation of lantipeptide production in Streptomyces globisporus SP6C4, which produces the novel antifungal lantipeptides conprimycin and grisin, and to identify the role of cytochrome P450 (P450) in this regulation. To investigate the regulation of lantipeptide production, we created gene deletion mutants, including ΔP450, ΔtsrD, ΔlanM, ΔP450ΔtsrD, and ΔP450ΔlanM. These mutants were characterized in terms of their morphology, sporulation, attachment, and antifungal activity against Fusarium oxysporum.

Background: Biological age is a measure of health that offers insights into ageing. The existing age clocks, although valuable, often trade off accuracy and interpretability. We introduce ExplaiNAble BioLogical Age (ENABL Age), a computational framework that combines machine-learning models with explainable artificial intelligence (XAI) methods to accurately estimate biological age with individualised explanations.

Fire blight disease, caused by the bacterial pathogen Erwinia amylovora, has been a significant concern for over 50 countries worldwide. The efficacy of chemical pesticides currently available for disease control is limited. To address this issue, research is being conducted to explore environmentally friendly control methods, particularly biological control using beneficial microorganisms.

Single-cell datasets are routinely collected to investigate changes in cellular state between control cells and the corresponding cells in a treatment condition, such as exposure to a drug or infection by a pathogen. To better understand heterogeneity in treatment response, it is desirable to deconvolve variations enriched in treated cells from those shared with controls. However, standard computational models of single-cell data are not designed to explicitly separate these variations.

Building trustworthy and transparent image-based medical AI systems requires the ability to interrogate data and models at all stages of the development pipeline: from training models to post-deployment monitoring. Ideally, the data and associated AI systems could be described using terms already familiar to physicians, but this requires medical datasets densely annotated with semantically meaningful concepts. Here, we present a foundation model approach, named MONET (medical concept retriever), which learns how to connect medical images with text and generates dense concept annotations to enable tasks in AI transparency from model auditing to model interpretation.

Despite the proliferation and clinical deployment of artificial intelligence (AI)-based medical software devices, most remain black boxes that are uninterpretable to key stakeholders including patients, physicians, and even the developers of the devices. Here, we present a general model auditing framework that combines insights from medical experts with a highly expressive form of explainable AI that leverages generative models, to understand the reasoning processes of AI devices. We then apply this framework to generate the first thorough, medically interpretable picture of the reasoning processes of machine-learning-based medical image AI.

Epithelial-to-mesenchymal transition (EMT) plays a critical role in the development and progression of lung cancer by promoting its invasiveness and metastasis. Using integrative analyses of public lung cancer data from The Cancer Genome Atlas (TCGA), we found that the expression levels of the tight junction proteins zonula occludens (ZO)-1 and ZO-2 were lower in lung cancer tissues, including both lung adenocarcinoma and lung squamous cell carcinoma, than in normal lung tissues. Although the ectopic expression or knockdown of ZO-1 and ZO-2 did not affect the growth of lung cancer cells, they significantly regulated cell migration and invasion.

Machine learning may aid the choice of optimal combinations of anticancer drugs by explaining the molecular basis of their synergy. By combining accurate models with interpretable insights, explainable machine learning promises to accelerate data-driven cancer pharmacology. However, owing to the highly correlated and high-dimensional nature of transcriptomic data, naively applying current explainable machine-learning strategies to large transcriptomic datasets leads to suboptimal outcomes.

Article Synopsis
  • There's a rising interest in using unsupervised deep learning for gene expression analysis, leading to the development of methods to improve model interpretability.
  • These interpretability methods fall into two categories: post hoc analyses of complex models and the design of biologically-constrained models from the start.
  • The authors suggest that combining these two approaches can be beneficial and introduce PAUSE, a method that pinpoints key sources of transcriptomic variation using both unsupervised learning and biologically-constrained neural networks.

A prominent trend in single-cell transcriptomics is providing spatial context alongside a characterization of each cell's molecular state. This typically requires targeting an a priori selection of genes, often covering less than 1% of the genome, and a key question is how to optimally determine the small gene panel. We address this challenge by introducing a flexible deep learning framework, PERSIST, to identify informative gene targets for spatial transcriptomics studies by leveraging reference scRNA-seq data.

Background: Unlike linear models which are traditionally used to study all-cause mortality, complex machine learning models can capture non-linear interrelations and provide opportunities to identify unexplored risk factors. Explainable artificial intelligence can improve prediction accuracy over linear models and reveal great insights into outcomes like mortality. This paper comprehensively analyzes all-cause mortality by explaining complex machine learning models.

Soybean is an important source of protein and is used in a wide range of agricultural, food, and industrial applications. Soybean is affected by Xanthomonas citri pv. glycines, the causal pathogen of bacterial pustule disease, which reduces yield and quality.

Local feature attribution methods are increasingly used to explain complex machine learning models. However, current methods are limited because they are extremely expensive to compute or are not capable of explaining a distributed series of models where each model is owned by a separate institution. The latter is particularly important because it often arises in finance where explanations are mandated.

Although knowledge of biological pathways is essential for interpreting results from computational biology studies, the growing number of pathway databases complicates efforts to efficiently perform pathway analysis due to high redundancies among pathways from different databases, and inconsistencies in how pathways are created and named. We introduce the PAthway Communities (PAC) framework, which reconciles pathways from different databases and reduces pathway redundancy by revealing informative groups with distinct biological functions. Uniquely applying the Louvain community detection algorithm to a network of 4847 pathways from KEGG, REACTOME and Gene Ontology databases, we identify 35 distinct and automatically annotated communities of pathways and show that they are consistent with expert-curated pathway categories.

Accurate artificial intelligence (AI) for disease diagnosis could lower healthcare workloads. However, when time or financial resources for gathering input data are limited, as in emergency and critical-care medicine, developing accurate AI models, which typically require inputs for many clinical variables, may be impractical. Here we report a model-agnostic cost-aware AI (CoAI) framework for the development of predictive models that optimize the trade-off between prediction performance and feature cost.
