Publications by authors named "Didong Li"

Investigating the relationship between time series, particularly the lead-lag effect, is a common problem across disciplines, especially when uncovering biological processes. However, analyzing time series presents several challenges. First, for technical reasons, observations are not made at uniform time intervals.

CDK4/6 inhibitors such as palbociclib block cell cycle progression and improve outcomes for many ER+/HER2- breast cancer patients. Unfortunately, many patients are initially resistant to the drug or develop resistance over time, in part due to heterogeneity among individual tumor cells. To better understand these mechanisms of resistance, we used multiplex, single-cell imaging to profile cell cycle proteins in ER+ breast tumor cells under increasing palbociclib concentrations.

In the Carolina Breast Cancer Study (CBCS), clustering census tracts based on spatial location, demographic variables, and socioeconomic status is crucial for understanding how these factors influence health outcomes and cancer risk. This task, known as spatial clustering, involves identifying clusters of similar locations by considering both geographic and characteristic patterns. While standard clustering methods such as K-means, spectral clustering, and hierarchical clustering are well-studied, spatial clustering is less explored due to the inherent differences between spatial domains and their corresponding covariates.

Background: Delays in breast cancer diagnosis and treatment lead to worse survival and quality of life. Racial disparities in care timeliness have been reported, but few studies have examined access at multiple points along the care continuum (diagnosis, treatment initiation, treatment duration, and genomic testing).

Methods and Findings: The Carolina Breast Cancer Study (CBCS) Phase 3 is a population-based, case-only cohort (n = 2,998, 50% Black) of patients with invasive breast cancer diagnoses (2008 to 2013).

Various decisions concerning the management, display, and diagnostic use of electronic health records (EHR) data can be automated using machine learning (ML). We describe how ML is currently applied to EHR data and how it may be applied in the near future. Both benefits and shortcomings of ML are considered.

Recent advances in multi-modal algorithms have driven and been driven by the increasing availability of large image-text datasets, leading to significant strides in various fields, including computational pathology. However, in most existing medical image-text datasets, the text typically provides high-level summaries that may not sufficiently describe sub-tile regions within a large pathology image. For example, an image might cover an extensive tissue area containing cancerous and healthy regions, but the accompanying text might only specify that this image is a cancer slide, lacking the nuanced details needed for in-depth analysis.

Spatial genomic technologies characterize the relationship between the structural organization of cells and their cellular state. Despite the availability of various spatial transcriptomic and proteomic profiling platforms, these experiments remain costly and labor-intensive. Traditionally, tissue slicing for spatial sequencing involves parallel axis-aligned sections, often yielding redundant or correlated information.

The brain structural connectome is generated by a collection of white matter fiber bundles constructed from diffusion weighted MRI (dMRI), acting as highways for neural activity. There has been abundant interest in studying how the structural connectome varies across individuals in relation to their traits, ranging from age and gender to neuropsychiatric outcomes. After applying tractography to dMRI to get white matter fiber bundles, a key question is how to represent the brain connectome to facilitate statistical analyses relating connectomes to traits.

Spatially resolved genomic technologies have allowed us to study the physical organization of cells and tissues, and promise an understanding of local interactions between cells. However, it remains difficult to precisely align spatial observations across slices, samples, scales, individuals and technologies. Here, we propose a probabilistic model that aligns spatially-resolved samples onto a known or unknown common coordinate system (CCS) with respect to phenotypic readouts (for example, gene expression).

Gaussian processes are widely employed as versatile modelling and predictive tools in spatial statistics, functional data analysis, computer modelling and diverse applications of machine learning. They have been widely studied over Euclidean spaces, where they are specified using covariance functions or covariograms for modelling complex dependencies. There is a growing literature on Gaussian processes over Riemannian manifolds in order to develop richer and more flexible inferential frameworks for non-Euclidean data.
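
As a minimal sketch of how a covariance function specifies a Gaussian process over Euclidean space, the example below builds a squared-exponential covariance matrix and draws samples from the resulting prior. The kernel choice, the function names `squared_exponential` and `sample_gp_prior`, and all parameter values are illustrative assumptions, not taken from the article.

```python
import numpy as np


def squared_exponential(x1, x2, lengthscale=0.2, variance=1.0):
    """Squared-exponential covariance between two sets of points in R^d.

    x1 has shape (n, d) and x2 has shape (m, d); the result has shape (n, m).
    """
    x1 = np.atleast_2d(x1)
    x2 = np.atleast_2d(x2)
    # Pairwise squared Euclidean distances via broadcasting.
    sq_dists = np.sum((x1[:, None, :] - x2[None, :, :]) ** 2, axis=-1)
    return variance * np.exp(-0.5 * sq_dists / lengthscale**2)


def sample_gp_prior(x, n_samples=3, jitter=1e-6, seed=0):
    """Draw samples from a zero-mean GP prior evaluated at the rows of x."""
    rng = np.random.default_rng(seed)
    # A small diagonal jitter keeps the Cholesky factorization stable.
    K = squared_exponential(x, x) + jitter * np.eye(len(x))
    L = np.linalg.cholesky(K)
    return (L @ rng.standard_normal((len(x), n_samples))).T


# Hypothetical inputs: 50 evenly spaced points on [0, 1], one column per dimension.
x = np.linspace(0.0, 1.0, 50)[:, None]
K = squared_exponential(x, x)
samples = sample_gp_prior(x)
```

Because this kernel depends only on Euclidean distance, it is tied to the Euclidean setting; it is shown here only to make the role of the covariance function concrete.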

Spatially-resolved genomic technologies have shown promise for studying the relationship between the structural arrangement of cells and their functional behavior. While numerous sequencing and imaging platforms exist for performing spatial transcriptomics and spatial proteomics profiling, these experiments remain expensive and labor-intensive. Thus, when performing spatial genomics experiments using multiple tissue slices, there is a need to select the tissue cross sections that will be maximally informative for the purposes of the experiment.

Spatial navigation and orientation are emerging as promising markers for altered cognition in prodromal Alzheimer's disease, and even in cognitively normal individuals at risk for Alzheimer's disease. The different APOE gene alleles confer various degrees of risk. The APOE2 allele is considered protective, APOE3 is treated as the control, and APOE4 carriage is the major known genetic risk factor for Alzheimer's disease.

Gaussian processes (GPs) are a versatile class of nonparametric models for nonlinear regression and have been widely used to study spatiotemporal phenomena. However, standard GPs offer limited interpretability and generalizability for datasets with naturally occurring hierarchies. With large-scale, rapidly updating electronic health record (EHR) data, we want to study patient trajectories across diverse patient cohorts while preserving patient subgroup structure.

This paper is concerned with the formulation and computation of average problems on the multinomial and negative multinomial models. It can be deduced that the multinomial and negative multinomial models admit complementary geometric structures. Firstly, we investigate these geometric structures by providing various useful pre-derived expressions of some fundamental geometric quantities, such as Fisher-Riemannian metrics, α-connections and α-curvatures.

Current tools for multivariate density estimation struggle when the density is concentrated near a non-linear subspace or manifold. Most approaches require the choice of a kernel, with the multivariate Gaussian kernel by far the most commonly used. Although heavy-tailed and skewed extensions have been proposed, such kernels cannot capture curvature in the support of the data.
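
To illustrate the kernel choice mentioned above, here is a minimal sketch of a kernel density estimator with an isotropic multivariate Gaussian kernel, applied to synthetic points concentrated near a circle. The function name `gaussian_kde`, the bandwidth, and the synthetic data are assumptions made for illustration; this is the standard baseline estimator, not the method proposed in the article.

```python
import numpy as np


def gaussian_kde(points, data, bandwidth):
    """Evaluate an isotropic Gaussian kernel density estimate.

    points: (m, d) query locations; data: (n, d) observations.
    Returns an array of m density values.
    """
    points = np.atleast_2d(points)
    data = np.atleast_2d(data)
    n, d = data.shape
    # Pairwise squared distances between queries and observations.
    sq_dists = np.sum((points[:, None, :] - data[None, :, :]) ** 2, axis=-1)
    # Normalizing constant of a d-dimensional Gaussian with covariance h^2 * I.
    norm = (2.0 * np.pi * bandwidth**2) ** (d / 2.0)
    return np.exp(-0.5 * sq_dists / bandwidth**2).sum(axis=1) / (n * norm)


# Synthetic data (hypothetical): noisy points near the unit circle in R^2.
rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, size=500)
data = np.column_stack([np.cos(theta), np.sin(theta)])
data += 0.02 * rng.standard_normal(data.shape)

# Compare the estimate on the circle with the estimate at its center.
queries = np.array([[1.0, 0.0], [0.0, 0.0]])
on_circle, at_center = gaussian_kde(queries, data, bandwidth=0.3)
```

The kernel spreads mass equally in every direction around each observation, so the estimate assigns nonzero density to regions off the circle, such as its center, even though the data lie close to a one-dimensional curve.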
