J R Stat Soc Series B Stat Methodol
February 2025
This paper introduces a novel framework called Mode-wise Principal Subspace Pursuit (MOP-UP) to extract hidden variations in both the row and column dimensions for matrix data. To enhance the understanding of the framework, we introduce a class of matrix-variate spiked covariance models that serve as inspiration for the development of the MOP-UP algorithm. The MOP-UP algorithm consists of two steps: Average Subspace Capture (ASC) and Alternating Projection.
Longitudinal studies are crucial for understanding complex microbiome dynamics and their link to health. We introduce TEMPoral TEnsor Decomposition (TEMPTED), a time-informed dimensionality reduction method for high-dimensional longitudinal data that treats time as a continuous variable, effectively characterizing temporal information and handling varying temporal sampling. TEMPTED captures key microbial dynamics, facilitates beta-diversity analysis, and enhances reproducibility by transferring learned representations to new data.
Spatial transcriptomics (ST) enables the comprehensive analysis of gene expression while preserving the spatial context of tissues. The histological images accompanying ST data provide spatially cohesive information that is often challenging to capture through gene expression alone. However, analyzing such images is challenging due to the presence of fiducial markers and background regions, which can obscure important features and complicate downstream analysis.
Objective: Electronic health records (EHRs) are rich sources of patient-level data, offering valuable resources for medical data analysis. However, privacy concerns often restrict access to EHRs, hindering downstream analysis. Current EHR deidentification methods are flawed and can lead to potential privacy leakage.
This paper introduces the functional tensor singular value decomposition (FTSVD), a novel dimension reduction framework for tensors with one functional mode and several tabular modes. The problem is motivated by high-order longitudinal data analysis. Our model assumes the observed data to be a random realization of an approximate CP low-rank functional tensor measured on a discrete time grid.
Alzheimer's disease (AD) is influenced by a variety of modifiable risk factors, including a person's dietary habits. While the ketogenic diet (KD) holds promise in reducing metabolic risks and potentially affecting AD progression, only a few studies have explored KD's metabolic impact, especially on blood and cerebrospinal fluid (CSF). Our study involved participants at risk for AD, either cognitively normal or with mild cognitive impairment.
Objective: Sepsis is one of the most serious hospital conditions associated with high mortality. Sepsis is the result of a dysregulated immune response to infection that can lead to multiple organ dysfunction and death. Due to the wide variability in the causes of sepsis, clinical presentation, and the recovery trajectories, identifying sepsis sub-phenotypes is crucial to advance our understanding of sepsis characterization, to choose targeted treatments and optimal timing of interventions, and to improve prognostication.
The acquisition of exogenous mobile genetic material imposes an adaptive burden on bacteria, whereas the adaptational evolution of virulence plasmids upon entry into carbapenem-resistant Klebsiella pneumoniae (CRKP) and its impact remain unclear. To better understand the virulence in CRKP, we characterize virulence plasmids utilizing a large genomic dataset containing 1219 K. pneumoniae from our long-term surveillance and publicly accessible databases.
This letter considers the use of machine learning algorithms for predicting cocaine use based on magnetic resonance imaging (MRI) connectomic data. The study used functional MRI (fMRI) and diffusion MRI (dMRI) data collected from 275 individuals, which was then parcellated into 246 regions of interest (ROIs) using the Brainnetome atlas. After data preprocessing, the data sets were transformed into tensor form.
Cellular senescence is a well-established driver of aging and age-related diseases. There are many challenges to mapping senescent cells in tissues such as the absence of specific markers and their relatively low abundance and vast heterogeneity. Single-cell technologies have allowed unprecedented characterization of senescence; however, many methodologies fail to provide spatial insights.
Electronic Health Record (EHR) data are captured over time as patients receive care. Accordingly, variations among patients, such as when a patient presents for care during the course of a disease, introduce bias into standard longitudinal EHR data analysis methods. We, therefore, aim to provide an alignment method that reduces this bias.
IEEE Trans Inf Theory
September 2022
We study sparse group Lasso for high-dimensional double sparse linear regression, where the parameter of interest is simultaneously element-wise and group-wise sparse. This problem is an important instance of the simultaneously structured model - an actively studied topic in statistics and machine learning. In the noiseless case, matching upper and lower bounds on sample complexity are established for the exact recovery of sparse vectors and for stable estimation of approximately sparse vectors, respectively.
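As a rough illustration of the double sparsity described above, the sketch below writes out a sparse group Lasso penalty and its proximal operator in NumPy. The function names, the group layout, and the weights `lam1` and `lam2` are assumptions made for this example; the abstract does not specify them, and this is not the authors' estimator or analysis.

```python
import numpy as np

def sparse_group_lasso_penalty(beta, groups, lam1, lam2):
    """lam1 * ||beta||_1 + lam2 * sum_g ||beta_g||_2 (hypothetical helper)."""
    l1_part = lam1 * np.sum(np.abs(beta))
    group_part = lam2 * sum(np.linalg.norm(beta[g]) for g in groups)
    return l1_part + group_part

def prox_sparse_group_lasso(v, groups, lam1, lam2):
    """Proximal operator of the penalty above for non-overlapping groups.

    Step 1: element-wise soft-thresholding (element-wise sparsity).
    Step 2: group-wise shrinkage of the result (group-wise sparsity).
    """
    u = np.sign(v) * np.maximum(np.abs(v) - lam1, 0.0)
    out = np.zeros_like(u)
    for g in groups:
        norm = np.linalg.norm(u[g])
        if norm > lam2:
            out[g] = (1.0 - lam2 / norm) * u[g]
    return out

# Toy input: two groups of two coordinates each (illustrative values only).
v = np.array([3.0, -0.5, 0.2, 0.1])
groups = [[0, 1], [2, 3]]
shrunk = prox_sparse_group_lasso(v, groups, lam1=0.5, lam2=1.0)
print(shrunk)
print(sparse_group_lasso_penalty(shrunk, groups, lam1=0.5, lam2=1.0))
```

In this toy input the second group is zeroed out entirely, while the first group keeps a single nonzero coordinate, which is the "simultaneously element-wise and group-wise sparse" pattern.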
IEEE Trans Inf Theory
June 2022
This paper studies a general framework for high-order tensor SVD. We propose a new computationally efficient algorithm, tensor-train orthogonal iteration (TTOI), that aims to estimate the low tensor-train rank structure from the noisy high-order tensor observation. The proposed TTOI consists of initialization via TT-SVD [1] and new iterative backward/forward updates.
Carbapenem-resistant hypervirulent Klebsiella pneumoniae (CR-hvKP) is recognized as a threat worldwide, but the mechanisms underlying its emergence remain unclear. As most CR-hvKP isolates are not hypermucoviscous, we speculated that the evolution of the capsule might result in the convergence of carbapenem resistance and hypervirulence. Here, 2,096 K.
Rapid and accurate detection can help optimize patient treatment and improve infection control against nosocomial carbapenemase-producing organisms (CPO). In this study, a total of 217 routine clinical isolates ( and ), including 178 CPOs and 39 non-CPOs, were tested to evaluate the performance of six phenotypic carbapenemase detection and classification assays, i.e.
Objective: Postoperative hydrocephalus occurs in one-third of children after posterior fossa tumor resection. Although models to predict the need for CSF diversion after resection exist for preoperative variables, it is unknown which postoperative variables predict the need for CSF diversion. In this study, the authors sought to determine the clinical and radiographic predictors for CSF diversion in children following posterior fossa tumor resection.
In this paper, we propose a general framework for sparse and low-rank tensor estimation from cubic sketchings. A two-stage non-convex implementation is developed based on sparse tensor decomposition and thresholded gradient descent, which ensures exact recovery in the noiseless case and stable recovery in the noisy case with high probability. The non-asymptotic analysis sheds light on an interplay between optimization error and statistical error.
Model reduction of Markov processes is a basic problem in modeling state-transition systems. Motivated by the state aggregation approach rooted in control theory, we study the statistical state compression of a discrete-state Markov chain from empirical trajectories. Through the lens of spectral decomposition, we study the rank and features of Markov processes, as well as properties like representability, aggregability, and lumpability.
Tensor singular value decomposition (SVD) is a method to find a low-dimensional representation of data with meaningful structure in three or more dimensions. Tensor SVD has been applied to denoise atomic-resolution 4D scanning transmission electron microscopy (4D STEM) data. On data simulated from a SrTiO3 [100] perfect crystal and a Si [110] edge dislocation, tensor SVD achieved an average peak signal-to-noise ratio (PSNR) of ~40 dB, which matches or exceeds the performance of other denoising methods, with processing times at least 100 times shorter.
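To make the idea of low-rank tensor denoising and the PSNR metric concrete, here is a minimal NumPy sketch. It uses a plain truncated higher-order SVD as a stand-in, which is not necessarily the algorithm used in the paper, and the tensor sizes, ranks, noise level, and the peak convention in `psnr` are all assumptions chosen for the toy example. It does not reproduce the reported ~40 dB figure.

```python
import numpy as np

def unfold(T, mode):
    """Matricize T so that the chosen mode indexes the rows."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def mode_product(T, M, mode):
    """Multiply tensor T by matrix M along the given mode."""
    out = np.tensordot(M, T, axes=(1, mode))
    return np.moveaxis(out, 0, mode)

def hosvd_denoise(Y, ranks):
    """Project Y onto the leading left singular subspace of each unfolding."""
    X = Y
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(Y, mode), full_matrices=False)
        X = mode_product(X, U[:, :r] @ U[:, :r].T, mode)
    return X

def psnr(reference, estimate):
    """PSNR in dB, taking the peak as max |reference| (one common convention)."""
    mse = np.mean((reference - estimate) ** 2)
    peak = np.max(np.abs(reference))
    return 10.0 * np.log10(peak ** 2 / mse)

# Synthetic rank-(2, 2, 2) signal plus Gaussian noise (illustrative only).
rng = np.random.default_rng(0)
core = 50.0 * rng.standard_normal((2, 2, 2))
factors = [np.linalg.qr(rng.standard_normal((20, 2)))[0] for _ in range(3)]
signal = np.einsum("abc,ia,jb,kc->ijk", core, *factors)
noisy = signal + 0.1 * rng.standard_normal(signal.shape)

denoised = hosvd_denoise(noisy, ranks=(2, 2, 2))
psnr_noisy = psnr(signal, noisy)
psnr_denoised = psnr(signal, denoised)
print(f"PSNR noisy: {psnr_noisy:.1f} dB, denoised: {psnr_denoised:.1f} dB")
```

The projection keeps only a small subspace in each mode, so most of the noise energy is discarded while the low-rank signal passes through largely unchanged.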