Many applications use data that are best represented as a binary matrix, such as click-stream data, market basket data, document-term data, user-permission data in access control, and others. Matrix factorization methods are widely used tools for the analysis of high-dimensional data, as they automatically extract sparse and meaningful features from data vectors. However, existing matrix factorization methods do not work well for binary data. One crucial limitation is interpretability: many matrix factorization methods decompose an input matrix into matrices with fractional or even negative components, which are hard to interpret in many real settings. Some matrix factorization methods, like binary matrix factorization, do limit decomposed matrices to binary values. However, these models are not flexible enough to accommodate some data analysis tasks, like trading off summary size against quality and discriminating between different types of approximation errors. To address those issues, this article presents weighted rank-one binary matrix factorization, which approximates a binary matrix by the product of two binary vectors, with parameters controlling different types of approximation errors. By systematically running weighted rank-one binary matrix factorization, one can effectively perform various binary data analysis tasks, like compression, clustering, and pattern discovery. Theoretical properties of weighted rank-one binary matrix factorization are investigated, and its connections to problems in other research domains are examined. As weighted rank-one binary matrix factorization is NP-hard in general, efficient and effective algorithms are presented. Extensive studies on applications of weighted rank-one binary matrix factorization are also conducted.
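To make the idea of a weighted rank-one binary factorization concrete, here is a minimal Python sketch. It is not the article's algorithm: the abstract does not state the objective or the solvers, so the cost function below (a penalty `w_miss` for each 1 left uncovered and a penalty `w_false` for each 0 covered by mistake), the alternating update heuristic, and all function and parameter names are assumptions made purely for illustration.

```python
def weighted_error(A, x, y, w_miss=1.0, w_false=1.0):
    """Weighted mismatch count between A and the outer product of x and y.

    Assumed objective (not taken from the article): each 1 in A that the
    product misses costs w_miss; each 0 in A that the product covers
    costs w_false.
    """
    cost = 0.0
    for i, row in enumerate(A):
        for j, a in enumerate(row):
            approx = x[i] & y[j]
            if a == 1 and approx == 0:
                cost += w_miss
            elif a == 0 and approx == 1:
                cost += w_false
    return cost


def rank_one_bmf(A, w_miss=1.0, w_false=1.0, max_iter=20):
    """Heuristic alternating updates for one binary rank-one pattern.

    Hypothetical helper: returns binary vectors (x, y) found by local
    search, with no guarantee of optimality (the problem is NP-hard).
    """
    m, n = len(A), len(A[0])
    # Seed the column pattern with the densest row of A.
    seed = max(range(m), key=lambda i: sum(A[i]))
    y = list(A[seed])
    x = [0] * m
    for _ in range(max_iter):
        # Select row i only if covering it lowers the weighted error.
        new_x = [
            1 if sum(w_miss if A[i][j] else -w_false
                     for j in range(n) if y[j]) > 0 else 0
            for i in range(m)
        ]
        # Same rule for columns, given the rows just selected.
        new_y = [
            1 if sum(w_miss if A[i][j] else -w_false
                     for i in range(m) if new_x[i]) > 0 else 0
            for j in range(n)
        ]
        if new_x == x and new_y == y:
            break  # no change, so a local optimum has been reached
        x, y = new_x, new_y
    return x, y
```

Under these assumptions, lowering `w_false` relative to `w_miss` makes the sketch tolerate more covered zeros and so return a larger, noisier pattern, while raising it yields a smaller, exact one. This is one simple way to picture how weights could trade off different types of approximation error.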
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7695232 | PMC |
| http://dx.doi.org/10.1145/3386599 | DOI Listing |
Front Immunol
January 2025
Department of Neurological Care Unit, The First Affiliated Hospital of YangTze University, Jingzhou, Hubei, China.
Background: Recent years have seen persistently poor prognoses for glioma patients. Therefore, exploring the molecular subtyping of gliomas, identifying novel prognostic biomarkers, and understanding the characteristics of their immune microenvironments are crucial for improving treatment strategies and patient outcomes.
Methods: We integrated glioma datasets from multiple sources, employing Non-negative Matrix Factorization (NMF) to cluster samples and filter for differentially expressed metabolic genes.
Front Immunol
January 2025
Department of Radiation Oncology, Lianyungang Second People's Hospital (Lianyungang Tumor Hospital), Lianyungang, China.
Background: Hepatocellular carcinoma (LIHC) poses a significant health challenge worldwide, primarily due to late-stage diagnosis and the limited effectiveness of current therapies. Cancer stem cells are known to play a role in tumor development, metastasis, and resistance to treatment. A thorough understanding of genes associated with stem cells is crucial for improving the diagnostic precision of LIHC and for the advancement of effective immunotherapy approaches.
Sci Rep
January 2025
Department of Systems Biology and Bioinformatics, University of Rostock, Rostock, 18051, Germany.
Drug development is known to be a costly and time-consuming process, which is prone to high failure rates. Drug repurposing allows drug discovery by reusing already approved compounds. The outcomes of past clinical trials can be used to predict novel drug-disease associations by leveraging drug- and disease-related similarities.
Front Immunol
January 2025
Department of Hepatobiliary Surgery, Daping Hospital, Army Medical University, Chongqing, China.
Background: Hepatocellular carcinoma (HCC) is a common malignant tumor of the digestive system with a high incidence that seriously threatens patients' lives and health. However, with the rise and application of new treatments, such as immunotherapy, there are still some restrictions in the treatment and diagnosis of HCC, and the therapeutic effects on patients are not ideal.
Methods: Two single-cell RNA sequencing (scRNA-seq) datasets from HCC patients, encompassing 25,189 cells, were analyzed in the study.
J Headache Pain
January 2025
Clinical Systems Biology Laboratories, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China.
Background: Migraine is a complex neurological disorder characterized by recurrent episodes of severe headaches. Although genetic factors have been implicated, the precise molecular mechanisms, particularly gene expression patterns in migraine-associated brain regions, remain unclear. This study applies machine learning techniques to explore region-specific gene expression profiles and identify critical gene programs and transcription factors linked to migraine pathogenesis.