This article describes the implementation of a system that organizes vast document collections according to textual similarity. The system is based on the self-organizing map (SOM) algorithm and uses statistical representations of the documents' vocabularies as feature vectors. The main goal of our work has been to scale up the SOM algorithm so that it can handle large amounts of high-dimensional data. In a practical experiment, we mapped 6,840,568 patent abstracts onto a 1,002,240-node SOM. The feature vectors were 500-dimensional vectors of stochastic figures obtained as random projections of weighted word histograms.
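The pipeline named in the abstract (weighted word histogram, random projection, SOM node lookup) can be illustrated with a toy sketch. This is not the authors' implementation: the dense Gaussian projection matrix, the online update rule, the Gaussian neighborhood, and all function names and sizes below are assumptions chosen for brevity, and a real system at the scale described would need far more efficient data structures.

```python
import math
import random


def random_projection_matrix(n_words, n_dims, seed=0):
    """Build an n_dims x n_words matrix of Gaussian random entries (assumed form)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(n_words)] for _ in range(n_dims)]


def project(histogram, matrix):
    """Map a weighted word histogram to a lower-dimensional feature vector."""
    return [sum(r * h for r, h in zip(row, histogram)) for row in matrix]


def best_matching_unit(vector, codebook):
    """Return the index of the node whose model vector is closest to `vector`."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sq_dist(vector, codebook[i]))


def som_update(codebook, grid, vector, bmu, lr, radius):
    """One online SOM step: move nodes near the BMU (on the grid) toward `vector`."""
    br, bc = grid[bmu]
    for i, (r, c) in enumerate(grid):
        grid_d2 = (r - br) ** 2 + (c - bc) ** 2
        h = lr * math.exp(-grid_d2 / (2.0 * radius ** 2))  # neighborhood weight
        codebook[i] = [m + h * (x - m) for m, x in zip(codebook[i], vector)]


# Toy usage: a 6-word vocabulary projected to 3 dimensions, mapped on a 2x2 SOM.
hist = [2.0, 0.0, 1.5, 0.0, 0.0, 3.0]           # hypothetical weighted word counts
R = random_projection_matrix(n_words=6, n_dims=3, seed=42)
x = project(hist, R)

grid = [(r, c) for r in range(2) for c in range(2)]
init = random.Random(1)
codebook = [[init.uniform(-1.0, 1.0) for _ in range(3)] for _ in grid]

bmu = best_matching_unit(x, codebook)
som_update(codebook, grid, x, bmu, lr=0.5, radius=1.0)
print("document mapped to node", grid[bmu])
```

The point of the projection step is that the node search and update operate on the short projected vector rather than on the full vocabulary-sized histogram, which is what makes very large maps tractable.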
Source: http://dx.doi.org/10.1109/72.846729 (DOI listing)
Sci Rep
January 2025
Department of Radiation Oncology, Henry Ford Hospital, Detroit, USA.
Best current practice in the analysis of dynamic contrast enhanced (DCE)-MRI is to employ a voxel-by-voxel model selection from a hierarchy of nested models. This nested model selection (NMS) assumes that the observed time-trace of contrast-agent (CA) concentration within a voxel corresponds to a singular physiologically nested model. However, admixtures of different models may exist within a voxel's CA time-trace.
Sci Rep
January 2025
Faculty of Education, Universiti Kebangsaan Malaysia, Bangi, Selangor, 43600, Malaysia.
To improve the scientific accuracy and precision of children's physical fitness evaluations, this study proposes a model that combines self-organizing maps (SOM) neural networks with cluster analysis. Existing evaluation methods often rely on traditional, single statistical analyses, which struggle to handle the complexity of high-dimensional, nonlinear data, resulting in a lack of precision and personalization. This study uses the SOM neural network to reduce the dimensionality of high-dimensional health data.
Sci Rep
January 2025
Department of Radiation Oncology, University of Maryland School of Medicine, Baltimore, MD, USA.
This study addresses the limited noninvasive tools for Head and Neck Squamous Cell Carcinoma (HNSCC) progression-free survival (PFS) prediction by identifying Computed Tomography (CT)-based biomarkers for predicting prognosis. A retrospective analysis was conducted on data from 203 HNSCC patients. An ensemble feature selection involving correlation analysis, univariate survival analysis, best-subset selection, and the LASSO-Cox algorithm was used to select functional features, which were then used to build final Cox Proportional Hazards models (CPH).
J Immunother Cancer
January 2025
Providence Portland Medical Center, Portland, Oregon, USA.
Objectives: Multiplex immunohistochemistry and immunofluorescence (mIHC/IF) are emerging technologies that can be used to help define complex immunophenotypes in tissue, quantify immune cell subsets, and assess the spatial arrangement of marker expression. mIHC/IF assays require concerted efforts to optimize and validate the multiplex staining protocols prior to their application on slides. The best practice guidelines for staining and validation of mIHC/IF assays across platforms were previously published by this task force.
Ann Agric Environ Med
December 2024
Faculty of Environmental Engineering, Lublin University of Technology, Lublin, Poland.
Objective: The aim of the study is to verify whether the electronic nose system - an array of 17 gas sensors with a signal analysis system - is a useful tool for the classification and preliminary assessment of the quality of drainage water.
Material and Methods: Water samples for analysis were collected in the Park Ludowy (People's Park), located next to the Bystrzyca River, near the city center of Lublin in eastern Poland. Drainage water was sampled at 4 different points.