IEEE/ACM Trans Comput Biol Bioinform
August 2024
Categorical attributes are common in many classification tasks, presenting certain challenges as the number of categories grows. This situation can affect data handling, negatively impacting the building time of models, their complexity and, ultimately, their classification performance. In order to mitigate these issues, this research proposes a novel preprocessing technique for grouping attribute categories in classification datasets.
Acidophiles comprise a group of microorganisms adapted to live in acidic environments. Although acidophiles are usually associated with an autotrophic metabolism, more than 80 microorganisms capable of utilizing organic matter have been isolated from natural and man-made environments. The ability to reduce soluble and insoluble iron compounds has been described for many of these species and may be harnessed to develop new or improved mining processes when oxidative bioleaching is ineffective.
Clustering and spatial representation methods are often used in combination to analyse preference ratings when a large number of individuals and/or objects is involved. When analysed under an unfolding model, row-conditional linear transformations are usually most appropriate when the goal is to determine clusters of individuals with similar preferences. However, a significant problem with transformations that include both slope and intercept is the occurrence of degenerate solutions.
Biometrical sciences, and disease diagnosis in particular, are often concerned with the analysis of associations for cross-classified data, for which distance association models give us a graphical interpretation for non-sparse matrices with a low number of categories. In this framework, usually binary exploratory and response variables are present, with analysis based on individual profiles being of great interest. For saturated models, we show that the usual linear relationship for log-linear models is preserved in full dimension for the distance association parameterization.
Radiation therapy on its own plays a key role in the treatment of prostate cancer. For higher-risk disease, the risk of recurrence following single-modality therapy increases and a combination of treatment modalities may be necessary to achieve optimal results. We review clinical outcomes of adjuvant and salvage radiotherapy following radical prostatectomy, including disease-free survival, cancer-specific survival and overall survival.
Survey calibration is a widely used method to estimate the population mean or total score of a target variable, particularly in medical research. In this procedure, auxiliary information related to the variable of interest is used to recalibrate the estimation weights. However, when the auxiliary information includes qualitative variables, traditional calibration techniques may not be feasible or the optimisation procedure may fail.
β-Carbolines are naturally occurring bioactive alkaloids. In this work, carbohydrate-derived β-carbolines (βCs), 1-(1,3,4,5-tetrahydroxypent-1-yl)-β-carboline isomers, 1-(1,4,5-trihydroxypent-1-yl)-β-carboline, 1-(1,5-dihydroxypent-3-en-1-yl)-β-carboline, and 1-(1,2,3,4,5-pentahydroxypent-1-yl)-β-carboline were identified and analyzed in commercial foods. The concentrations of βCs in foods ranged from undetectable to 11.
In this article, we analyse the usefulness of multidimensional scaling in relation to performing K-means clustering on a dissimilarity matrix, when the dimensionality of the objects is unknown. In this situation, traditional algorithms cannot be used, and so K-means clustering procedures are performed directly on the basis of the observed dissimilarity matrix. Furthermore, the application of criteria originally formulated for two-mode data sets to determine the number of clusters depends on their possible reformulation in a one-mode situation.
In this paper a simple but effective procedure to avoid degeneracies in ordinal Unfolding for preference rank data based on the Kemeny distance is proposed. Considering Unfolding as a particular MDS procedure with missing within-set proximities, unknown proximities are first estimated using correlations related to the Kemeny distance, and then the complete proximity matrix is analyzed in a standard MDS framework. A simulation study shows that our proposal is able to both recover the order of the preferences and reproduce the position of both rankings and objects in a geometrical space.
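As a point of reference for the Kemeny distance mentioned above, the following is a minimal sketch of one common convention (the Kemeny and Snell score-matrix form), assuming rankings are given as vectors of rank positions where a smaller value means a more preferred object. The function names `score_matrix` and `kemeny_distance` are hypothetical, and this is not the estimation procedure proposed in the paper, which works with correlations related to this distance.

```python
import numpy as np

def score_matrix(ranking):
    """Pairwise score matrix of a ranking given as rank positions.

    Entry (i, j) is +1 if object i is ranked ahead of object j
    (smaller rank), -1 if it is ranked behind, and 0 if tied or i == j.
    """
    r = np.asarray(ranking, dtype=float)
    return np.sign(r[None, :] - r[:, None])

def kemeny_distance(rank_a, rank_b):
    """Half the sum of absolute differences between the two score matrices.

    Assumption: this follows the Kemeny-Snell convention; other texts
    scale the distance differently.
    """
    a = score_matrix(rank_a)
    b = score_matrix(rank_b)
    return 0.5 * np.abs(a - b).sum()

# Example: a ranking of three objects compared with its reversal
d_reversed = kemeny_distance([1, 2, 3], [3, 2, 1])
```

Under this convention, for complete rankings without ties the distance is twice the number of object pairs that the two rankings order differently.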
Multivariate Behav Res
January 2021
Distance association models constitute a useful tool for the analysis and graphical representation of cross-classified data in which distances between points inversely describe the association between two categorical variables. When the number of cells is large and the data counts result in sparse tables, the combination of clustering and representation reduces the number of parameters to be estimated and facilitates interpretation. In this article, a latent block distance-association model is proposed to apply block clustering to the outcomes of two categorical variables while the cluster centers are represented in a low dimensional space in terms of a distance-association model.
Melanoma of the urogenital tract is extremely rare, accounting for less than 0.1% of melanoma cases. The global literature currently describes only 220 cases of penile melanoma, most commonly located on the glans penis.
One of the main problems in cluster analysis is that of determining the number of groups in the data. In general, the approach taken depends on the cluster method used. For K-means, some of the most widely employed criteria are formulated in terms of the decomposition of the total point scatter, regarding a two-mode data set of N points in p dimensions, which are optimally arranged into K classes.
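The decomposition of the total point scatter referred to above can be checked numerically. The sketch below is illustrative only: it assumes NumPy, a hypothetical helper `scatter_decomposition`, and an arbitrary (not optimal) assignment of N = 30 points in p = 2 dimensions to K = 3 classes. It verifies the standard identity that the total sum of squares equals the within-class plus the between-class sum of squares; it does not implement any particular criterion from the article.

```python
import numpy as np

def scatter_decomposition(X, labels):
    """Return (total, within, between) sums of squares for labelled points."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    grand_mean = X.mean(axis=0)
    total = ((X - grand_mean) ** 2).sum()
    within = 0.0
    between = 0.0
    for k in np.unique(labels):
        members = X[labels == k]
        centroid = members.mean(axis=0)
        # Scatter of the class members around their own centroid
        within += ((members - centroid) ** 2).sum()
        # Scatter of the class centroid around the grand mean, weighted by class size
        between += len(members) * ((centroid - grand_mean) ** 2).sum()
    return total, within, between

# Synthetic data and an arbitrary labelling, both assumptions for illustration
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
labels = np.repeat([0, 1, 2], 10)
total, within, between = scatter_decomposition(X, labels)
```

Criteria for choosing K that are built on this decomposition typically compare the within and between terms across candidate values of K.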
Rev Invest Clin
August 2014
Cow's milk allergy (CMA) is an immune-based disease that has become an increasing problem. The diagnosis and management of CMA vary from one clinical setting to another and represent a challenge in pediatric practice. In addition, because nonallergic food reactions can be confused with CMA symptoms, the disease is overdiagnosed.
The occurrence of Campylobacter species in healthy, well-nourished and healthy, malnourished children of low socioeconomic level in Southern Chile was determined. Campylobacter carriers were significantly more frequent among malnourished (31.4%) than among well-nourished (9.
Cardiac resynchronization therapy (CRT) improves symptoms and functional status in heart failure patients; however, current selection criteria need improvement. A novel tissue Doppler imaging parameter, the peak velocity difference (PVD), defined as the greatest difference in time to peak velocity between any of 6 left ventricular regions, may better select responders to CRT. Subjects were divided into 2 groups based on the PVD.