Publications by authors named "Sebastian Ventura"

In recent years, significant attention has been paid to fuzzy recommender systems for housing, highlighting their ability to effectively handle the imprecision and uncertainty inherent in the real estate market. With the objective of improving the filtering of recommendations in the real estate sector, the PRISMA 2020 methodology and its checklist were applied to perform new systematic reviews of six academic databases covering 1985 to 2024. RawGraph, Orange Data Mining, Jamovi and R were used for document classification and data visualization.

Background: Dementia, of which Alzheimer's disease (AD) is the most common type, is an under-diagnosed health problem in older people. Building classification models from AD risk factors using Deep Learning is a promising way to minimize the impact of under-diagnosis.

Objective: To develop a Deep Learning model that uses clinical data from patients with dementia to classify whether they have AD.
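A minimal sketch of what such a model could look like, assuming the clinical risk factors are encoded as a fixed-length numeric vector per patient; the feature count, layer sizes and training step below are illustrative assumptions, not the architecture developed in the study.

# Hypothetical sketch: feed-forward binary classifier (AD vs. non-AD dementia)
# built on tabular clinical risk-factor data. Feature count is a placeholder.
import torch
import torch.nn as nn

n_features = 20  # assumed number of clinical risk factors

model = nn.Sequential(
    nn.Linear(n_features, 64), nn.ReLU(), nn.Dropout(0.3),
    nn.Linear(64, 32), nn.ReLU(),
    nn.Linear(32, 1),            # single logit for the AD class
)
loss_fn = nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

def train_step(x, y):
    # x: [batch, n_features] clinical vectors, y: [batch, 1] AD labels in {0, 1}
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
    return loss.item()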

Background: Lung neuroendocrine neoplasms (LungNENs) comprise a heterogeneous group of tumors ranging from indolent lesions with good prognosis to highly aggressive cancers. Carcinoids are the rarest LungNENs; they display low to intermediate malignancy and can be managed surgically, but are resistant to radiotherapy/chemotherapy in metastatic cases. Molecular profiling is providing new information to understand lung carcinoids, but its clinical value is still limited.

Early melanoma diagnosis is the most important factor in the treatment of skin cancer and can effectively reduce mortality rates. Recently, Generative Adversarial Networks have been used to augment data, prevent overfitting and improve the diagnostic capacity of models. However, their application remains challenging due to the high levels of inter- and intra-class variance seen in skin images, the limited amounts of data, and model instability.
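A minimal DCGAN-style sketch of how such augmentation could be set up, assuming 3x64x64 dermoscopic images; the latent size and layer configuration are illustrative assumptions rather than the architecture proposed in the paper.

# Hypothetical sketch: generator and discriminator for GAN-based augmentation
# of skin-lesion images (assumed 3x64x64). Trained adversarially, the generator
# can then synthesise extra minority-class images to reduce overfitting.
import torch
import torch.nn as nn

latent_dim = 100  # assumed noise-vector size

generator = nn.Sequential(            # noise [B, 100, 1, 1] -> image [B, 3, 64, 64]
    nn.ConvTranspose2d(latent_dim, 256, 4, 1, 0), nn.BatchNorm2d(256), nn.ReLU(True),
    nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.ReLU(True),
    nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.BatchNorm2d(64), nn.ReLU(True),
    nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.BatchNorm2d(32), nn.ReLU(True),
    nn.ConvTranspose2d(32, 3, 4, 2, 1), nn.Tanh(),
)

discriminator = nn.Sequential(        # image [B, 3, 64, 64] -> real/fake logit [B, 1]
    nn.Conv2d(3, 64, 4, 2, 1), nn.LeakyReLU(0.2),
    nn.Conv2d(64, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.LeakyReLU(0.2),
    nn.Conv2d(128, 256, 4, 2, 1), nn.BatchNorm2d(256), nn.LeakyReLU(0.2),
    nn.Conv2d(256, 1, 8), nn.Flatten(),
)

fake_lesions = generator(torch.randn(16, latent_dim, 1, 1))  # synthetic batch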

Teacher evaluation is presented as an object of study of great interest, where multiple efforts converge to establish models from the association of heterogeneous data from academic actors; among these, the student community stands out for contributing rich data for establishing teacher evaluation in higher education. This study presents the results of a search for references on the prediction of teacher evaluation based on data associated with university students' performance. For this purpose, a systematic literature review was carried out, structured in the phases of planning (search objective, research questions, inclusion and exclusion criteria), search and selection (literature control group and keywords, definition of the search string, filtering of results), and extraction (synthesis of the contributions).

Dysregulation of the splicing machinery is emerging as a hallmark in cancer due to its association with multiple dysfunctions in tumor cells. Inappropriate function of this machinery can generate tumor-driving splicing variants and trigger oncogenic actions. However, its role in pancreatic neuroendocrine tumors (PanNETs) is poorly defined.

Background: Pancreatic ductal adenocarcinoma (PDAC) is a highly lethal cancer, requiring novel treatments to target both cancer cells and cancer stem cells (CSCs). Altered splicing is emerging as both a novel cancer hallmark and an attractive therapeutic target. The core splicing factor SF3B1 is heavily altered in cancer and can be inhibited by Pladienolide-B, but its actionability in PDAC is unknown.

Skin cancer is one of the most common types of cancer in the world, with melanoma being the most lethal form. Automatic melanoma diagnosis from skin images has recently gained attention within the machine learning community, due to the complexity involved. In the past few years, convolutional neural network models have commonly been used to approach this issue.

In this paper we present a Competitive Rate-Based Algorithm (CRBA) that approximates the operation of a Competitive Spiking Neural Network (CSNN). CRBA models the competition between neurons during a sample presentation, which can be reduced to ranking the neurons by a dot-product operation combined with a discrete Expectation-Maximization algorithm; the latter is equivalent to the spike-timing-dependent plasticity rule. CRBA's performance is compared with that of CSNN on the MNIST and Fashion-MNIST datasets.
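A schematic sketch of the competitive step described above, assuming unit-normalized weight vectors and a single winner per presentation; the shapes, learning rate and update rule are illustrative simplifications, not the exact CRBA procedure.

# Schematic sketch: competition reduced to ranking neurons by a dot product,
# followed by an STDP-like update that pulls the winner's weights toward the
# presented sample. Sizes and learning rate are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n_neurons, n_inputs = 100, 784                 # e.g. MNIST-sized inputs
W = rng.random((n_neurons, n_inputs))
W /= np.linalg.norm(W, axis=1, keepdims=True)  # unit-length weight vectors

def present_sample(x, lr=0.05):
    scores = W @ x                             # ranking via dot products
    winner = int(np.argmax(scores))            # competition: best-matching neuron
    W[winner] += lr * (x - W[winner])          # discrete EM / STDP-like nudge
    W[winner] /= np.linalg.norm(W[winner])
    return winner

winner = present_sample(rng.random(n_inputs))  # one sample presentation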

Background: A gene dataset used to predict hepatitis C virus outcome was evaluated in a previous study using conventional statistical methodology.

Objective: The aim of this study was to reanalyze the same dataset using a data mining approach in order to find models that improve the classification accuracy of the genes studied.

Methods: We built predictive models using different subsets of factors, selected according to their importance in predicting patient classification.
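One plausible way to realize this kind of importance-driven subset construction, sketched with a random-forest importance ranking and cross-validation; the synthetic data, model choice and subset sizes are placeholders rather than the study's actual pipeline.

# Hypothetical sketch: rank factors by random-forest importance, then compare
# classifiers trained on progressively larger top-k subsets of factors.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=200, n_features=30, random_state=0)  # stand-in data

importances = RandomForestClassifier(random_state=0).fit(X, y).feature_importances_
ranking = np.argsort(importances)[::-1]               # most important factors first

for k in (5, 10, 20, 30):
    subset = ranking[:k]
    acc = cross_val_score(RandomForestClassifier(random_state=0),
                          X[:, subset], y, cv=5).mean()
    print(f"top-{k} factors: CV accuracy = {acc:.3f}")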

Glioblastomas remain the deadliest brain tumour, with a dismal ∼12-16-month survival from diagnosis. Therefore, identification of new diagnostic, prognostic and therapeutic tools to tackle glioblastomas is urgently needed. Emerging evidence indicates that the cellular machinery controlling the splicing process (spliceosome) is altered in tumours, leading to oncogenic splicing events associated with tumour progression and aggressiveness.

Melanoma is the type of skin cancer with the highest mortality, and it is more dangerous because it can spread to other parts of the body if not caught and treated early. Melanoma diagnosis is a complex task, even for expert dermatologists, mainly due to the great variety of morphologies in patients' moles. Accordingly, automatic melanoma diagnosis poses the challenge of developing efficient computational methods that ease the diagnosis and, therefore, aid dermatologists in decision-making.

Deregulated splicing machinery components have been shown to be associated with the development of several types of cancer; therefore, determining such alterations can help the development of tumor-specific molecular targets for early prognosis and therapy. Determining such splicing components, however, is not a straightforward task, mainly due to the heterogeneity of tumors, the variability across samples, and the fat-short nature of genomic datasets (many features, few samples). In this work, a supervised machine learning-based methodology is proposed, allowing the determination of subsets of relevant splicing components that best discriminate samples.
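A generic sketch of this kind of discriminative subset selection, using recursive feature elimination with a linear SVM on synthetic stand-in data; the estimator, scoring metric and data shapes are assumptions, not the specific methodology proposed in the work.

# Generic sketch: pick the subset of splicing-machinery components that best
# separates two sample groups, on a deliberately "fat-short" synthetic matrix
# (many expression features, few samples).
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=60, n_features=300, n_informative=10,
                           random_state=1)            # stand-in expression data

selector = RFECV(LinearSVC(C=0.1, max_iter=5000), step=10,
                 cv=StratifiedKFold(5), scoring="balanced_accuracy")
selector.fit(X, y)
print("selected components:", selector.support_.nonzero()[0])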

Background: Dysregulation of splicing variant (SV) expression has recently emerged as a novel cancer hallmark. Although the generation of aberrant SVs (e.g.

Multilabel learning is a challenging task demanding scalable methods for large-scale data. Feature selection has been shown to improve multilabel accuracy while defying the curse of dimensionality of high-dimensional scattered data. However, the increasing complexity of multilabel feature selection, especially on continuous features, requires new approaches to manage data effectively and efficiently in distributed computing environments.

Multi-target regression (MTR) comprises the prediction of multiple continuous target variables from a common set of input variables. There are two major challenges when addressing the MTR problem: the exploration of the inter-target dependencies and the modeling of complex input-output relationships. This paper proposes a neural network model that is able to simultaneously address these two challenges in a flexible way.
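A minimal sketch of one common way to address both challenges within a single network: a shared trunk that learns a joint representation of the inputs and small per-target heads on top of it; the layer sizes are illustrative and this is not necessarily the architecture proposed in the paper.

# Hypothetical sketch: shared trunk + per-target heads for multi-target regression.
import torch
import torch.nn as nn

class MTRNet(nn.Module):
    def __init__(self, n_inputs=10, n_targets=3, hidden=64):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(n_inputs, hidden), nn.ReLU(),
                                   nn.Linear(hidden, hidden), nn.ReLU())
        self.heads = nn.ModuleList(nn.Linear(hidden, 1) for _ in range(n_targets))

    def forward(self, x):
        h = self.trunk(x)                                   # shared representation
        return torch.cat([head(h) for head in self.heads], dim=1)  # [B, n_targets]

model = MTRNet()
loss = nn.MSELoss()(model(torch.randn(8, 10)), torch.randn(8, 3))  # toy batch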

Context: Nonalcoholic fatty liver disease (NAFLD) is a common obesity-associated pathology characterized by hepatic fat accumulation, which can progress to fibrosis, cirrhosis, and hepatocellular carcinoma. Obesity is associated with profound changes in gene-expression patterns of the liver, which could contribute to the onset of comorbidities.

Objective: As these alterations might be linked to a dysregulation of the splicing process, we aimed to determine whether the dysregulation in the expression of splicing machinery components could be associated with NAFLD.

Background: Type-2 diabetes mellitus (T2DM) is a major health problem with increasing incidence, which severely impacts cardiovascular disease. Because T2DM is associated with altered gene expression and aberrant splicing, we hypothesized that dysregulations in splicing machinery could precede, contribute to, and predict T2DM development.

Methods: A cohort of patients with cardiovascular disease (CORDIOPREV study) and without T2DM at baseline (at the inclusion of the study) was used (n = 215).

Pattern mining is one of the most important tasks for extracting meaningful and useful information from raw data. This task aims to extract itemsets that represent any type of homogeneity and regularity in data. Although many efficient algorithms have been developed in this regard, the growing interest in data has caused the performance of existing pattern mining techniques to drop.

Real-world data usually comprise features whose interpretation depends on some contextual information. Such context-sensitive features and patterns are of great interest to discover and analyze in order to obtain their correct meaning. This paper formulates the problem of mining context-aware association rules, which refers to the search for associations between itemsets such that the strength of their implication depends on a contextual feature.
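A toy illustration of the idea: the confidence of the rule {bread} -> {butter} is computed separately for each value of a contextual feature (here, the season); the transactions and context values are made up for the example.

# Toy example: the same rule is strong in one context and weak in another.
transactions = [
    ({"bread", "butter"}, "winter"),
    ({"bread"},           "winter"),
    ({"bread", "butter"}, "winter"),
    ({"bread"},           "summer"),
    ({"bread"},           "summer"),
    ({"bread", "butter"}, "summer"),
]

def confidence(antecedent, consequent, context):
    # confidence of antecedent -> consequent restricted to one context value
    covered = [items for items, ctx in transactions if ctx == context and antecedent <= items]
    hits = [items for items in covered if consequent <= items]
    return len(hits) / len(covered) if covered else 0.0

for season in ("winter", "summer"):
    print(season, confidence({"bread"}, {"butter"}, season))
# winter ~ 0.67, summer ~ 0.33: the rule's strength depends on the context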

The growing interest in data storage has caused data sizes to increase exponentially, hampering the process of knowledge discovery from these large volumes of high-dimensional and heterogeneous data. In recent years, many efficient algorithms for mining data associations have been proposed, addressing time and main-memory requirements. Nevertheless, this mining process can still become hard when the number of items and records is extremely high.

This paper proposes a novel grammar-guided genetic programming algorithm for subgroup discovery. This algorithm, called comprehensible grammar-based algorithm for subgroup discovery (CGBA-SD), combines the requirements of discovering comprehensible rules with the ability to mine expressive and flexible solutions owing to the use of a context-free grammar. Each rule is represented as a derivation tree that shows a solution described using the language denoted by the grammar.
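A tiny illustration of the kind of context-free grammar that could define such comprehensible rules, together with one random derivation; the grammar, attributes and values are hypothetical and not CGBA-SD's actual grammar.

# Hypothetical grammar: each subgroup rule is a string derived from the grammar,
# and its derivation tree is the structure genetic operators would manipulate.
import random

grammar = {
    "<rule>":       [["IF", "<conditions>", "THEN", "<class>"]],
    "<conditions>": [["<condition>"], ["<condition>", "AND", "<conditions>"]],
    "<condition>":  [["<attribute>", "<op>", "<value>"]],
    "<attribute>":  [["age"], ["income"]],
    "<op>":         [["<"], [">="]],
    "<value>":      [["30"], ["50000"]],
    "<class>":      [["buyer"], ["non-buyer"]],
}

def derive(symbol, rng):
    # expand a non-terminal by picking one of its productions at random
    if symbol not in grammar:
        return [symbol]
    production = rng.choice(grammar[symbol])
    return [token for part in production for token in derive(part, rng)]

print(" ".join(derive("<rule>", random.Random(0))))  # e.g. "IF age < 30 THEN buyer"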

Management of hyperglycemia in hospitalized patients has a significant bearing on outcome, in terms of both morbidity and mortality. However, there are few national assessments of diabetes care during hospitalization which could serve as a baseline for change. This analysis of a large clinical database (74 million unique encounters corresponding to 17 million unique patients) was undertaken to provide such an assessment and to find future directions which might lead to improvements in patient safety.

Gravitation is a fundamental interaction whose concept and effects, applied to data classification, yield a novel classification technique. The simple principle of data gravitation classification (DGC) is to classify data samples by comparing the gravitation exerted by the different classes. However, calculating gravitation is not a trivial problem, due to the varying relevance of data attributes for distance computation, the presence of noisy or irrelevant attributes, and the class imbalance problem.
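A minimal sketch of the basic DGC principle, treating every training sample as a unit mass and predicting the class that exerts the strongest total pull on a query point; the attribute weighting and imbalance handling mentioned above are deliberately left out.

# Minimal sketch: gravitation-like pull of each class = sum of 1 / squared
# distance over that class's training samples; predict the strongest pull.
import numpy as np

def dgc_predict(X_train, y_train, x, eps=1e-9):
    d2 = np.sum((X_train - x) ** 2, axis=1) + eps          # squared distances
    return max(np.unique(y_train),
               key=lambda c: np.sum(1.0 / d2[y_train == c]))

X_train = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y_train = np.array([0, 0, 1, 1])
print(dgc_predict(X_train, y_train, np.array([0.2, 0.1])))  # -> 0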

Objective: We introduce a web-based adaptive training simulator system to exercise cardiopulmonary resuscitation skills. Our purpose is to provide emergency physicians with an additional training tool for cardiac life support clinical cases, by integrating an adaptive learning environment with a web-based case simulator.

Methods and Materials: Adaptive systems reflect some features of the user in the user model and apply this model to adapt various visible aspects of the system to the user.
