Publications by authors named "Andre C P L F Carvalho"

Infants growing up in low- and middle-income countries are at increased risk of suffering adverse childhood experiences, including exposure to environmental pollution and lack of cognitive stimulation. In this study, we aimed to examine the levels of metals in the human milk of women living in São Paulo City, Brazil, and determine the effects on infants' neurodevelopment. To this end, a total of 185 human milk samples were analyzed for arsenic (As), lead (Pb), mercury (Hg), and cadmium (Cd) using inductively coupled plasma mass spectrometry (ICP-MS).

Machine Learning (ML) algorithms have been important tools for the extraction of useful knowledge from biological sequences, particularly in healthcare, agriculture, and the environment. However, the categorical and unstructured nature of these sequences usually requires additional feature engineering steps before an ML algorithm can be efficiently applied. The addition of these steps to the ML algorithm creates a processing pipeline, known as end-to-end ML.

The accurate classification of non-coding RNA (ncRNA) sequences is pivotal for advanced non-coding genome annotation and analysis, a fundamental aspect of genomics that facilitates understanding of ncRNA functions and regulatory mechanisms in various biological processes. While traditional machine learning approaches have been employed for distinguishing ncRNA, these often necessitate extensive feature engineering. Recently, deep learning algorithms have provided advancements in ncRNA classification.

Article Synopsis
  • Recent advancements in sequencing technology have led to an explosion of biological data, creating new challenges for analysis that necessitate the use of machine learning (ML) algorithms.
  • This study introduces a novel feature extractor based on Tsallis entropy to enhance the classification of biological sequences and evaluates its effectiveness through five case studies.
  • Results indicate that the Tsallis entropy method outperforms traditional Shannon entropy, demonstrating robust generalization and efficiency in dimensionality reduction compared to other techniques.
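
The synopsis above does not spell out how the extractor is built, so the following is only a minimal sketch of the general idea. It assumes the standard Tsallis entropy definition, S_q = (1 - sum_i p_i^q) / (q - 1), applied to k-mer frequencies of a sequence, with Shannon entropy as the q -> 1 limit. The function names, the choice of k-mers as the probability source, and the default q are illustrative assumptions, not the authors' implementation.

```python
import math
from collections import Counter


def kmer_probabilities(seq, k):
    """Relative frequencies of the overlapping k-mers found in seq."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return [c / total for c in counts.values()]


def tsallis_entropy(probs, q):
    """Tsallis entropy of a probability list; q == 1 falls back to Shannon (natural log)."""
    if not probs:
        return 0.0  # no k-mers observed (sequence shorter than k)
    if abs(q - 1.0) < 1e-12:
        return -sum(p * math.log(p) for p in probs if p > 0)
    return (1.0 - sum(p ** q for p in probs)) / (q - 1.0)


def tsallis_features(seq, max_k=3, q=2.0):
    """One entropy value per k in 1..max_k (hypothetical feature layout)."""
    return [tsallis_entropy(kmer_probabilities(seq, k), q)
            for k in range(1, max_k + 1)]
```

For example, `tsallis_features("ACGTACGT", max_k=3, q=2.0)` returns a three-element vector regardless of sequence length, which is the kind of compact representation the synopsis refers to when it mentions dimensionality reduction.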

The field of Continual Learning investigates the ability to learn consecutive tasks without losing performance on those previously learned. The efforts of researchers have been mainly focused on incremental classification tasks. Yet, we believe that continual object detection deserves even more attention due to its vast range of applications in robotics and autonomous vehicles.

Recent technological advances have led to an exponential expansion of biological sequence data and extraction of meaningful information through Machine Learning (ML) algorithms. This knowledge has improved the understanding of mechanisms related to several fatal diseases, e.g.

One of the main challenges in applying machine learning algorithms to biological sequence data is how to represent a sequence as a numeric input vector. Feature extraction techniques capable of extracting numerical information from biological sequences have been reported in the literature. However, many of these techniques, such as mathematical descriptors, are not available in existing packages.
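
To make the representation problem concrete, here is a minimal sketch of one common way to turn a sequence into a fixed-length numeric vector: normalized k-mer frequencies over a fixed alphabet. This is a generic illustration, not one of the descriptors from the package described above; the function name `kmer_vector`, the DNA alphabet, and the handling of unknown symbols are assumptions.

```python
from itertools import product


def kmer_vector(seq, k=2, alphabet="ACGT"):
    """Map a sequence to a vector of k-mer relative frequencies.

    The vector has len(alphabet) ** k entries in lexicographic k-mer order,
    so sequences of different lengths yield vectors of the same size.
    Windows containing symbols outside the alphabet (e.g. 'N') are skipped.
    """
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = [0] * len(kmers)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])
        if j is not None:
            counts[j] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(kmers)
    return [c / total for c in counts]
```

With `k=2` every DNA sequence maps to 16 numbers, so the output can be passed directly to a classifier that expects fixed-size inputs.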

CRISPR-Cas systems are adaptive immune systems in prokaryotes, providing resistance against invading viruses and plasmids. The identification of CRISPR loci is currently a non-standardized, ambiguous process that requires the manual combination of multiple tools. Existing tools detect only parts of CRISPR systems and lack quality control, annotation, and assessment capabilities for the detected CRISPR loci. Our CRISPRloci server provides the first resource for the prediction and assessment of all possible CRISPR loci.

As a consequence of the various genomic sequencing projects, an increasing volume of biological sequence data is being produced. Although machine learning algorithms have been successfully applied to a large number of genomic sequence-related problems, the results are largely affected by the type and number of features extracted. This effect has motivated new algorithms and pipeline proposals, mainly involving feature extraction problems, in which extracting significant discriminatory information from a biological set is challenging.

Motivation: CRISPR-Cas systems are found in most archaeal and many bacterial genomes, providing adaptive immunity against mobile genetic elements in prokaryotes. The CRISPR-Cas systems are encoded by a set of consecutive cas genes, here termed a cassette. The identification of cassette boundaries is key for finding cassettes in the CRISPR research field.

Background: CRISPR-Cas genes are extraordinarily diverse and evolve rapidly when compared to other prokaryotic genes. With the rapid increase in newly sequenced archaeal and bacterial genomes, manual identification of CRISPR-Cas systems is no longer viable. Thus, an automated approach is required for advancing our understanding of the evolution and diversity of these systems and for finding new candidates for genome engineering in eukaryotic models.

Human mobility has a significant impact on several layers of society, from infrastructural planning and economics to the spread of diseases and crime. Representing the system as a complex network, in which nodes are assigned to regions (e.g.

This study reports the empirical analysis of a hyper-heuristic evolutionary algorithm that is capable of automatically designing top-down decision-tree induction algorithms. Top-down decision-tree algorithms are of great importance, considering their ability to provide an intuitive and accurate knowledge representation for classification problems. The automatic design of these algorithms seems timely, given the large literature accumulated over more than 40 years of research in the manual design of decision-tree induction algorithms.

Background: This paper addresses the prediction of the free energy of binding of a drug candidate with the enzyme InhA, which is associated with Mycobacterium tuberculosis. This problem is found within rational drug design, where interactions between drug candidates and target proteins are verified through molecular docking simulations. In this application, it is important not only to correctly predict the free energy of binding, but also to provide a comprehensible model that could be validated by a domain specialist.

Many cellular functions are carried out in specific compartments of the cell. The prediction of the cellular localization of a protein is thus related to its function identification. This paper uses two Machine Learning techniques, Support Vector Machines (SVMs) and Decision Trees, in the prediction of the localization of proteins from three categories of organisms: gram-positive and gram-negative bacteria and fungi.

Novelty detection techniques might be a promising way of dealing with high-dimensional classification problems in Bioinformatics. We present preliminary results of the use of a one-class support vector machine approach to detect novel classes in two Bioinformatics databases. The results are compatible with theory and inspire further investigation.

In this paper, a network of coupled chaotic maps for multi-scale image segmentation is proposed. The time evolutions of chaotic maps that correspond to a pixel cluster are synchronized with one another, while this synchronized evolution is desynchronized with respect to the time evolution of chaotic maps corresponding to other pixel clusters in the same image. The number of pixel clusters is not known in advance, and the adaptive pixel moving technique introduced in the model makes it robust enough to classify ambiguous pixels.
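
The synchronization idea can be illustrated with a toy model. The sketch below is not the paper's network: it assumes logistic maps, a simple mean-field coupling restricted to each cluster, and cluster memberships that are given up front (the adaptive pixel moving technique is not reproduced). The function names and the coupling strength `eps` are illustrative assumptions.

```python
def logistic(x):
    """Fully chaotic logistic map on [0, 1]."""
    return 4.0 * x * (1.0 - x)


def step(states, clusters, eps):
    """One update: each map is pulled toward the mean of its own cluster.

    Maps in different clusters are not coupled at all in this sketch;
    maps that belong to no cluster are left unchanged.
    """
    mapped = [logistic(x) for x in states]
    new_states = list(states)
    for members in clusters:
        mean = sum(mapped[i] for i in members) / len(members)
        for i in members:
            new_states[i] = (1.0 - eps) * mapped[i] + eps * mean
    return new_states


def simulate(states, clusters, eps=0.8, steps=200):
    """Iterate the coupled maps for a fixed number of steps."""
    for _ in range(steps):
        states = step(states, clusters, eps)
    return states


def spread(states, members):
    """Largest gap between the states of the maps in one cluster."""
    values = [states[i] for i in members]
    return max(values) - min(values)
```

With `eps=0.8`, the difference between two maps in the same cluster shrinks by at least a factor of 0.8 per step, because the logistic map stretches distances by at most 4 and the coupling scales them by `1 - eps = 0.2`. Calling `simulate([0.11, 0.37, 0.52, 0.23, 0.81, 0.64], [[0, 1, 2], [3, 4, 5]])` therefore drives `spread` to essentially zero within each cluster, while each cluster keeps following its own chaotic trajectory.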
