Analysis of differential gene expression from RNA-seq data has become standard in several research areas. The computational analysis involves many data types and file formats, and a wide variety of computational tools that can be applied alone or combined into pipelines. This paper presents a review of the differential expression analysis pipeline, addressing its steps and their respective objectives, the principal methods available in each step, and their properties, thereby providing an organized overview of the field.
This letter points out a conceptual error made by the authors of a published paper, which presents a review and evaluation of computational methods in lncRNA identification. The error was made in the execution of the BASiNET method when considering an example file (toy model) made available by the authors with the aim of showing how a classification model could be stored in a file for later use. In this letter, this error is contextualized, the correct use of the BASiNET method is pointed out, and the results of its correct execution on one of the datasets used in the review article are presented.
The growth and popularization of platforms for scientific production have been the subject of several studies, producing relevant analyses of co-authorship behavior among groups of researchers. Researchers and their scientific productions can be analysed as co-authorship social networks, in which researchers are linked through common publications. In this context, co-authorship networks can be analysed to find patterns that describe or characterize them.
This chapter provides two main contributions: (1) a description of computational tools and databases used to identify and analyze transposable elements (TEs) and circRNAs in plants; and (2) data analysis on public TE and circRNA data. Our goal is to highlight the primary information available in the literature on circular noncoding RNAs and transposable elements in plants. The exploratory analysis performed on publicly available circRNA and TE data helps discuss four sequence features.
Bradyrhizobium diazoefficiens CPAC 7 and Bradyrhizobium japonicum CPAC 15 are broadly used in commercial inoculants in Brazil, contributing to most of the nitrogen required by the soybean crop. These strains differ in their symbiotic properties: CPAC 7 is more efficient in fixing nitrogen, whereas CPAC 15 is more competitive. Comparative genomics revealed many transposases close to genes associated with symbiosis in the symbiotic island of these strains.
With the emergence of Next Generation Sequencing (NGS) technologies, a large volume of sequence data, in particular from de novo sequencing, has been rapidly produced at relatively low cost. In this context, computational tools are increasingly important to assist in the identification of relevant information to understand the functioning of organisms. This work introduces BASiNET, an alignment-free tool for classifying biological sequences based on feature extraction from complex network measurements.
The correct identification of differentially expressed genes (DEGs) between specific conditions is key to understanding phenotypic variation. High-throughput transcriptome sequencing (RNA-Seq) has become the main option for these studies. Thus, the number of methods and software tools for differential expression analysis from RNA-Seq data has also increased rapidly.
Recently, there has been an increase in the number of whole bacterial genomes sequenced, mainly due to advances in next-generation sequencing technologies. In view of this, there is a need to provide new analytical alternatives that can keep pace with this advance. Given our current knowledge about the genomic plasticity of bacteria, and that plastic genomic regions can uncover important features of these microorganisms, our goal was to develop a fast methodology based on maximum entropy (ME) to guide the researcher to regions that could be prioritized during the analysis.
Background: The inference of gene regulatory networks (GRNs) from large-scale expression profiles is one of the most challenging problems of Systems Biology nowadays. Many techniques and models have been proposed for this task. However, it is not generally possible to recover the original topology with great accuracy, mainly because the time series data are short relative to the high complexity of the networks and because of the intrinsic noise of the expression measurements.
Background: Feature selection is a pattern recognition approach to choose important variables according to some criteria in order to distinguish or explain certain phenomena (i.e., for dimensionality reduction).