Inverted repeats are repetitive elements that can form hairpin and cruciform structures. They are linked to genomic instability; however, they also have various biological functions. Their distribution differs markedly across taxonomic groups in the tree of life, and they exhibit high polymorphism due to their inherent genomic instability.
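As a rough illustration of the definition above, an inverted repeat can be located by looking for a short arm followed, after an optional spacer, by its reverse complement. The sketch below is a naive scan, not the method used in the article; the function names and the `arm_len` and `max_spacer` parameters are hypothetical, and the input is assumed to be an uppercase A/C/G/T string.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    comp = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(comp[base] for base in reversed(seq))


def find_inverted_repeats(seq, arm_len=4, max_spacer=3):
    """Naive scan: report (start, arm, spacer_len) whenever the arm starting
    at `start` is followed, after `spacer_len` bases, by its reverse complement.
    Parameter names and defaults are illustrative assumptions."""
    hits = []
    for i in range(len(seq) - 2 * arm_len + 1):
        arm = seq[i:i + arm_len]
        rc = reverse_complement(arm)
        for spacer in range(max_spacer + 1):
            j = i + arm_len + spacer
            # A slice running past the end is shorter than rc and never matches.
            if seq[j:j + arm_len] == rc:
                hits.append((i, arm, spacer))
    return hits


if __name__ == "__main__":
    # GGA followed directly by TCC (its reverse complement) is a perfect
    # inverted repeat with no spacer.
    print(find_inverted_repeats("GGATCCAAA", arm_len=3, max_spacer=0))
```

The scan is quadratic in the worst case and is meant only to make the structure concrete; genome-scale searches would need indexing or a dedicated tool.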
Motivation: Genome and Proteome Alignments, represented by the Multiple Alignment File (MAF) format, have become a standard approach in the field of comparative genomics and proteomics. However, current approaches lack a direct method for motif detection within MAF files. To address this gap, we present MAFin, a novel tool that enables efficient motif detection and conservation analysis in MAF files, streamlining genomic and proteomic research.
Regulatory elements (enhancers) are major drivers of gene expression in mammals and harbor many genetic variants associated with human diseases. Here, we present an updated VISTA Enhancer Browser (https://enhancer.lbl.
The process of navigating through the landscape of biomedical literature and performing searches or combining them with bioinformatics analyses can be daunting, considering the exponential growth of scientific corpora and the plethora of tools designed to mine PubMed® and related repositories. Herein, we present BioTextQuest v2.0, a tool for biomedical literature mining.
Comput Struct Biotechnol J
December 2024
The rapid progression of genomics and proteomics has been driven by the advent of advanced sequencing technologies, large, diverse, and readily available omics datasets, and the evolution of computational data processing capabilities. The vast amount of data generated by these advancements necessitates efficient algorithms to extract meaningful information. K-mers serve as a valuable tool when working with large sequencing datasets, offering several advantages in computational speed and memory efficiency and carrying the potential for intrinsic biological functionality.
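To make the k-mer idea concrete, the following minimal sketch counts every overlapping substring of length k in a sequence. It is a generic illustration using the Python standard library, not code from the article; the function name `count_kmers` is hypothetical.

```python
from collections import Counter


def count_kmers(seq, k):
    """Count all overlapping substrings of length k in seq.
    Returns an empty Counter when k exceeds len(seq)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


if __name__ == "__main__":
    counts = count_kmers("ACGACG", 3)
    # The windows are ACG, CGA, GAC, ACG.
    print(counts.most_common())
```

A hash-based counter like this keeps one entry per distinct k-mer, which is why k-mer methods scale with sequence diversity rather than with pairwise comparisons.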
Comput Struct Biotechnol J
December 2024
The fields of Metagenomics and Metatranscriptomics involve the examination of complete nucleotide sequences, gene identification, and analysis of potential biological functions within diverse organisms or environmental samples. Despite the vast opportunities for discovery in metagenomics, the sheer volume and complexity of sequence data often present challenges in processing, analysis, and visualization. This article highlights the critical role of advanced visualization tools in enabling effective exploration, querying, and analysis of these complex datasets.
The decrease in sequencing expenses has facilitated the creation of reference genomes and proteomes for an expanding array of organisms. Nevertheless, no established repository that details organism-specific genomic and proteomic sequences of specific lengths, referred to as kmers, exists to our knowledge. In this article, we present kmerDB, a database accessible through an interactive web interface that provides kmer-based information from genomic and proteomic sequences in a systematic way.
Massively parallel reporter assays (MPRAs) represent a set of high-throughput technologies that measure the functional effects of thousands of sequences/variants on gene regulatory activity. There are several different variations of MPRA technology and they are used for numerous applications, including regulatory element discovery, variant effect measurement, saturation mutagenesis, synthetic regulatory element generation or characterization of evolutionary gene regulatory differences. Despite their many designs and uses, there is no comprehensive database that incorporates the results of these experiments.
Analysis and interpretation of high-throughput transcriptional and chromatin accessibility data at single-cell (sc) resolution are still open challenges in the biomedical field. The existence of countless bioinformatics tools for the different analytical steps increases the complexity of data interpretation and the difficulty of deriving biological insights. In this article, we present SCALA, a bioinformatics tool for analysis and visualization of single-cell RNA sequencing (scRNA-seq) and Assay for Transposase-Accessible Chromatin using sequencing (scATAC-seq) datasets, enabling either independent or integrative analysis of the two modalities.
Metagenomes encode an enormous diversity of proteins, reflecting a multiplicity of functions and activities. Exploration of this vast sequence space has been limited to a comparative analysis against reference microbial genomes and protein families derived from those genomes. Here, to examine the scale of yet untapped functional diversity beyond what is currently possible through the lens of reference genomes, we develop a computational approach to generate reference-free protein families from the sequence space in metagenomes.
The Novel Metagenome Protein Families Database (NMPFamsDB) is a database of metagenome- and metatranscriptome-derived protein families, whose members have no hits to proteins of reference genomes or Pfam domains. Each protein family is accompanied by multiple sequence alignments, Hidden Markov Models, taxonomic information, ecosystem and geolocation metadata, sequence and structure predictions, as well as 3D structure models predicted with AlphaFold2. In its current version, NMPFamsDB hosts over 100 000 protein families, each with at least 100 members.
Functional enrichment is the process of identifying implicated functional terms from a given input list of genes or proteins. In this article, we present Flame (v2.0), a web tool which offers a combinatorial approach through merging and visualizing results from widely used functional enrichment applications while also allowing various flexible input options.
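For readers unfamiliar with functional enrichment, one common formulation is an over-representation test based on the hypergeometric distribution. The sketch below shows that generic calculation only; it is not a description of how Flame or the applications it merges compute their results, and the function and parameter names are hypothetical.

```python
from math import comb


def enrichment_p_value(population, annotated, sample, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    population: number of background genes
    annotated:  background genes carrying the functional term
    sample:     size of the input gene list
    overlap:    input genes carrying the term
    All names are illustrative assumptions.
    """
    total = comb(population, sample)
    upper = min(annotated, sample)
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, sample - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / total


if __name__ == "__main__":
    # 10 background genes, 5 annotated; all 5 input genes carry the term.
    print(enrichment_p_value(10, 5, 5, 5))
```

In practice such p-values are computed for many terms at once and then corrected for multiple testing, which this sketch does not do.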
NAR Genom Bioinform
June 2023
Arena3D is an interactive web tool that visualizes multi-layered networks in 3D space. In this update, Arena3D supports directed networks as well as up to nine different types of connections between pairs of nodes with the use of Bézier curves. It comes with different color schemes (light/gray/dark mode), custom channel coloring, four node clustering algorithms which one can run on-the-fly, visualization in VR mode and predefined layer layouts (zig-zag, star and cube).
Front Bioeng Biotechnol
March 2023
[This corrects the article DOI: 10.3389/fbioe.2020.
Metagenomics has enabled accessing the genetic repertoire of natural microbial communities. Metagenome shotgun sequencing has become the method of choice for studying and classifying microorganisms from various environments. To this end, several methods have been developed to process and analyze the sequence data from raw reads to end-products such as predicted protein sequences or families.
Undeniably, biological age can significantly differ between individuals of similar chronological age. Longitudinal, deep multi-omic profiling has recently enabled the identification of individuals with distinct aging phenotypes, termed 'ageotypes'. This effort has provided a plethora of data and new insights into the diverse molecular mechanisms presumed to drive aging.
Motivation: Network biology is a dominant player in today's multi-omics era. Therefore, the need for visualization tools which can efficiently cope with intra-network heterogeneity emerges.
Results: NORMA-2.
Rheumatoid arthritis (RA) is characterized by autoimmune joint destruction with debilitating consequences. Despite treatment advancements with biologic therapies, a significant proportion of RA patients show an inadequate clinical response, and restoration of immune self-tolerance represents an unmet therapeutic need. We have previously described a tolerogenic phenotype of plasmacytoid dendritic cells (pDCs) in RA patients responding to anti-TNF-α agents.
Protein-protein interactions (PPIs) are of key importance for understanding how cells and organisms function. Thus, in recent decades, many approaches have been developed for the identification and discovery of such interactions. These approaches addressed the problem of PPI identification from either an experimental or a computational point of view.
Available drugs have been used as an urgent attempt through clinical trials to minimize severe cases of hospitalizations with Coronavirus disease (COVID-19); however, there are limited data on common pharmacogenomics affecting concomitant medication response in patients with comorbidities. To identify the genomic determinants that influence COVID-19 susceptibility, we use a computational, statistical, and network biology approach to analyze relationships of ineffective concomitant medication with an adverse effect on patients. We statistically construct a pharmacogenetic/biomarker network with significant drug-gene interactions originating from gene-disease associations.
Whole Exome Sequencing (WES) is used for querying DNA variants using the protein coding parts of genomes (exomes). However, WES analysis can be challenging because of the complexity of the data. Here, we describe a consolidated protocol for unbiased WES analysis.
Honeybees (Apis mellifera) continue to succumb to human and environmental pressures despite their crucial role in providing essential ecosystem services. Owing to their foraging and honey production activities, honeybees form complex relationships with species across all domains, such as plants, viruses, bacteria and other hive pests, making honey a valuable biomonitoring tool for assessing their ecological niche. Thus, the application of honey shotgun metagenomics (SM) has paved the way for a detailed description of the species honeybees interact with.
Finding, exploring and filtering frequent sentence-based associations between a disease and a biomedical entity, co-mentioned in disease-related PubMed literature, is a challenge as the volume of publications increases. Darling is a web application that uses Named Entity Recognition to identify human-related biomedical terms in PubMed articles, mentioned in OMIM, DisGeNET and Human Phenotype Ontology (HPO) disease records, and generates an interactive biomedical entity association network. Nodes in this network represent genes, proteins, chemicals, functions, tissues, diseases, environments and phenotypes.
To elucidate ecosystem functioning, it is fundamental to recognize what processes occur in which environments (where) and which microorganisms carry them out (who). Here, we present PREGO, a one-stop-shop knowledge base providing such associations. PREGO combines text mining and data integration techniques to mine such what-where-who associations from data and metadata scattered in the scientific literature and in public omics repositories.
The reasons behind the clinical variability of SARS-CoV-2 infection, ranging from asymptomatic infection to lethal disease, are still unclear. We performed genome-wide transcriptional whole-blood RNA sequencing, bioinformatics analysis and PCR validation to test the hypothesis that immune response-related gene signatures reflecting baseline may differ between healthy individuals, with an equally robust antibody response, who experienced an entirely asymptomatic (n=17) versus clinical SARS-CoV-2 infection (n=15) in the past months (mean of 14 weeks). Among 12.