The process of navigating through the landscape of biomedical literature and performing searches or combining them with bioinformatics analyses can be daunting, considering the exponential growth of scientific corpora and the plethora of tools designed to mine PubMed® and related repositories. Herein, we present BioTextQuest v2.0, a tool for biomedical literature mining.
Summary: We explore the nuanced temporal and epistemological distinctions among natural sciences, particularly the contrasting treatment of time and the interplay between theory and experimentation. Physics, an exemplar of mature science, relies on theoretical models for predictability and simulations. In contrast, biology, traditionally experimental, is witnessing a computational surge, with data analytics and simulations reshaping its research paradigms.
Over the past quarter-century, the field of evolutionary biology has been transformed by the emergence of complete genome sequences and the conceptual framework known as the 'Net of Life.' This paradigm shift challenges traditional notions of evolution as a tree-like process, emphasizing the complex, interconnected network of gene flow that may blur the boundaries between distinct lineages. In this context, gene loss, rather than horizontal gene transfer, is the primary driver of gene content, with vertical inheritance playing a principal role.
Public-domain availability for bioinformatics software resources is a key requirement that ensures long-term permanence and methodological reproducibility for research and development across the life sciences. These issues are particularly critical for widely used, efficient, and well-proven methods, especially those developed in research settings that often face funding discontinuities. We re-launch a range of established software components for computational genomics, as legacy version 1.
Metagenomes encode an enormous diversity of proteins, reflecting a multiplicity of functions and activities. Exploration of this vast sequence space has been limited to a comparative analysis against reference microbial genomes and protein families derived from those genomes. Here, to examine the scale of yet untapped functional diversity beyond what is currently possible through the lens of reference genomes, we develop a computational approach to generate reference-free protein families from the sequence space in metagenomes.
The evolutionary rate of disordered protein regions varies greatly due to the lack of structural constraints. So far, few studies have investigated the presence/absence patterns of compositional bias, indicative of disorder, across phylogenies in conjunction with human disease. In this study, we report a genome-wide analysis of compositional bias association with disease in human proteins and their taxonomic distribution.
Selenium-binding proteins represent a ubiquitous protein family, and SBP1 was recently described as a new stress response regulator in plants. SBP1 has been characterized as a methanethiol oxidase; however, its exact role remains unclear. Moreover, in mammals, it is involved in the regulation of anti-carcinogenic growth and progression as well as reduction/oxidation modulation and detoxification.
The nuclear pore complex exhibits different manifestations across eukaryotes, with certain components being restricted to specific clades. Several studies have been conducted to delineate the nuclear pore complex composition in various model organisms. Due to its pivotal role in cell viability, traditional lab experiments, such as gene knockdowns, can prove inconclusive and need to be complemented by a high-quality computational process.
The MetaSUB Consortium, founded in 2015, is a global consortium with an interdisciplinary team of clinicians, scientists, bioinformaticians, engineers, and designers, with members from more than 100 countries across the globe. This network has continually collected samples from urban and rural sites including subways and transit systems, sewage systems, hospitals, and other environmental sampling. These collections have been ongoing since 2015 and have continued when possible, even throughout the COVID-19 pandemic.
Quantitative or qualitative differences in immunity may drive clinical severity in COVID-19. Although longitudinal studies to record the course of immunological changes are ample, they do not necessarily predict clinical progression at the time of hospital admission. Here we show, by a machine learning approach using serum pro-inflammatory, anti-inflammatory and anti-viral cytokine and anti-SARS-CoV-2 antibody measurements as input data, that COVID-19 patients cluster into three distinct immune phenotype groups.
The Database of Intrinsically Disordered Proteins (DisProt, URL: https://disprot.org) is the major repository of manually curated annotations of intrinsically disordered proteins and regions from the literature. We report here recent updates of DisProt version 9, including a restyled web interface, refactored Intrinsically Disordered Proteins Ontology (IDPO), improvements in the curation process and significant content growth of around 30%.
In urban ecosystems, microbes play a key role in maintaining major ecological functions that directly support human health and city life. However, knowledge about the species composition and functions involved in urban environments is still limited, largely because reference genomes are lacking, which leaves more than half of the reads in metagenomic studies unclassified. Here we uncovered 732 novel bacterial species from 4728 samples collected by the MetaSUB Consortium from various common surfaces, with matching materials, in mass transit systems across 60 cities.
The Covid-19 pandemic has required nonpharmaceutical interventions, primarily physical distancing, personal hygiene and face mask use, to limit community transmission, irrespective of seasons. In fact, the seasonality attributes of this pandemic remain one of its biggest unknowns. Early studies based on past experience from respiratory diseases focused on temperature or humidity, with disappointing results.
Ribosomal genes produce the constituents of the ribosome, one of the most conserved subcellular structures of all cells, from bacteria to eukaryotes, including animals. Some protein-coding ribosomal genes are thought to vary in their roles across species, particularly in vertebrates, as suggested by the involvement of some in a number of genetic diseases. Based on extensive sequence comparisons and systematic curation, we establish a reference set for ribosomal proteins (RPs) in eleven vertebrate species and quantify their sequence conservation levels.
Orf8, one of the most puzzling genes in the SARS lineage of coronaviruses, marks a unique and striking difference in genome organization between SARS-CoV-2 and SARS-CoV-1. Here, using sequence comparisons, we unequivocally reveal the distant sequence similarities of SARS-CoV-2 Orf8 to its SARS-CoV-1 counterparts and to the X4-like genes of coronaviruses, including its highly divergent "paralog" gene Orf7a, whose product is a potential immune antagonist of known structure. Supervised sequence space walks unravel identity levels that drop below 10% and yet exhibit subtle conservation patterns in this novel superfamily, characterized by an immunoglobulin-like beta sandwich topology.
Comput Struct Biotechnol J
December 2020
The genome of SARS-CoV-2, the coronavirus responsible for the Covid-19 pandemic, encodes a number of accessory genes. The longest accessory gene, Orf3a, plays important roles in the virus lifecycle, as indicated by experimental findings, known polymorphisms, its evolutionary trajectory and a distinct three-dimensional fold. Here we show that supervised, sensitive database searches with Orf3a detect weak, yet significant and highly specific similarities to the M proteins of coronaviruses.
In the context of the COVID-19 pandemic, we describe here the singular metabolic background that constrains enveloped RNA viruses to evolve toward likely attenuation in the long term, possibly after a step of increased pathogenicity. Cytidine triphosphate (CTP) is at the crossroads of the processes allowing SARS-CoV-2 to multiply, because CTP is in demand for four essential metabolic steps. It is a building block of the virus genome, it is required for synthesis of the cytosine-based liponucleotide precursors of the viral envelope, it is a critical building block in the synthesis of host transfer RNAs, and it is required for synthesis of dolichol-phosphate, a precursor of viral protein glycosylation.
As high-throughput approaches in biological and biomedical research are transforming the life sciences into information-driven disciplines, modern analytics platforms for big data have started to address the needs for efficient and systematic data analysis and interpretation. We observe that radiobiology is following this general trend, with -omics information providing unparalleled depth into the biomolecular mechanisms of radiation response, an approach defined as systems radiobiology. We outline the design of computational frameworks and discuss the analysis of big data in low-dose ionizing radiation (LDIR) responses of the mammalian brain.
As genome sequencing efforts are unveiling the genetic diversity of the biosphere with an unprecedented speed, there is a need to accurately describe the structural and functional properties of groups of extant species whose genomes have been sequenced, as well as their inferred ancestors, at any given taxonomic level of their phylogeny. Elaborate approaches for the reconstruction of ancestral states at the sequence level have been developed, subsequently augmented by methods based on gene content. While these approaches of sequence or gene-content reconstruction have been successfully deployed, there has been less progress on the explicit inference of functional properties of ancestral genomes, in terms of metabolic pathways and other cellular processes.
University accountants and administrators should support scientists going to meetings, not further burden them with bureaucratic hurdles, expense claims or unnecessary auditing.
Intrinsically disordered proteins (IDPs) and intrinsically disordered regions (IDRs) are now recognised as major determinants in cellular regulation. This white paper presents a roadmap for future e-infrastructure developments in the field of IDP research within the ELIXIR framework. The goal of these developments is to drive the creation of high-quality tools and resources to support the identification, analysis and functional characterisation of IDPs.
The Database of Protein Disorder (DisProt, URL: https://disprot.org) provides manually curated annotations of intrinsically disordered proteins from the literature. Here we report recent developments with DisProt (version 8), including the doubling of protein entries, a new disorder ontology, improvements of the annotation format and a completely new website.