Bioinformatics
March 2024
Summary: The Integrated Database of Small Molecules (IDSM) integrates data from small-molecule datasets, making them accessible through the SPARQL query language. Its unique feature is the ability to search for compounds through SPARQL based on their molecular structure. We extended IDSM to enable mass spectra databases to be integrated and searched for based on mass spectrum similarity.
View Article and Find Full Text PDFCertain peptide sequences, some of them as short as amino acid triplets, are significantly overpopulated in specific secondary structure motifs in folded protein structures. For example, 74% of the EAM triplet is found in α-helices, and only 3% occurs in the extended parts of proteins (typically β-sheets). In contrast, other triplets (such as VIV and IYI) appear almost exclusively in extended parts (79% and 69%, respectively).
View Article and Find Full Text PDFCurrent biological and chemical research is increasingly dependent on the reusability of previously acquired data, which typically come from various sources. Consequently, there is a growing need for database systems and databases stored in them to be interoperable with each other. One of the possible solutions to address this issue is to use systems based on Semantic Web technologies, namely on the Resource Description Framework (RDF) to express data and on the SPARQL query language to retrieve the data.
View Article and Find Full Text PDFComput Struct Biotechnol J
November 2022
Protein tunnels are essential in transporting small molecules into the active sites of enzymes. Tunnels' geometrical and physico-chemical properties influence the transport process. The tunnels are attractive hot spots for protein engineering and drug development.
View Article and Find Full Text PDFContemporary bioinformatic and chemoinformatic capabilities hold promise to reshape knowledge management, analysis and interpretation of data in natural products research. Currently, reliance on a disparate set of non-standardized, insular, and specialized databases presents a series of challenges for data access, both within the discipline and for integration and interoperability between related fields. The fundamental elements of exchange are referenced structure-organism pairs that establish relationships between distinct molecular structures and the living organisms from which they were identified.
View Article and Find Full Text PDFInteractions among amino acid residues are the principal contributor to the stability of the three-dimensional structure of a protein. The Amino Acid Interactions (INTAA) web server (https://bioinfo.uochb.
View Article and Find Full Text PDFThe Resource Description Framework (RDF), together with well-defined ontologies, significantly increases data interoperability and usability. The SPARQL query language was introduced to retrieve requested RDF data and to explore links between them. Among other useful features, SPARQL supports federated queries that combine multiple independent data source endpoints.
View Article and Find Full Text PDFMotivation: The existing connections between large databases of chemicals, proteins, metabolites and assays offer valuable resources for research in fields ranging from drug design to metabolomics. Transparent search across multiple databases provides a way to efficiently utilize these resources. To simplify such searches, many databases have adopted semantic technologies that allow interoperable querying of the datasets using SPARQL query language.
View Article and Find Full Text PDFBy combining bioinformatics with quantum-chemical calculations, we attempt to address quantitatively some of the physical principles underlying protein folding. The former allowed us to identify tripeptide sequences in existing protein three-dimensional structures with a strong preference for either helical or extended structure. The selected representatives of pro-helical and pro-extended sequences were converted into "isolated" tripeptides-capped at N- and C-termini-and these were subjected to an extensive conformational sampling and geometry optimization (typically thousands to tens of thousands of conformers for each tripeptide).
View Article and Find Full Text PDFBackground: Structure search is one of the valuable capabilities of small-molecule databases. Fingerprint-based screening methods are usually employed to enhance the search performance by reducing the number of calls to the verification procedure. In substructure search, fingerprints are designed to capture important structural aspects of the molecule to aid the decision about whether the molecule contains a given substructure.
View Article and Find Full Text PDFLarge biomolecules-proteins and nucleic acids-are composed of building blocks which define their identity, properties and binding capabilities. In order to shed light on the energetic side of interactions of amino acids between themselves and with deoxyribonucleotides, we present the Amino Acid Interaction web server (http://bioinfo.uochb.
View Article and Find Full Text PDFBackground: In recent years, the Resource Description Framework (RDF) and the SPARQL query language have become more widely used in the area of cheminformatics and bioinformatics databases. These technologies allow better interoperability of various data sources and powerful searching facilities. However, we identified several deficiencies that make usage of such RDF databases restrictive or challenging for common users.
View Article and Find Full Text PDFCancer is after cardiovascular disease the most frequent cause of death in Europe. In 28 of 53 countries considered in this area it is already the leading cause of death and expected to gain even more importance until the year 2020. Amongst the large arsenal of different anti-cancer drugs, platinum drugs belong to the first developed anticancer drugs and still have a large impact on cancer therapy.
View Article and Find Full Text PDFBackground: There are many databases of small molecules focused on different aspects of research and its applications. Some tasks may require integration of information from various databases. However, determining which entries from different databases represent the same compound is not straightforward.
View Article and Find Full Text PDFBackground: Similarity search in protein databases is one of the most essential issues in computational proteomics. With the growing number of experimentally resolved protein structures, the focus shifted from sequences to structures. The area of structure similarity forms a big challenge since even no standard definition of optimal structure similarity exists in the field.
View Article and Find Full Text PDF