Reactome is a free, open-source, open-data, curated and peer-reviewed knowledgebase of biomolecular pathways. One of its main priorities is to provide easy and efficient access to its high-quality curated data. At present, biological pathway databases typically store their contents in relational databases. This limits access efficiency because queries that traverse highly interconnected data perform poorly. The same data in a graph database can be queried more efficiently. Here we present the rationale behind the adoption of a graph database (Neo4j) as well as the new ContentService (REST API) that provides access to these data. The Neo4j graph database and its query language, Cypher, provide efficient access to the complex Reactome data model, facilitating easy traversal and knowledge discovery. The adoption of this technology greatly improved query efficiency, reducing the average query time by 93%. The web service built on top of the graph database provides programmatic access to Reactome data through object-oriented queries, and also supports more complex queries that take advantage of the new underlying graph-based data storage. By adopting graph database technology we are providing a high-performance pathway data resource to the community. The Reactome graph database use case shows the power of NoSQL database engines for complex biological data types.
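To make the traversal argument concrete, here is a minimal sketch of a graph-style query. It is not Reactome's implementation: it uses a toy in-memory adjacency map, the identifiers are hypothetical placeholders, and the parent-to-child relationship is only assumed to resemble a `hasEvent`-style edge between pathways and reactions. The Cypher pattern in the comment is likewise an assumed illustration of how a variable-length relationship could express the same traversal.

```python
from collections import deque

# Toy graph: parent event -> child events.
# Identifiers are hypothetical placeholders, not real Reactome identifiers.
HAS_EVENT = {
    "pathway:A": ["pathway:B", "reaction:1"],
    "pathway:B": ["reaction:2", "reaction:3"],
    "pathway:C": ["reaction:3"],
}


def descendant_reactions(root, edges):
    """Return a sorted list of all reactions reachable from `root`."""
    seen = {root}
    queue = deque([root])
    reactions = set()
    while queue:
        node = queue.popleft()
        for child in edges.get(node, []):
            if child in seen:
                continue  # already visited via another path
            seen.add(child)
            if child.startswith("reaction:"):
                reactions.add(child)
            else:
                queue.append(child)  # sub-pathway: keep descending
    return sorted(reactions)


# Assumed Cypher analogue (labels and property names are illustrative):
#   MATCH (p:Pathway {stId: $id})-[:hasEvent*]->(r:ReactionLikeEvent)
#   RETURN r

if __name__ == "__main__":
    print(descendant_reactions("pathway:A", HAS_EVENT))
```

Each step of the traversal follows stored edges directly. In a relational schema, the same question would typically need one self-join per hierarchy level or a recursive query, which is the kind of cost the abstract attributes to relational storage.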
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5805351 | PMC |
http://dx.doi.org/10.1371/journal.pcbi.1005968 | DOI Listing |
We introduce a computational topology-based approach with unsupervised machine-learning algorithms to estimate the database size and content of RNA-like graph topologies. Specifically, we apply graph theory enumeration to generate all 110,667 possible 2D dual graphs for vertex numbers ranging from 2 to 9. Among them, only 0.
Protein Sci
February 2025
Department of Physics, University of Washington, Seattle, Washington, USA.
Proteins' flexibility is a feature in communicating changes in cell signaling instigated by binding with secondary messengers, such as calcium ions, associated with the coordination of muscle contraction, neurotransmitter release, and gene expression. When binding with the disordered parts of a protein, calcium ions must balance their charge states with the shape of calcium-binding proteins and their versatile pool of partners depending on the circumstances they transmit. Accurately determining the ionic charges of those ions is essential for understanding their role in such processes.
J Mol Graph Model
January 2025
Amity Institute of Biotechnology, Amity University Uttar Pradesh, Lucknow Campus, Gomtinagar Extension, Lucknow, 226028, India; Research Cell, Amity University Uttar Pradesh, Lucknow Campus, India.
Acinetobacter baumannii is a member of the "ESKAPE" bacteria responsible for many serious multidrug-resistant (MDR) illnesses. This bacterium swiftly adapts to environmental cues, leading to the emergence of multidrug-resistant variants, particularly in hospital/medical settings. In this work, we have demonstrated the outer membrane protein 33-36 (Omp33-36) porin as a potential therapeutic target in A.
Mol Divers
January 2025
Key Laboratory for Macromolecular Science of Shaanxi Province, School of Chemistry and Chemical Engineering, Shaanxi Normal University, Xi'an, 710119, People's Republic of China.
Molecular Property Prediction (MPP) is a fundamental task in important research fields such as chemistry, materials, biology, and medicine, where traditional computational chemistry methods based on quantum mechanics often consume substantial time and computing power. In recent years, machine learning has been increasingly used in computational chemistry, in which graph neural networks have shown good performance in molecular property prediction tasks, but they have some limitations in terms of generalizability, interpretability, and certainty. In order to address the above challenges, a Multiscale Molecular Structural Neural Network (MMSNet) is proposed in this paper, which obtains rich multiscale molecular representations through the information fusion between bonded and non-bonded "message passing" structures at the atomic scale and spatial feature information "encoder-decoder" structures at the molecular scale; a multi-level attention mechanism is introduced on the basis of theoretical analysis of molecular mechanics in order to enhance the model's interpretability; the prediction results of MMSNet are used as label values and clustered in the molecular library by the K-NN (K-Nearest Neighbors) algorithm to reverse match the spatial structure of the molecules, and the certainty of the model is quantified by comparing virtual screening results across different K-values.
Molecules
January 2025
GSK Carbon Neutral Laboratories for Sustainable Chemistry, Jubilee Campus, University of Nottingham, Triumph Road, Nottingham NG7 2TU, UK.
The range of chemical databases available has dramatically increased in recent years, but the reliability and quality of their data are often negatively affected by human-error fidelity. The size of chemical databases can make manual data curation/checking of such sets time consuming; thus, automated tools to help this process are highly desirable. Herein, we propose the use of Graph Neural Networks (GNNs) to identify potential stereochemical misassignments in the primary asymmetric catalysis literature.