Background: Biological databases and repositories have grown in diversity and complexity over the years. This rapid expansion of existing and new sources of biological knowledge raises serious problems of data accessibility and integration. CellBase was created as an integrative solution to this growing need for unification. It provides a centralized NoSQL database containing biological information from diverse, heterogeneous sources. This information is accessed through a RESTful web service API, which provides an efficient interface to the data.
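
To make the idea of a RESTful interface concrete, the following is a minimal sketch of how a query URL for such a service might be assembled. The host name, the version string, and the path layout (species, category, subcategory, identifiers, resource) are assumptions made for illustration and are not taken from the abstract; the CellBase documentation should be consulted for the actual endpoint scheme.

```python
from urllib.parse import quote, urlencode

# Assumed placeholders, not stated in the abstract: replace with the
# host and API version of the CellBase instance actually being used.
ASSUMED_HOST = "https://cellbase.example.org/webservices/rest"
ASSUMED_VERSION = "v4"


def build_url(species, category, subcategory, ids, resource, options=None):
    """Assemble a REST-style query URL from its path components.

    The path layout used here is a hypothetical illustration of a
    resource-oriented API, not a confirmed CellBase specification.
    """
    # Percent-encode each identifier, then join them with commas so
    # several features can be requested in a single call.
    id_part = ",".join(quote(str(i), safe="") for i in ids)
    url = "/".join(
        [ASSUMED_HOST, ASSUMED_VERSION, species, category, subcategory, id_part, resource]
    )
    if options:
        # Keep commas readable inside option values such as field lists.
        url += "?" + urlencode(options, safe=",")
    return url


url = build_url("hsapiens", "feature", "gene", ["BRCA1"], "info")
print(url)
```

Building the URL is kept separate from sending the request so that the same helper can be reused with any HTTP client.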

Results: In this work we present PyCellBase, a Python package that provides programmatic access to the rich RESTful web service API offered by CellBase. The package offers fast, user-friendly access to biological information without the need to install a local database. In addition, a set of command-line tools is provided to perform common bioinformatics tasks, such as variant annotation. CellBase data are served from a high-availability cluster, and queries have been tuned for real-time performance.

Conclusion: PyCellBase is an open-source Python package that provides efficient access to heterogeneous biological information. It supports tasks that require a comprehensive set of knowledge resources, such as variant annotation. Queries can easily be fine-tuned to retrieve the desired information about particular biological features. PyCellBase offers the convenience of an object-oriented scripting language and makes it possible to integrate the results into other Python applications and pipelines.
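
As an illustration of how query results could be passed on to other Python code, the sketch below reduces a decoded web service response to a list of plain dictionaries holding only the fields of interest. The response shape (a list of entries, each with a "result" list of records), the helper name `select_fields`, and the sample record are all hypothetical and are not taken from the abstract or from the PyCellBase API.

```python
def select_fields(responses, fields):
    """Keep only the requested fields from every record in a response.

    `responses` is assumed to be a list of dicts, each carrying a
    "result" list of records. This shape is an assumption made for
    the sketch, not a documented guarantee.
    """
    rows = []
    for response in responses:
        for record in response.get("result", []):
            # Fields missing from a record are reported as None.
            rows.append({field: record.get(field) for field in fields})
    return rows


# Made-up placeholder data standing in for a decoded JSON response.
mock_responses = [
    {
        "id": "GENE_A",
        "result": [
            {"name": "GENE_A", "chromosome": "1", "biotype": "protein_coding"}
        ],
    }
]

rows = select_fields(mock_responses, ["name", "chromosome"])
print(rows)
```

Rows in this form can be handed directly to tabular tools or written out as delimited text by a downstream pipeline step.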

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6438028
DOI: http://dx.doi.org/10.1186/s12859-019-2726-4
