PyBioMed: a python library for various molecular representations of chemicals, proteins and DNAs and their interactions.

J Cheminform

Xiangya School of Pharmaceutical Sciences, Central South University, No. 172, Tongzipo Road, Yuelu District, Changsha, People's Republic of China.

Published: March 2018

Background: With the rapid development of biotechnology and informatics, publicly available data in chemistry and biology are growing explosively. The wealth of information in these data needs to be extracted and transformed into useful knowledge by data mining methods. Given the rate at which data accumulate in chemistry and biology, new tools that process and interpret large and complex interaction data are increasingly important. So far, no suitable toolkit effectively links the chemical and biological spaces in terms of molecular representation. To further explore these complex data, an integrated toolkit for molecular representation is urgently needed, one that can be easily combined with data mining algorithms to start a full data analysis pipeline.

Results: Herein, the Python library PyBioMed is presented. It comprises functionality for downloading various molecular objects online by their IDs, pretreating molecular structures, and computing molecular descriptors for chemicals, proteins, DNAs and their interactions. PyBioMed is a feature-rich and highly customizable Python library for characterizing complex chemical and biological molecules and interaction samples. The current version of PyBioMed can calculate 775 chemical descriptors and 19 kinds of chemical fingerprints, 9920 protein descriptors based on protein sequences, more than 6000 DNA descriptors from nucleotide sequences, and interaction descriptors from pairwise samples using three different combining strategies. Several examples and five real-life applications are provided to show users how to use PyBioMed as an integral part of data analysis projects. With PyBioMed, users can conveniently run a full pipeline from obtaining molecular data and pretreating molecules, through molecular representation, to constructing machine learning models.
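To make the idea of sequence-based descriptors and pairwise interaction descriptors concrete, the following is a minimal sketch in plain Python. It does not use PyBioMed and does not reproduce its API: every function name here is hypothetical, and amino acid composition and k-mer frequency are used only as simple stand-ins for the much larger descriptor sets mentioned above. The abstract does not say which three combining strategies PyBioMed uses, so the three shown here (concatenation, outer product, and element-wise sum plus product) are assumptions chosen because they are common ways to merge two feature vectors.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def amino_acid_composition(sequence):
    """Fraction of each standard amino acid in a protein sequence (20 values)."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty protein sequence")
    counts = Counter(seq)
    return [counts[aa] / len(seq) for aa in AMINO_ACIDS]


def kmer_frequencies(sequence, k=2):
    """Normalized frequency of every DNA k-mer over overlapping windows (4**k values)."""
    seq = sequence.upper()
    if len(seq) < k:
        raise ValueError("sequence is shorter than k")
    windows = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    counts = Counter(windows)
    return [counts["".join(p)] / len(windows) for p in product("ACGT", repeat=k)]


# Three hypothetical ways to combine two descriptor vectors into one
# interaction descriptor (assumed for illustration, not taken from PyBioMed).

def combine_concatenate(a, b):
    """Join the two vectors end to end: length len(a) + len(b)."""
    return list(a) + list(b)


def combine_outer_product(a, b):
    """All pairwise products a[i] * b[j]: length len(a) * len(b)."""
    return [x * y for x in a for y in b]


def combine_sum_and_product(a, b):
    """Element-wise sums followed by element-wise products (equal lengths only)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return [x + y for x, y in zip(a, b)] + [x * y for x, y in zip(a, b)]


if __name__ == "__main__":
    protein = amino_acid_composition("MKTAYIAKQR")
    dna = kmer_frequencies("ACGTACGT", k=2)
    pair = combine_concatenate(protein, dna)
    print(len(protein), len(dna), len(pair))  # 20 16 36
```

Concatenation keeps the two molecules' features separate and works for vectors of any length, whereas the outer product captures cross-terms at the cost of a much longer vector. The sum-and-product form is symmetric in its two inputs, which can suit pairs of the same molecule type.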

Conclusion: PyBioMed provides user-friendly and highly customizable APIs for conveniently calculating features of biological molecules and complex interaction samples. It aims to support integrated analysis pipelines that run from data acquisition and data checking through descriptor calculation to modeling. PyBioMed is freely available at http://projects.scbdd.com/pybiomed.html .
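The data checking step named in the conclusion can be illustrated with a small sketch. This is not PyBioMed code; the function names are hypothetical, and the rule applied (a sequence is accepted only if it is non-empty and uses only characters from an allowed alphabet) is an assumption about what a basic check might look like before descriptors are calculated.

```python
def check_sequence(sequence, alphabet):
    """Strip whitespace, upper-case, and list any characters outside the alphabet."""
    cleaned = "".join(sequence.split()).upper()
    invalid = sorted(set(cleaned) - set(alphabet))
    return cleaned, invalid


def filter_valid(sequences, alphabet):
    """Split sequences into cleaned valid ones and rejected (original, invalid chars) pairs."""
    valid, rejected = [], []
    for raw in sequences:
        cleaned, invalid = check_sequence(raw, alphabet)
        if cleaned and not invalid:
            valid.append(cleaned)
        else:
            rejected.append((raw, invalid))
    return valid, rejected


if __name__ == "__main__":
    valid, rejected = filter_valid(["acgt", "ACGN", " "], "ACGT")
    print(valid)     # ['ACGT']
    print(rejected)  # [('ACGN', ['N']), (' ', [])]
```

Keeping the rejected entries together with the offending characters makes it easier to decide whether to repair or discard a record before it reaches descriptor calculation.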

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5861255
DOI: http://dx.doi.org/10.1186/s13321-018-0270-2

Similar Publications

Summary: Time-lapse 3D imaging is fundamental for studying biological processes but requires software able to handle terabytes of voxel data. Although many multidimensional viewing applications exist, they mostly lack support for heterogeneous voxel counts, datatypes, and modalities in a single timeline. Open Chrono-Morph Viewer provides a straightforward graphical user interface to quickly investigate multi-timescale datasets represented as separate volume files in the common NRRD format for compatibility between toolchains.

Hypertension is one of the most important chronic diseases worldwide. It is encountered frequently in daily practice and accounts for a significant share of the services that healthcare professionals provide in Primary Health Care (PHC). It is a precursor to many critical illnesses and can be fatal if not addressed promptly.

Epilepsy is a prevalent neurological disorder characterized by seizures that significantly impact individuals and their social environments. Given the unpredictable nature of epileptic seizures, developing automated epilepsy diagnosis systems is increasingly important. Epilepsy diagnosis traditionally relies on analyzing EEG signals, with recent deep learning methods gaining prominence due to their ability to bypass manual feature extraction.

ModeHunter is a modular Python software package for the simulation of 3D biophysical motion across spatial resolution scales using modal analysis of elastic networks. It has been curated from our in-house Python scripts over the last 15 years, with a focus on detecting similarities of elastic motion between atomic structures, coarse-grained graphs, and volumetric data obtained from biophysical or biomedical imaging origins, such as electron microscopy or tomography. With ModeHunter, normal modes of biophysical motion can be analyzed with various static visualization techniques or brought to life by dynamics animation in terms of single or multimode trajectories or decoy ensembles.

We present PepFuNN, a new open-source version of the PepFun package with functions to study the chemical space of peptide libraries and perform structure-activity relationship analyses. PepFuNN is a Python package comprising five modules to study peptides with natural amino acids and, in some cases, sequences with non-natural amino acids based on the availability of a public monomer dictionary. The modules allow calculating physicochemical properties, performing similarity analysis using different peptide representations, clustering peptides using molecular fingerprints or calculated descriptors, designing peptide libraries based on specific requirements, and extracting matched pairs from experimental campaigns to guide the selection of the most relevant mutations when designing new rounds.
