BIOZON: a system for unification, management and analysis of heterogeneous biological data.

BMC Bioinformatics

Department of Computer Science, Cornell University, Ithaca, NY, USA.

Published: February 2006

Background: Integration of heterogeneous data types is a challenging problem, especially in biology, where the number of databases and data types increases rapidly. Among the problems that one has to face are integrity, consistency, redundancy, connectivity, expressiveness and updatability.

Description: Here we present a system (Biozon) that addresses these problems and offers biologists a new knowledge resource to navigate through and explore. Biozon unifies multiple biological databases consisting of a variety of data types (such as DNA sequences, proteins, interactions and cellular pathways). It is fundamentally different from previous efforts as it uses a single extensive and tightly connected graph schema wrapped with a hierarchical ontology of documents and relations. Beyond warehousing existing data, Biozon computes and stores novel derived data, such as similarity relationships and functional predictions. The integration of similarity data allows propagation of knowledge through inference and fuzzy searches. Sophisticated query methods that span multiple data types were implemented, and first-of-a-kind biological ranking systems were explored and integrated.
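The graph model described above can be illustrated with a minimal sketch. It is not Biozon's actual schema or query engine: the class `DocumentGraph`, the relation names (`encodes`, `similar_to`, `member_of`) and the document identifiers are hypothetical and chosen only for illustration. The sketch shows typed documents linked by typed relations, a query that spans several data types, and a "fuzzy" variant that also follows similarity relations.

```python
from collections import defaultdict


class DocumentGraph:
    """Toy typed graph: nodes are documents, edges are named relations.

    Hypothetical illustration only; not the Biozon implementation.
    """

    def __init__(self):
        self.doc_type = {}              # document id -> data type
        self.edges = defaultdict(set)   # (document id, relation) -> neighbour ids

    def add_document(self, doc_id, doc_type):
        self.doc_type[doc_id] = doc_type

    def add_relation(self, source, relation, target):
        # Store both directions so a query can traverse the edge either way.
        self.edges[(source, relation)].add(target)
        self.edges[(target, relation)].add(source)

    def neighbors(self, doc_id, relation):
        return self.edges.get((doc_id, relation), set())

    def follow(self, start, path, fuzzy=False):
        """Follow a sequence of relation names starting from one document.

        With fuzzy=True, the current set of documents is first widened by
        their 'similar_to' neighbours before each step (assumed relation name).
        """
        frontier = {start}
        for relation in path:
            if fuzzy:
                similar = {n for d in frontier for n in self.neighbors(d, "similar_to")}
                frontier = frontier | similar
            frontier = {n for d in frontier for n in self.neighbors(d, relation)}
        return frontier


# Hypothetical data: a DNA sequence encodes protein A; protein A is similar
# to protein B; only protein B is annotated as a member of a pathway.
graph = DocumentGraph()
graph.add_document("dna1", "dna_sequence")
graph.add_document("protA", "protein")
graph.add_document("protB", "protein")
graph.add_document("pathway1", "pathway")
graph.add_relation("dna1", "encodes", "protA")
graph.add_relation("protA", "similar_to", "protB")
graph.add_relation("protB", "member_of", "pathway1")

exact = graph.follow("dna1", ["encodes", "member_of"])
fuzzy = graph.follow("dna1", ["encodes", "member_of"], fuzzy=True)
print(exact)  # empty set: protein A has no direct pathway annotation
print(fuzzy)  # {'pathway1'}: reached through the similar protein B
```

In this toy example the exact query finds no pathway for the DNA sequence, while the fuzzy query reaches one through a similar protein, which is the kind of knowledge propagation through similarity data that the abstract describes.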

Conclusion: The Biozon system is an extensive knowledge resource of heterogeneous biological data. Currently, it holds more than 100 million biological documents and 6.5 billion relations between them. The database is accessible through an advanced web interface that supports complex queries, "fuzzy" searches, data materialization and more, online at http://biozon.org.


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1449871
DOI: http://dx.doi.org/10.1186/1471-2105-7-70


