Reprint of "Abstraction for data integration: Fusing mammalian molecular, cellular and phenotype big datasets for better knowledge extraction".

Comput Biol Chem

Department of Pharmacology and Systems Therapeutics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place Box 1215, New York, NY 10029, United States; BD2K-LINCS Data Coordination and Integration Center, United States; Illuminating the Druggable Genome Knowledge Management Center, United States.

Published: December 2015

With advances in genomics, transcriptomics, metabolomics and proteomics, more expansive electronic clinical record monitoring, and advances in computation, we have entered the Big Data era in biomedical research. Data gathering is growing rapidly, yet only a small fraction of these data is converted to useful knowledge or reused in future studies. An important but often overlooked concept for improving this is data abstraction, which is frequently required to fuse and reuse biomedical datasets from diverse resources. Here we summarize some of the major Big Data biomedical research resources for genomics, proteomics and phenotype data, collected from mammalian cells, tissues and organisms. We then suggest simple data abstraction methods for fusing these diverse but related data. Finally, we demonstrate examples of the potential utility of such data integration efforts, while warning about the inherent biases that exist within such data.

Source
http://dx.doi.org/10.1016/j.compbiolchem.2015.08.005
