Motivation: Centralizing and storing data for long periods on an end user's computational resources is increasingly difficult in many scientific disciplines. For example, genomics data is increasingly large and distributed, and it must be moved to workflow execution sites ranging from lab workstations to the cloud. However, the typical user is not always informed about emerging network technology or the most efficient methods for moving and sharing data. As a result, users default to inefficient methods for transfer across the commercial internet.

Results: To accelerate large data transfer, we created a tool called the Big Data Smart Socket (BDSS) that abstracts data transfer methodology from the user. The user provides BDSS with a manifest of datasets stored in a remote storage repository. BDSS then queries a metadata repository for curated data transfer mechanisms and the optimal path for moving each file in the manifest to the site of workflow execution. BDSS functions as a standalone tool or can be directly integrated into a computational workflow such as those provided by the Galaxy Project. To demonstrate applicability, we use BDSS within a biological context, although it is applicable to any scientific domain.
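
The manifest-and-lookup flow described above can be illustrated with a minimal Python sketch. This is not the BDSS implementation or its API: the names `TransferOption`, `METADATA`, `plan_transfer`, and `plan_manifest`, the example URLs, and the mechanism labels are all hypothetical, and the "metadata repository" is assumed here to be a simple in-memory mapping from source URL prefixes to curated alternatives.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class TransferOption:
    """One curated way to fetch files under a given source prefix (hypothetical)."""
    mechanism: str   # illustrative label, e.g. "parallel-http" or "ftp"
    base_url: str    # alternate location that serves the same files
    priority: int    # lower value means preferred by the curators


# Hypothetical stand-in for a curated metadata repository.
METADATA = {
    "ftp://ftp.example.org/data/": [
        TransferOption("parallel-http", "https://mirror.example.org/data/", 1),
        TransferOption("ftp", "ftp://ftp.example.org/data/", 2),
    ],
}


def plan_transfer(url, metadata, available):
    """Pick the best curated option the client supports.

    Falls back to the URL's own scheme when no curated entry matches
    or when none of the curated mechanisms is available locally.
    """
    for prefix, options in metadata.items():
        if url.startswith(prefix):
            usable = [o for o in options if o.mechanism in available]
            if usable:
                best = min(usable, key=lambda o: o.priority)
                return best.mechanism, best.base_url + url[len(prefix):]
    return urlparse(url).scheme, url


def plan_manifest(manifest_text, metadata, available):
    """Treat each non-empty manifest line as one file URL and plan its transfer."""
    urls = [line.strip() for line in manifest_text.splitlines() if line.strip()]
    return [plan_transfer(u, metadata, available) for u in urls]
```

With a two-line manifest, a file under the curated prefix is redirected to the preferred mirror, while a file from an unknown source keeps its original URL and scheme. A real client would then hand each planned pair to the matching transfer tool.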

Availability And Implementation: BDSS is available under version 2 of the GNU General Public License at https://github.com/feltus/BDSS .

Contact: ffeltus@clemson.edu.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5408802
DOI: http://dx.doi.org/10.1093/bioinformatics/btw679

Similar Publications

Objective: Interhospital transfers for status epilepticus (SE) are common, and some are avoidable and likely lower yield. The use of interhospital transfer may differ in emergency department (ED) and inpatient settings, which contend with differing clinical resources and financial incentives. However, transfer from these two settings is understudied, leaving gaps in our ability to improve the hospital experience, cost, and triage for this neurologic emergency.

To assess the ploidy status of embryos via preimplantation genetic testing for aneuploidy (PGT-A), a biopsy of trophectoderm (TE) cells can be performed. However, this approach is considered invasive, and therefore the aim of this study was to identify the optimal sample type and sampling day for non-invasive or minimally invasive PGT-A (ni/miPGT-A) in terms of data quality and concordance rates with TE biopsies derived from the same embryos. This study was performed using 239 embryo cultures.

Transferring knowledge learned from standard GelSight sensors to other visuotactile sensors is appealing for reducing data collection and annotation. However, such cross-sensor transfer is challenging due to the differences between sensors in internal light sources, imaging effects, and elastomer properties. By understanding the data collected from each type of visuotactile sensors as domains, we propose a few-sample-driven style-to-content unsupervised domain adaptation method to reduce cross-sensor domain gaps.

Table Extraction with Table Data Using VGG-19 Deep Learning Model.

Sensors (Basel)

January 2025

Faculty of Science and Environmental Studies, Department of Computer Science, Lakehead University, Thunder Bay, ON P7B 5E1, Canada.

In recent years, significant progress has been achieved in understanding and processing tabular data. However, existing approaches often rely on task-specific features and model architectures, posing challenges in accurately extracting table structures amidst diverse layouts, styles, and noise contamination. This study introduces a comprehensive deep learning methodology that is tailored for the precise identification and extraction of rows and columns from document images that contain tables.

Android malware detection remains a critical issue for mobile security. Cybercriminals target Android since it is the most popular smartphone operating system (OS). Malware detection, analysis, and classification have become diverse research areas.
