GenoVault is a cloud-based repository for handling Next Generation Sequencing (NGS) data. It is built on an OpenStack-based private cloud and uses OpenStack services such as Keystone for authentication, Cinder for block storage, Neutron for networking, and Nova for managing compute instances. GenoVault uses object-based storage, in which data is stored as objects rather than as files or blocks, allowing faster retrieval from distributed object nodes. Alongside the web-based interface, a JavaFX-based desktop client has been developed to handle the large file uploads typical of NGS datasets. Users store files in their own object-based storage areas, and the metadata they provide at upload time is used to query the database. The repository is designed with future needs in mind and can scale both vertically and horizontally using the features of the OpenStack-based cloud. Users can either share their data publicly or restrict access by keeping it private. Data security rests on the object-based storage architecture, in which every container is a separate entity, and on Secure File Transfer Protocol (SFTP) for data upload and download. Users upload data into individual containers, which hold raw read files (fastq), processed alignment files (bam, sam, bed), and the output of variation detection (vcf). The GenoVault architecture allows the integrity and authenticity of the data to be verified before it is made available to collaborators according to the user's permissions. GenoVault is useful for maintaining organization-wide NGS data generated in various labs that has not yet been published or submitted to public repositories such as NCBI. It also supports sharing NGS data among collaborating institutions. GenoVault can thus manage vast volumes of NGS data on any OpenStack-based private cloud.
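The upload flow described above (per-user containers, user-supplied metadata, and an integrity check before sharing) can be illustrated with a minimal Python sketch. It is not GenoVault's implementation. The abstract does not name the object store, so the sketch assumes OpenStack Swift, whose custom object metadata is sent as `X-Object-Meta-*` headers. The container naming scheme, the category names, the choice of SHA-256, and every function name here are hypothetical. The sketch only prepares and checks an upload plan in memory and does not contact any cloud service.

```python
import hashlib
from pathlib import PurePosixPath

# Hypothetical grouping of the file types listed in the abstract.
CATEGORY_BY_EXTENSION = {
    "fastq": "raw_reads",
    "bam": "alignments",
    "sam": "alignments",
    "bed": "alignments",
    "vcf": "variants",
}


def categorize(filename: str) -> str:
    """Return the data category for a file, ignoring a trailing .gz."""
    suffixes = [s.lstrip(".").lower() for s in PurePosixPath(filename).suffixes]
    if suffixes and suffixes[-1] == "gz":
        suffixes.pop()
    if not suffixes or suffixes[-1] not in CATEGORY_BY_EXTENSION:
        raise ValueError(f"unsupported NGS file type: {filename}")
    return CATEGORY_BY_EXTENSION[suffixes[-1]]


def metadata_headers(metadata: dict[str, str]) -> dict[str, str]:
    """Turn user metadata into Swift-style X-Object-Meta-* headers."""
    return {
        f"X-Object-Meta-{key.replace('_', '-').title()}": value
        for key, value in metadata.items()
    }


def checksum(data: bytes) -> str:
    """SHA-256 digest used here as the integrity check (an assumption)."""
    return hashlib.sha256(data).hexdigest()


def build_upload_plan(user, filename, data, metadata, public=False):
    """Describe one upload: target container, headers, checksum, visibility."""
    return {
        "container": f"{user}-{categorize(filename)}",  # hypothetical naming
        "object": filename,
        "headers": metadata_headers(metadata),
        "checksum": checksum(data),
        "public": public,  # private unless the user opts in to sharing
    }


def verify(plan: dict, received: bytes) -> bool:
    """Check that the bytes that arrived match the checksum in the plan."""
    return checksum(received) == plan["checksum"]
```

A caller would build a plan with, for example, `build_upload_plan("alice", "sample1.fastq.gz", payload, {"organism": "Homo sapiens"})`, transfer the bytes, and share the object only if `verify(plan, stored_bytes)` returns `True`.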


Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8319889 (PMC)
http://dx.doi.org/10.1186/s13040-021-00268-5 (DOI)


