KARAJ: An Efficient Adaptive Multi-Processor Tool to Streamline Genomic and Transcriptomic Sequence Data Acquisition.

Mahdieh Labani, Amin Beheshti, Nigel H Lovell, Hamid Alinejad-Rokny, Ali Afrasiabi

Int J Mol Sci

Biomedical Machine Learning Lab, The Graduate School of Biomedical Engineering, University of New South Wales (UNSW), Sydney, NSW 2052, Australia.

Published: November 2022

Here we developed KARAJ, a fast and flexible Linux command-line tool that automates the end-to-end process of querying and downloading a wide range of genomic and transcriptomic sequence data types. KARAJ takes as input a list of PMCIDs, publication URLs, or various types of accession numbers and automates four tasks. First, it provides a summary list of accessible datasets generated by or used in these scientific articles, enabling users to select appropriate datasets. Second, it calculates the size of the files that users want to download and confirms that adequate space is available on the local disk. Third, it generates a metadata table containing sample information and the experimental design of the corresponding study. Lastly, it enables users to download supplementary data tables attached to publications. Further, KARAJ provides a parallel downloading framework that significantly reduces downloading time.
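The disk-space check and the parallel download step described in the abstract can be illustrated with a minimal Python sketch. This is not KARAJ's code or interface: the function names, the `fetch` callback, and the accession-style file names are hypothetical, and a generic thread pool stands in for whatever transfer backend the tool actually uses.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor


def total_size(file_sizes):
    """Sum the sizes (in bytes) of the files selected for download."""
    return sum(file_sizes.values())


def has_enough_space(file_sizes, target_dir=".", free_bytes=None):
    """Return True if the selected files fit on the disk holding target_dir.

    free_bytes may be passed explicitly (handy for testing); otherwise it is
    read from the filesystem with shutil.disk_usage.
    """
    if free_bytes is None:
        free_bytes = shutil.disk_usage(target_dir).free
    return total_size(file_sizes) <= free_bytes


def download_all(file_sizes, fetch, workers=4):
    """Call fetch(name) for every file concurrently; return {name: result}.

    fetch is a caller-supplied function (hypothetical here) that performs
    the actual transfer of a single file.
    """
    names = list(file_sizes)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(fetch, names))
    return dict(zip(names, results))


if __name__ == "__main__":
    # Hypothetical file names and sizes, for illustration only.
    sizes = {"SRR0000001.fastq.gz": 100, "SRR0000002.fastq.gz": 250}
    if has_enough_space(sizes):
        print(download_all(sizes, fetch=lambda name: "ok"))
```

Checking the total size before starting any transfer avoids a partially completed download that fails only when the disk fills up. Threads suit this step because downloads are I/O-bound.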

PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9694301
DOI: http://dx.doi.org/10.3390/ijms232214418

