A fast machine learning dataloader for epigenetic tracks from BigWig files.

Bioinformatics

Machine Learning Research, Pfizer Worldwide Research Development and Medical, Friedrichstraße 110, Berlin 10117, Germany.

Published: January 2024

Summary: We created bigwig-loader, a data loader for epigenetic profiles stored in BigWig files that decompresses and processes data for multiple intervals from multiple BigWig files in parallel. This access pattern is needed to create training batches for typical machine learning models on epigenetics data. Using a new codec, decompression can be done on a graphics processing unit (GPU), making it fast enough to create training batches during training and reducing the need to save preprocessed training examples to disk.
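
The access pattern described in the summary (many genomic intervals, each read from many tracks, assembled into one batch) can be illustrated with a small sketch. This is not the bigwig-loader API and it does not involve the GPU codec. It assumes each track has already been decompressed into hypothetical in-memory (start, end, value) records, and it assumes that bases without coverage are filled with 0.0. The names Track, dense_values, and make_batch are invented for illustration only.

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory stand-in for one decompressed BigWig track:
# chromosome -> sorted, non-overlapping (start, end, value) records.
Track = Dict[str, List[Tuple[int, int, float]]]
Interval = Tuple[str, int, int]  # (chrom, start, end), half-open


def dense_values(track: Track, chrom: str, start: int, end: int) -> List[float]:
    """Expand sparse (start, end, value) records into one value per base."""
    out = [0.0] * (end - start)  # assumption: uncovered bases default to 0.0
    for rec_start, rec_end, value in track.get(chrom, []):
        # Clip each record to the requested window before filling.
        lo, hi = max(rec_start, start), min(rec_end, end)
        for pos in range(lo, hi):
            out[pos - start] = value
    return out


def make_batch(tracks: List[Track], intervals: List[Interval]) -> List[List[List[float]]]:
    """Return a nested list shaped (n_intervals, n_tracks, interval_length)."""
    return [
        [dense_values(track, chrom, start, end) for track in tracks]
        for chrom, start, end in intervals
    ]


# Two toy tracks and two equally sized query intervals.
track_a: Track = {"chr1": [(0, 4, 1.0), (6, 10, 2.0)]}
track_b: Track = {"chr1": [(2, 8, 0.5)]}
batch = make_batch([track_a, track_b], [("chr1", 0, 5), ("chr1", 5, 10)])
```

Every (interval, track) pair in this sketch is independent of the others, which is why this kind of workload lends itself to parallel processing as the summary describes.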

Availability and Implementation: The bigwig-loader installation instructions and source code can be accessed at https://github.com/pfizer-opensource/bigwig-loader.


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10782802
DOI: http://dx.doi.org/10.1093/bioinformatics/btad767
