DNA Bloom Filter enables anti-contamination and file version control for DNA-based data storage.

Yiming Li, Haoling Zhang, Yuxin Chen, Yue Shen, Zhi Ping

Brief Bioinform

BGI Research, Shenzhen, 518083, China.

Published: March 2024

DNA storage is one of the most promising approaches to future information storage because of its high storage density, long storage lifetime and low maintenance cost. However, errors are inevitable during synthesis, storage and sequencing. Many error correction algorithms have been developed to ensure accurate information retrieval, but they decrease storage density or increase computational complexity. Here, we apply the Bloom Filter, a space-efficient probabilistic data structure, to DNA storage to provide an anti-error, or anti-contamination, function. The method needs only the original correct DNA sequences (referred to as target sequences) to build a corresponding data structure, which filters out almost all incorrect sequences (referred to as non-target sequences) during sequencing data analysis. Experimental results demonstrate the universal and efficient filtering capabilities of our method. Furthermore, we employ the Counting Bloom Filter to provide a file version control function, which significantly reduces synthesis costs when modifying DNA-form files. To make file version control cost-efficient, we develop a modified system based on the yin-yang codec.
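
The filtering and version control ideas described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `CountingBloomFilter`, the helpers `filter_reads` and `update_version`, the SHA-256 double-hashing scheme, the default false-positive rate and the example sequences are all assumptions made for illustration. A plain Bloom Filter is the special case in which each counter is a single bit and nothing is ever removed.

```python
import hashlib
import math


class CountingBloomFilter:
    """Illustrative counting Bloom filter over DNA strings (hypothetical sketch)."""

    def __init__(self, expected_items, false_positive_rate=0.001):
        # Standard sizing: m = -n * ln(p) / (ln 2)^2 and k = (m / n) * ln 2.
        self.size = max(1, math.ceil(
            -expected_items * math.log(false_positive_rate) / math.log(2) ** 2))
        self.num_hashes = max(1, round(self.size / expected_items * math.log(2)))
        self.counters = [0] * self.size

    def _positions(self, sequence):
        # Double hashing: derive k indices from two 64-bit slices of one digest.
        # This hashing scheme is an assumption, not taken from the paper.
        digest = hashlib.sha256(sequence.encode("ascii")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.size for i in range(self.num_hashes)]

    def add(self, sequence):
        for pos in self._positions(sequence):
            self.counters[pos] += 1

    def remove(self, sequence):
        # Counters (rather than bits) are what make deletion possible.
        if sequence not in self:
            raise ValueError("sequence is not in the filter")
        for pos in self._positions(sequence):
            self.counters[pos] -= 1

    def __contains__(self, sequence):
        # True for every added sequence; rarely True for a non-target
        # (a false positive), never False for a target.
        return all(self.counters[pos] > 0 for pos in self._positions(sequence))


def filter_reads(reads, bloom):
    """Keep only the sequencing reads that the filter recognises as targets."""
    return [read for read in reads if read in bloom]


def update_version(bloom, removed, added):
    """Move the filter to a new file version by swapping only changed sequences."""
    for sequence in removed:
        bloom.remove(sequence)
    for sequence in added:
        bloom.add(sequence)


# Made-up target sequences, used only to exercise the sketch.
targets = ["ACGTACGTAC", "TTGACCGTAA", "GGCATCGATC"]
bloom = CountingBloomFilter(expected_items=len(targets))
for target in targets:
    bloom.add(target)

# A read with one substitution is usually rejected, but may pass as a false positive.
reads = targets + ["ACGTACGTAA"]
kept = filter_reads(reads, bloom)

# Replace one sequence to represent an edited file version.
update_version(bloom, removed=["TTGACCGTAA"], added=["CCCCGGGGAA"])
```

In this sketch only the changed sequences touch the filter during an update, which mirrors the abstract's point that version control avoids regenerating an entire file. Because the filter is probabilistic, a non-target read can occasionally pass, so the sketch guarantees only that target sequences are never rejected.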

PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10981766
DOI: http://dx.doi.org/10.1093/bib/bbae125

