Multi-file dynamic compression method based on classification algorithm in DNA storage.

Med Biol Eng Comput

State Key Laboratory of Digital Medical Engineering, School of Biological Science and Medical Engineering, Southeast University, 210096, Nanjing, China.

Published: December 2024

AI Article Synopsis

  • Rapid growth in data volume has made DNA storage a promising alternative storage solution, but high synthesis and sequencing costs have impeded its development.
  • The study introduces a machine learning-based multi-file dynamic compression method that selects a compression method for each file, reducing the amount of data stored in DNA and thereby lowering storage costs.
  • Results show that the k-nearest neighbor algorithm achieves over 85% accuracy and a compression rate of 30.85%, an improvement of more than 4.5% over the traditional single compression method, with reported DNA storage cost savings in the range of $0.48 to 3 billion/TB.

Article Abstract

The exponential growth in data volume has necessitated alternative storage solutions, and DNA storage stands out as the most promising. However, the exorbitant costs of synthesis and sequencing have impeded its development. Pre-compressing the data is recognized as one of the most effective approaches for reducing storage costs. However, different compression methods yield different compression ratios for the same file, and compressing a large number of files with a single method may not achieve the maximum compression ratio. This study proposes a multi-file dynamic compression method based on machine learning classification algorithms that selects the appropriate compression method for each file, minimizing the amount of data stored in DNA. First, four different compression methods are applied to the collected files. The optimal compression method for each file is then used as the label, and the file type and size are used as features, to train seven machine learning classification algorithms. The results demonstrate that k-nearest neighbor outperforms the other machine learning algorithms on the validation and test sets most of the time, achieving an accuracy of over 85% with less volatility. Additionally, the k-nearest neighbor model achieves a compression rate of 30.85%, an improvement of more than 4.5% over the traditional single compression method, resulting in significant cost savings for DNA storage in the range of $0.48 to 3 billion/TB. Compared with the traditional compression method, the multi-file dynamic compression method has a more significant compression effect when compressing multiple files. It can therefore considerably decrease the cost of DNA storage and facilitate the widespread implementation of DNA storage technology.
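The pipeline described in the abstract (label each file with its best compression method, then predict that label from file type and size) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not name its four compression methods, so three Python standard-library codecs stand in for them, and the distance function, its type-mismatch penalty, and all function names are assumptions made for this example.

```python
import bz2
import lzma
import math
import zlib
from collections import Counter

# Assumption: stand-in codecs; the paper's four methods are not named in the abstract.
COMPRESSORS = {
    "zlib": zlib.compress,
    "bz2": bz2.compress,
    "lzma": lzma.compress,
}

# Assumption: arbitrary weight that keeps different file types far apart.
TYPE_MISMATCH_PENALTY = 10.0


def best_method(data: bytes) -> str:
    """Label a file with the codec that produces the smallest output."""
    return min(COMPRESSORS, key=lambda name: len(COMPRESSORS[name](data)))


def distance(a, b) -> float:
    """Distance between two (file_type, size_in_bytes) feature pairs."""
    type_gap = 0.0 if a[0] == b[0] else TYPE_MISMATCH_PENALTY
    size_gap = abs(math.log2(a[1] + 1) - math.log2(b[1] + 1))
    return type_gap + size_gap


def knn_predict(train, query, k=3) -> str:
    """Majority vote among the k nearest training examples.

    train is a list of ((file_type, size_in_bytes), label) pairs.
    """
    nearest = sorted(train, key=lambda item: distance(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def compress_with_prediction(train, file_type, data, k=3):
    """Pick a codec from file type and size alone, then compress with it."""
    method = knn_predict(train, (file_type, len(data)), k)
    return method, COMPRESSORS[method](data)
```

In this sketch, training labels would come from calling `best_method` on each collected file, which requires running every codec once. At prediction time only the predicted codec is run, which is the point of classifying from cheap features such as type and size.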

Source
http://dx.doi.org/10.1007/s11517-024-03156-2

Publication Analysis

Top Keywords

compression method: 28
dna storage: 20
compression: 13
multi-file dynamic: 12
dynamic compression: 12
machine learning: 12
method: 8
method based: 8
compression methods: 8
learning classification: 8
