Storing information in memory efficiently is one of the most significant challenges in computer science. The two main factors that make a data structure efficient are reduced space consumption and reduced time consumption. Many tools can reduce the run-time of a process, and Apache Spark is one of them: it is a computing framework that uses clusters to execute a process. The software has two key features: a directed acyclic graph (DAG), which maps the execution process, and resilient distributed datasets (RDDs), which allow large in-memory computations. To construct a data structure that is both space- and time-efficient, we make use of this framework. A comparison of the run-time improvement obtained with Spark is also provided. Finally, to demonstrate the efficacy of this software tool, we construct a space-efficient data structure and compare the run-time with and without Spark.
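
The abstract does not describe the data structure or the Spark job itself, so the following is only a minimal sketch of the general pattern, assuming for illustration that the structure is an n-gram count table. It is written in plain Python rather than against the Spark API: each partition is counted independently and the partial counts are then merged, which is roughly the shape that a `flatMap` followed by `reduceByKey` takes on an RDD. The function names and the sample text are hypothetical and are not taken from the paper.

```python
from collections import Counter
from functools import reduce


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return zip(*(tokens[i:] for i in range(n)))


def count_partition(lines, n):
    """Count n-grams within one partition (a list of text lines)."""
    counts = Counter()
    for line in lines:
        counts.update(ngrams(line.split(), n))
    return counts


def count_ngrams(partitions, n):
    """Count each partition independently, then merge the partial counts."""
    partials = (count_partition(p, n) for p in partitions)
    return reduce(lambda a, b: a + b, partials, Counter())


# Hypothetical sample data: two partitions, one line each.
partitions = [
    ["to be or not to be"],
    ["to be is to do"],
]
bigram_counts = count_ngrams(partitions, 2)
print(bigram_counts[("to", "be")])
```

Because the per-partition counts never depend on each other, the counting step could be distributed across workers, and only the merge step needs to combine results. Timing such a job with and without a cluster framework is one way to make the kind of run-time comparison the abstract mentions.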

Source
http://dx.doi.org/10.1007/978-3-030-78775-2_19

