Analytical engines rely on in-memory data caching to avoid storage accesses and provide timely responses by keeping the most frequently accessed data in memory. Purely frequency- and time-based caching decisions, however, are a good proxy for the expected query execution speedup only when storage accesses are significantly slower than in-memory query processing. Fast storage, by contrast, offers loading times that approach fully in-memory query response times, so purely frequency-based statistics cannot capture the impact of a caching decision on query execution. For example, caching the input of a frequent query that spends most of its time processing joins is less beneficial than caching a page for a slightly less frequent but scan-heavy query. As a result, existing caching policies waste valuable memory space on input data that offers little to no acceleration for analytics. This paper proposes HPCache, a buffer management policy that enables fast analytics on high-bandwidth storage by using the available in-memory space efficiently. HPCache caches data based on its speedup potential instead of relying on frequency-based statistics. We show that, with fast storage, the benefit of in-memory caching varies significantly across queries; therefore, we quantify the efficiency of caching decisions and formulate an optimization problem. We implement HPCache in Proteus and show that (i) estimating speedup potential improves memory space utilization, and (ii) simple runtime statistics suffice to infer speedup. We show that HPCache achieves up to a 1.75x speedup over frequency-based caching policies by caching column proportions and automatically tuning them. Overall, HPCache enables efficient use of the in-memory space for input caching in the presence of fast storage, without requiring workload predictions.
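
The contrast between frequency-based and speedup-based caching can be illustrated with a small sketch. This is not HPCache's actual algorithm or the optimization problem from the paper; it is a simple greedy selection under a memory budget, and the `Candidate` fields, function names, and all numbers are hypothetical values chosen only to mirror the join-heavy versus scan-heavy example in the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A cacheable input. All fields are illustrative assumptions."""
    name: str
    size_mb: float             # memory needed to cache this input
    accesses: int              # how often queries read this input
    saved_s_per_access: float  # estimated time saved per access if cached


def total_saved(chosen):
    """Total estimated time saved (seconds) by caching the chosen inputs."""
    return sum(c.accesses * c.saved_s_per_access for c in chosen)


def _greedy(cands, budget_mb, key):
    """Pick candidates in descending `key` order while they fit the budget."""
    chosen, used = [], 0.0
    for c in sorted(cands, key=key, reverse=True):
        if used + c.size_mb <= budget_mb:
            chosen.append(c)
            used += c.size_mb
    return chosen


def pick_by_frequency(cands, budget_mb):
    """Frequency-only policy: most-accessed inputs first."""
    return _greedy(cands, budget_mb, key=lambda c: c.accesses)


def pick_by_speedup(cands, budget_mb):
    """Speedup-aware policy: most time saved per cached megabyte first."""
    return _greedy(
        cands,
        budget_mb,
        key=lambda c: c.accesses * c.saved_s_per_access / c.size_mb,
    )


# Hypothetical workload: with fast storage, the join-heavy query gains
# little from caching its input, while the scan-heavy query gains a lot.
candidates = [
    Candidate("join_heavy_input", size_mb=100.0, accesses=10, saved_s_per_access=0.05),
    Candidate("scan_heavy_input", size_mb=100.0, accesses=8, saved_s_per_access=0.50),
]
budget = 100.0  # only one of the two inputs fits

by_freq = pick_by_frequency(candidates, budget)
by_speedup = pick_by_speedup(candidates, budget)

print([c.name for c in by_freq], total_saved(by_freq))
print([c.name for c in by_speedup], total_saved(by_speedup))
```

With these made-up numbers, the frequency-only policy spends the whole budget on the join-heavy input, while the speedup-aware policy caches the scan-heavy input and saves more total time even though that input is accessed less often.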

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11639185
DOI: http://dx.doi.org/10.1007/s00778-023-00828-7
