Locality-Based Cache Management and Warp Scheduling for Reducing Cache Contention in GPU.

Micromachines (Basel)

Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China.

Published: October 2021

GPGPUs have gradually become mainstream acceleration components in high-performance computing, but the long latency of memory operations remains a bottleneck for GPU performance. In a GPU, threads are grouped into warps for scheduling and execution. The L1 data cache has little capacity, and many warps share this one small cache, so the cache suffers heavy contention and frequent pipeline stalls. We propose Locality-Based Cache Management (LCM), combined with Locality-Based Warp Scheduling (LWS), to reduce cache contention and improve GPU performance. Each load instruction can be classified into one of three locality types: data used only once (streaming locality), data accessed multiple times within the same warp (intra-warp locality), and data accessed by different warps (inter-warp locality). According to the locality of the load instruction, LCM applies cache bypassing to streaming requests to improve cache utilization, extends inter-warp memory request coalescing to exploit inter-warp locality, and combines with LWS to alleviate cache contention. LCM and LWS effectively improve cache performance and thereby overall GPU performance. In our experimental evaluation, LCM and LWS achieve an average performance improvement of 26% over the baseline GPU.
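The three-way classification described in the abstract can be sketched in a few lines. This is an illustrative model, not the paper's implementation: the class name, thresholds, and the per-(PC, cache line) bookkeeping are all assumptions made for clarity.

```python
# Hypothetical sketch of locality classification for load instructions.
# A request seen once is treated as streaming (candidate for cache bypass);
# reuse within a single warp is intra-warp locality; reuse across warps is
# inter-warp locality (candidate for inter-warp request coalescing).
from collections import defaultdict

STREAMING, INTRA_WARP, INTER_WARP = "streaming", "intra-warp", "inter-warp"

class LocalityTracker:
    def __init__(self):
        # (load PC, cache-line address) -> set of warp ids that accessed it
        self.warps = defaultdict(set)
        # (load PC, cache-line address) -> number of accesses observed
        self.accesses = defaultdict(int)

    def record(self, load_pc, warp_id, line_addr):
        key = (load_pc, line_addr)
        self.warps[key].add(warp_id)
        self.accesses[key] += 1

    def classify(self, load_pc, line_addr):
        key = (load_pc, line_addr)
        if self.accesses[key] <= 1:
            return STREAMING      # used once: bypass the L1 data cache
        if len(self.warps[key]) == 1:
            return INTRA_WARP     # reused within one warp: keep in L1
        return INTER_WARP         # shared across warps: coalesce requests
```

In the paper's scheme, the classification would feed both LCM (bypass vs. cache vs. coalesce decisions) and LWS (preferring warps whose data is resident); this sketch covers only the classification step.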


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8537857
DOI: http://dx.doi.org/10.3390/mi12101262

Publication Analysis

Top Keywords

cache contention (16), warp scheduling (12), gpu performance (12), cache (10), locality-based cache (8), cache management (8), load instruction (8), data locality (8), locality accessed (8), improve cache (8)
