From reactive to proactive load balancing for task-based parallel applications in distributed memory machines

Abstract

Load balancing is often a challenge in task-parallel applications. Balancing problems are either static or dynamic. “Static” means that some load information is known in advance and balancing is performed before execution, while “dynamic” balancing must rely on partial information about the execution status to balance the load at runtime. Conventionally, work stealing is a practical approach used in almost all shared memory systems. In distributed memory systems, however, communication overhead can cause tasks to be stolen too late. To address this, a reactive approach has been proposed that relaxes communication in load balancing. It dedicates one thread per process to monitoring the queue status and reactively offloading tasks from a slow process to a fast one. However, reactive decisions can be wrong in cases of high imbalance. First, this article proposes a performance model to analyze reactive balancing behavior and to understand the bound beyond which decisions become incorrect. Second, we introduce a proactive approach to further improve task balancing at runtime. This approach also exploits task-based programming models with a dedicated thread. The main idea, however, is to have this thread not only monitor load but also characterize tasks and train load prediction models through online learning. “Proactive” means offloading an appropriate number of tasks at once, before each execution phase, to a potential victim (an underloaded/fast process). The experimental results confirm speedup improvements in important use cases compared to previous solutions. Furthermore, this approach can support co-scheduling tasks across multiple applications.

More about this title

Title From reactive to proactive load balancing for task-based parallel applications in distributed memory machines
Journal Concurrency and Computation: Practice and Experience
Issue 24
Volume 35
ISSN 1532-0626, 1532-0634
Authors Minh Chung, Dr. Josef Weidendorfer, Karl Fürlinger, Prof. Dr. Dieter Kranzlmüller
Pages e7828
Publication date June 2023
Citation Chung, Minh; Weidendorfer, Josef; Fürlinger, Karl; Kranzlmüller, Dieter (2023): From reactive to proactive load balancing for task-based parallel applications in distributed memory machines. Concurrency and Computation: Practice and Experience 35 (24), e7828. DOI: 10.1002/cpe.7828