Evolutionary multi-level acyclic graph partitioning
- PDF / 929,611 Bytes
- 29 Pages / 439.37 x 666.142 pts Page_size
- 58 Downloads / 202 Views
Evolutionary multi-level acyclic graph partitioning Orlando Moreira1 · Merten Popp2
· Christian Schulz3
Received: 18 December 2018 / Revised: 7 July 2020 / Accepted: 9 July 2020 © Springer Science+Business Media, LLC, part of Springer Nature 2020
Abstract Directed graphs are widely used to model data flow and execution dependencies in streaming applications. This enables the utilization of graph partitioning algorithms for the problem of parallelizing execution on multiprocessor architectures under hardware resource constraints. However due to program memory restrictions in embedded multiprocessor systems, applications need to be divided into parts without cyclic dependencies. We found that this can be done by a subsequent second graph partitioning step with an additional acyclicity constraint. We have four main contributions. First, we show that this more constrained version of the graph partitioning problem is NP-complete and present linear time heuristics. We then integrate them into an existing multi-level graph partitioning framework to better handle large graphs. This achieves a 9% reduction of the edge cut compared to the previous single-level algorithm. Based on this, we engineer an evolutionary algorithm to further reduce the cut, achieving a 30% reduction on average compared to the state of the art. Finally, we integrate the partitioning heuristics into a graph compiler for an embedded multiprocessor architecture and show that this can reduce the amount of communication for a real-world imaging application and thereby accelerate it by an average of 11%. It is shown that the compiler can emit optimized code for vastly different hardware platforms using the heuristics. In addition, we demonstrate how a custom fitness function for the evolutionary algorithm can be used to optimize other objectives like load balancing if the communication volume is not predominantly important on a given hardware platform.
B
Merten Popp [email protected] Orlando Moreira [email protected] Christian Schulz [email protected]
1
GrAI Matter Labs, Eindhoven, The Netherlands
2
Braunschweig Institute of Technology, Brunswick, Germany
3
University of Vienna, Vienna, Austria
123
O. Moreira et al.
Keywords Graph partitioning · Evolutionary algorithm · Computer vision · Imaging · Embedded systems
1 Introduction Imaging and computer vision are important elements and enabling technologies for a variety of applications ranging from consumer electronics over industrial automation to self-driving cars and have high demands for computational power. However, these applications often need to run on embedded devices with strong constraints on power and with a tight thermal budget (Wolf 2014). To cope with the high performance and low latency requirements of the application domain under these limitations, specialized hardware is needed (Wolf 2017). The downside of these specializations is that the resulting hardware platform is cumbersome to program. Therefore the task of implementing an application will not on
Data Loading...