Whole-Function Vectorization
In this chapter, we present the main transformation phases of the Whole-Function Vectorization algorithm: Mask Generation, Select Generation, Partial CFG Linearization, and Instruction Vectorization.
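As a rough illustration of how the four phases fit together, the following minimal sketch hand-applies them to a tiny scalar function with an if/else. It is not the algorithm from the chapter: the names `W`, `scalar_kernel`, `select`, and `vectorized_kernel` are hypothetical, Python lists stand in for SIMD registers, and the control flow is fully (not partially) linearized for simplicity.

```python
from typing import List

W = 4  # hypothetical SIMD width (number of lanes); an assumption for this sketch


def scalar_kernel(x: float) -> float:
    """Original scalar function with data-dependent control flow."""
    if x > 0.0:
        y = x * 2.0
    else:
        y = -x
    return y


def select(mask: List[bool], a: List[float], b: List[float]) -> List[float]:
    """Lane-wise blend: take a[i] where mask[i] is true, else b[i]."""
    return [ai if m else bi for m, ai, bi in zip(mask, a, b)]


def vectorized_kernel(xs: List[float]) -> List[float]:
    """Hand-vectorized version of scalar_kernel for W lanes at once."""
    assert len(xs) == W
    # Mask generation: one predicate per lane for the 'then' edge.
    mask_then = [x > 0.0 for x in xs]
    # CFG linearization: both branches are executed for all lanes,
    # one after the other, instead of branching per element.
    # Instruction vectorization: each scalar operation becomes a
    # lane-wise operation (modelled here by list comprehensions).
    y_then = [x * 2.0 for x in xs]
    y_else = [-x for x in xs]
    # Select generation: the value merge at the join point becomes a blend
    # controlled by the mask.
    return select(mask_then, y_then, y_else)


if __name__ == "__main__":
    lanes = [1.0, -2.0, 0.0, 3.5]
    print(vectorized_kernel(lanes))
    print([scalar_kernel(x) for x in lanes])
```

Both printed lists should agree lane by lane, which is the property the transformation has to preserve: each lane of the vector function computes what the scalar function would compute for that input.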
Ralf Karrenberg
Automatic SIMD Vectorization of SSA-based Control Flow Graphs
With a Preface by Dr. Ayal Zaks
Ralf Karrenberg
Saarbrücken, Germany
Dissertation, Saarland University, 2015
ISBN 978-3-658-10112-1
ISBN 978-3-658-10113-8 (eBook)
DOI 10.1007/978-3-658-10113-8
Library of Congress Control Number: 2015938765
Springer Vieweg © Springer Fachmedien Wiesbaden 2015
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made.
Printed on acid-free paper
Springer Vieweg is a brand of Springer Fachmedien Wiesbaden
Springer Fachmedien Wiesbaden is part of Springer Science+Business Media (www.springer.com)
Foreword
I first met Ralf four years ago on stage at the magnificent “Le Majestic” congress center in Chamonix, France. This was in April 2011, and we switched connections at the podium as Ralf completed his talk on Whole-Function Vectorization at CGO 2011 and I was about to start my talk. Four years later, it is a pleasure to write this foreword to Ralf’s PhD thesis on this subject, which significantly extends and builds upon his earlier works. Ralf’s works have already attracted attention and inspired research and development in both academia and industry, and this thesis will undoubtedly serve as a major reference for developers and researchers involved in the field.
Ralf’s original Automatic Packetization Master’s thesis from July 2009 has inspired early OpenCL tools such as the Intel OpenCL SDK Vectorizer presented by Nadav Rotem at LLVM’s November 2011 Developer’s Meeting. Ralf Karrenberg and Sebastian Hack’s CC 2012 paper on Improving Performance of OpenCL on CPUs further extended their CGO 2011 paper. Recently, Hee-Seok Kim et al. from UIUC refer to the aforementioned papers in their CGO 2015 paper on Locality-centric thread scheduling for bulk-synchronous programming models on CPU architectures, which effectively maximizes the vectorization factors for kernel functions where possible and profitable. In addition, Yunsup Lee et al. from UC Berkeley and NVIDIA also refer to the aforement