Jittor: a novel deep learning framework with meta-operators and unified graph execution

RESEARCH PAPER

Sci China Inf Sci, December 2020, Vol. 63, 222103:1–222103:21, https://doi.org/10.1007/s11432-020-3097-4

Shi-Min HU1,2*, Dun LIANG1, Guo-Ye YANG1, Guo-Wei YANG1 & Wen-Yang ZHOU1

1 Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China;
2 Beijing National Research Center for Information Science and Technology, Beijing 100084, China

Received 25 August 2020 / Accepted 23 September 2020 / Published online 13 November 2020

Abstract  This paper introduces Jittor, a fully just-in-time (JIT) compiled deep learning framework. With JIT compilation, we can achieve higher performance while making systems highly customizable. Jittor provides classes of Numpy-like operators, which we call meta-operators. A deep learning model built upon these meta-operators is compiled into high-performance CPU or GPU code in real time. To manage meta-operators, Jittor uses a highly optimized way of executing computation graphs, which we call unified graph execution. This approach is as easy to use as dynamic graph execution yet has the efficiency of static graph execution. It also provides other improvements, including operator fusion, cross-iteration fusion, and unified memory.

Keywords  deep learning framework, meta-operator, unified graph execution, JIT compilation, generative adversarial network

Citation  Hu S-M, Liang D, Yang G-Y, et al. Jittor: a novel deep learning framework with meta-operators and unified graph execution. Sci China Inf Sci, 2020, 63(12): 222103, https://doi.org/10.1007/s11432-020-3097-4
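As a minimal illustration of the idea, a user-level operator such as matrix multiplication can be composed entirely from meta-operators in Jittor's Python front end. The sketch below assumes names from Jittor's public API (jittor, random, broadcast, sum), which are not shown in this excerpt:

    import jittor as jt

    # Matrix multiply y[i,l] = sum_j a[i,j] * b[j,l], built only from
    # meta-operators: broadcast (a reindex-style operator), an
    # element-wise multiply, and a sum reduction.
    def matmul(a, b):
        (n, m), k = a.shape, b.shape[-1]
        aa = a.broadcast([n, m, k], dims=[2])  # a[i,j] -> aa[i,j,l]
        bb = b.broadcast([n, m, k], dims=[0])  # b[j,l] -> bb[i,j,l]
        return (aa * bb).sum(dim=1)            # reduce over shared index j

    x = jt.random([3, 4])
    w = jt.random([4, 5])
    y = matmul(x, w)   # JIT-compiled to CPU/GPU code on first execution
    print(y.shape)     # [3, 5]

Because all three steps are meta-operators, the framework sees the whole expression as one fragment of the computation graph and can fuse it into a single kernel rather than executing three separate ones.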

1  Introduction

In recent years, deep learning has developed rapidly. Together with big data science, it has become a new paradigm for scientific research and engineering applications. As deep learning algorithms are typically complicated to implement, various deep learning libraries and frameworks have been developed to provide researchers and developers with convenient ways to rapidly build deep learning systems. The first of these was Torch, a modular machine learning software library [1]. In 2008, the MILA laboratory at the University of Montreal, led by Yoshua Bengio, released the Theano deep learning framework [2]. Its conceptual organisation has been adopted by subsequent deep learning frameworks: Python is used as the front-end language, while C, CUDA, and other languages are used as back-end languages for acceleration, and computational graphs (also called dataflow graphs) provide a bridge between them. Later, several other frameworks were proposed, including Caffe [3], TensorFlow [4], and PyTorch [5]. Theano ceased to be maintained in 2017, and Caffe was merged into PyTorch in 2018. TensorFlow and PyTorch are thus the two main frameworks currently used for deep learning; about 70% of CVPR 2020 deep learning papers used PyTorch. Over time, these frameworks have evolved to provide many new features that have become very popular and widely used. These include hardware acceleration, automatic differentiation,