Trends in Deep Learning

This chapter discusses some of the trends in deep learning and related fields. We cover which trends might be useful for which tasks, and we discuss some of the methods and ideas that could have far-reaching implications but have yet to be applied to many real-world problems. We finish by briefly covering some of the current limitations of deep learning, other areas of AI that seem to hold promise for future AI applications, and some of the ethical and legal implications of deep learning applications.

Variations on Network Architectures

One of the first trends in the field of deep learning was to build deeper networks with more layers to solve problems of increasing complexity. However, training such deep networks is difficult: they are harder to optimize, and accuracy can degrade rather than improve. As mentioned in Chapter 1, Microsoft released a network architecture in 2015 that builds on the concept of residual learning, called ResNet (He, Zhang, Ren, & Sun, 2015). Instead of trying to learn a direct mapping of the underlying relationship between an input and an output within the network, the difference, or residual, between the two is learned. With this concept, training networks substantially deeper than those used previously became possible, with a 152-layer network winning the 2015 ILSVRC competition on the ImageNet data. A class of networks called Inception networks alternatively focuses on wide architectures in which not all layers are simply stacked sequentially, aiming to increase both the performance and the computational efficiency of neural network models (Szegedy, Liu, et al., 2014).
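To make the residual idea concrete, the following minimal sketch uses PyTorch purely for illustration; the channel sizes and layer choices are assumptions, not the exact building block used in ResNet-152. The stacked convolutions learn a residual F(x), and the block adds the input back through a shortcut connection so that the output is x + F(x).

import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Illustrative residual block: output = ReLU(x + F(x)).

    The convolutions learn the residual F(x) rather than a direct
    mapping from input to output (hypothetical layer sizes).
    """
    def __init__(self, channels):
        super(ResidualBlock, self).__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        residual = self.conv2(self.relu(self.conv1(x)))  # F(x)
        return self.relu(x + residual)                   # shortcut adds x back

# Example: a forward pass on a random batch of 64-channel feature maps.
block = ResidualBlock(channels=64)
out = block(torch.randn(1, 64, 32, 32))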

Note  To accelerate development, practitioners should leverage network architectures from the research community, such as ResNet-152, rather than trying to build and train CNNs from scratch.
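As one way to act on this note, the sketch below loads a ResNet-152 pre-trained on ImageNet using torchvision and adapts it to a hypothetical 10-class task; the class count and the decision to freeze the pre-trained layers are illustrative assumptions, not a prescription.

import torch.nn as nn
import torchvision.models as models

# Load a ResNet-152 with ImageNet weights (downloaded on first use).
model = models.resnet152(pretrained=True)

# Freeze the pre-trained layers so only the new classifier is trained
# (an assumption for this sketch; full fine-tuning is also common).
for param in model.parameters():
    param.requires_grad = False

# Replace the final fully connected layer for a hypothetical 10-class task.
model.fc = nn.Linear(model.fc.in_features, 10)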

Residual Networks and Variants

There have been many suggested network architectures in recent years, and this trend continues to produce more architecture choices. Many architectures rely on modifications of ResNets, such as ResNeXt, MultiResNet, and PolyNet (Abdi & Nahavandi, 2017; Xie, Girshick, Dollár, Tu, & He, 2017; Zhang, Li, Loy, & Lin, 2017). Combining different types of approaches has also been considered, such as Inception-ResNet (Szegedy, Ioffe, & Vanhoucke, 2016). In contrast, FractalNet is an extremely deep architecture that does not rely on residuals (Larsson, Maire, & Shakhnarovich, 2017).

DenseNet

DenseNet is another popular network structure in which each layer is connected to every other layer in a feed-forward fashion; its popularity lies in the fact that it allows a substantial reduction in the number of parameters through feature reuse, while alleviating the vanishing gradient problem that makes deep networks hard to train (G. Huang, Liu, van der Maaten, & Weinberger, 2017).
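The dense connectivity pattern can be sketched as follows, again in PyTorch for illustration; the growth rate and layer composition are assumptions rather than the published DenseNet configuration. Each layer receives the channel-wise concatenation of all preceding feature maps, which is what enables feature reuse with relatively few parameters.

import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    """Illustrative dense block: every layer sees all earlier feature maps."""
    def __init__(self, in_channels, growth_rate, num_layers):
        super(DenseBlock, self).__init__()
        self.layers = nn.ModuleList()
        for i in range(num_layers):
            channels = in_channels + i * growth_rate
            self.layers.append(nn.Sequential(
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels, growth_rate, kernel_size=3,
                          padding=1, bias=False),
            ))

    def forward(self, x):
        features = [x]
        for layer in self.layers:
            out = layer(torch.cat(features, dim=1))  # reuse all earlier features
            features.append(out)
        return torch.cat(features, dim=1)

# Example: 16 input channels, growth rate 12, 4 layers -> 16 + 4 * 12 = 64 output channels.
block = DenseBlock(in_channels=16, growth_rate=12, num_layers=4)
out = block(torch.randn(1, 16, 32, 32))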
