Bilinear pyramid network for flower species categorization


Cheng Pang1 · Wenhao Wang1 · Rushi Lan1 · Zhuo Shi1 · Xiaonan Luo1

Received: 26 September 2019 / Revised: 14 June 2020 / Accepted: 20 August 2020 / © Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract It is challenging to distinguish between numerous species of flowers because of their visual similarities and their variations in pose and structure. By properly modeling local feature interactions, bilinear CNN has succeeded in classifying many non-rigid fine-grained species, including flowers. However, bilinear CNN computes its features in a straightforward way, without exploring the interactions between features from multiple layers of the network. In this paper, we present a novel Bilinear Pyramid Network (BPN) for flower categorization. Instead of passing through the network and directly feeding the final classifier, features from a convolutional layer are resized and multiplied with those from the former layer; this step is repeated multiple times to generate prediction vectors using features from distinct layers. These features, encoded from the feature pyramid, naturally carry multi-level semantic cues and therefore yield stronger discriminative power than single-layer features. Experiments show that the proposed network obtains superior classification results on a challenging flower dataset.

Keywords Fine-grained image classification · Fine-grained visual categorization (FGVC) · Image classification · Convolutional neural network (CNN) · Deep learning
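The cross-layer step described in the abstract (resize the features of one convolutional layer, then multiply them with those of the former layer) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names are hypothetical, and the nearest-neighbour resizing, the averaging over spatial locations, and the signed square root with L2 normalization (borrowed from common bilinear CNN practice) are all assumptions made for illustration.

```python
import numpy as np


def resize_nearest(feat, out_h, out_w):
    """Nearest-neighbour resize of a (C, H, W) feature map (assumed resize method)."""
    _, h, w = feat.shape
    rows = np.arange(out_h) * h // out_h  # source row for each output row
    cols = np.arange(out_w) * w // out_w  # source column for each output column
    return feat[:, rows[:, None], cols[None, :]]


def bilinear_pool(a, b):
    """Average the outer products of two (C, H, W) maps over all spatial locations."""
    ca, h, w = a.shape
    cb = b.shape[0]
    a2 = a.reshape(ca, h * w)
    b2 = b.reshape(cb, h * w)
    pooled = (a2 @ b2.T) / (h * w)             # (ca, cb) channel interaction matrix
    vec = pooled.reshape(-1)
    vec = np.sign(vec) * np.sqrt(np.abs(vec))  # signed square root (assumed)
    return vec / (np.linalg.norm(vec) + 1e-12) # L2 normalization (assumed)


def cross_layer_features(feature_maps):
    """One bilinear vector per adjacent (former, later) pair of layers."""
    vectors = []
    for former, later in zip(feature_maps[:-1], feature_maps[1:]):
        _, h, w = former.shape
        resized = resize_nearest(later, h, w)  # match the former layer's spatial size
        vectors.append(bilinear_pool(resized, former))
    return vectors


# Toy stand-ins for feature maps taken from three successive layers.
rng = np.random.default_rng(0)
maps = [rng.standard_normal((4, 8, 8)),
        rng.standard_normal((6, 4, 4)),
        rng.standard_normal((8, 2, 2))]
vectors = cross_layer_features(maps)
```

In this sketch each vector in `vectors` would feed its own classifier to produce one prediction vector per layer pair; how those predictions are combined is left open here.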

1 Introduction

Fine-grained visual categorization refers to differentiating species within the same basic-level category (e.g., species of dogs and models of cars) [4–7, 10, 12, 19, 30, 33]. Unlike basic-level categories, fine-grained species present extremely high visual similarities and large pose variations, and are often very numerous. As a result, classifying fine-grained species with large intra-class variations and small between-class differences is challenging.

 Rushi Lan

[email protected]

1 School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin, 541004, China

Multimedia Tools and Applications

Research in cognitive science indicates that basic-level categories are distinguished by differences in their body parts, while fine-grained species are differentiated by the different properties of the same parts [24]. Following this hypothesis, many part-based methods have succeeded in classifying rigid objects like birds, dogs and cars. In [15], an HSnet is proposed in which the classification problem is formulated as a sequential search for informative parts over a deep feature map produced by a CNN. In [29], part matching is introduced into the traditional Bag-of-Words pipeline to infer discriminative foregrounds of the objects and eliminate background noise. In [32], multiscale part proposals are generated from object proposals, and then filtered to form the