An Image Recognition Method Based on Scene Semantics
Y. Zhang and Y. Ma

Abstract To address image recognition involving both scenes and objects, we propose an image recognition method based on scene semantics (IRMSS). In IRMSS, the landmark objects of various scenes are first collected and labeled to form a feature database named the Symbolic Objects Database; second, the remaining objects in the image are identified one by one according to the scene semantics obtained in the previous step; and third, the scene of the image is repeatedly validated and progressively refined using each round of recognition results, forming a feedback system for recognizing image semantics. Finally, simulation experiments showed that IRMSS can markedly improve the accuracy and efficiency of image semantic recognition in scenes with strong semantics.

Keywords Image recognition • Image semantics • Scene semantics
58.1 Introduction

Given massive image data and the rich semantic content of images, the image scene not only conveys the general knowledge of an image but also provides a basis for identifying its other elements. Hence, the key issues are how to make a computer automatically classify images into semantic categories by simulating the human cognitive mechanism of image understanding, and how to identify image objects according to specific scene semantics. According to how images are described, current scene classification methods fall into two categories: those based on global features and those based on the bag-of-words model. Early scene classification is based on global statistical
Y. Zhang (*) · Y. Ma College of Computer and Information Science, Chongqing Normal University, Chongqing 400050, China e-mail: [email protected]
Z. Zhong (ed.), Proceedings of the International Conference on Information Engineering and Applications (IEA) 2012, Lecture Notes in Electrical Engineering 217, DOI: 10.1007/978-1-4471-4850-0_58, © Springer-Verlag London 2013
characteristics of image content to describe the scene; for example, Oliva et al. [1] proposed a set of visual perception attributes such as naturalness, openness, roughness, expansion, and ruggedness. In recent years, the mainstream approach to scene classification expresses image content with the bag-of-words (BOW) model [3, 4]. The basic idea is to first define image blocks with different semantics, called visual words, and then use the frequency of these visual words as the description of the image scene, finding the scene category to which the image most likely belongs by supervised learning. Based on the BOW model, Fei-Fei [5] and Bosch [6] used the latent Dirichlet allocation (LDA) [4] model and the probabilistic latent semantic analysis (pLSA) [5, 6] model to obtain the topics or latent semantics of the image and complete its scene classification; Lazebnik et al. [6] used the spatial pyramid model to extract the spatial distribution of visu
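The BOW pipeline described above (visual words, word-frequency histograms, supervised classification) can be illustrated with a minimal sketch. It is not the implementation of any cited work: the codebook, the 2-D toy descriptors, the scene labels, and all function names are hypothetical, and a nearest-class-mean rule stands in for whatever supervised learner a real system would use. In practice the codebook would typically be learned by clustering local descriptors, which is omitted here.

```python
from collections import defaultdict
from math import dist


def nearest_word(descriptor, codebook):
    """Index of the visual word (codebook entry) closest to a local descriptor."""
    return min(range(len(codebook)), key=lambda i: dist(descriptor, codebook[i]))


def bow_histogram(descriptors, codebook):
    """Normalized visual-word frequency histogram for one image."""
    counts = [0] * len(codebook)
    for d in descriptors:
        counts[nearest_word(d, codebook)] += 1
    total = sum(counts) or 1  # avoid division by zero for an image with no descriptors
    return [c / total for c in counts]


def train_class_means(histograms, labels):
    """Supervised step (assumed stand-in): average the histograms of each scene category."""
    grouped = defaultdict(list)
    for h, label in zip(histograms, labels):
        grouped[label].append(h)
    return {
        label: [sum(col) / len(hs) for col in zip(*hs)]
        for label, hs in grouped.items()
    }


def classify(histogram, class_means):
    """Assign the scene category whose mean histogram is closest."""
    return min(class_means, key=lambda label: dist(histogram, class_means[label]))


# Toy data (hypothetical): 2-D "descriptors" and a 3-word codebook.
CODEBOOK = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
TRAIN_IMAGES = [
    [(0.1, 0.0), (0.9, 0.1), (1.1, -0.1)],   # mostly word 1
    [(1.0, 0.2), (0.8, 0.0), (0.0, 0.1)],    # mostly word 1
    [(0.0, 0.9), (0.1, 1.1), (0.1, 0.1)],    # mostly word 2
    [(-0.1, 1.0), (0.2, 0.8), (0.0, 1.2)],   # only word 2
]
TRAIN_LABELS = ["street", "street", "forest", "forest"]

TRAIN_HISTS = [bow_histogram(img, CODEBOOK) for img in TRAIN_IMAGES]
CLASS_MEANS = train_class_means(TRAIN_HISTS, TRAIN_LABELS)

# Classify a new image from its visual-word frequencies.
QUERY = [(0.9, 0.0), (1.2, 0.1), (0.1, 0.9)]
PREDICTED = classify(bow_histogram(QUERY, CODEBOOK), CLASS_MEANS)
```

The histogram discards where each visual word occurs in the image, which is the limitation that spatial models such as the pyramid approach mentioned above are meant to address.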