Using a Technology for Identification of Semantically Connected Text Elements to Determine a Common Information Space
S. V. Petrasova and N. F. Khairova
UDC 004.912
Abstract. A technology is proposed for determining the common information space of social network actors by identifying semantically equivalent collocations in texts. The technology comprises a model for the formal description of the semantic and grammatical characteristics of collocates, the identification of collocations, and the determination of a semantic equivalence predicate for two-word collocations.

Keywords: semantic connectivity, information space, semantic and grammatical characteristic, semantic equivalence predicate, collocate, collocation.

INTRODUCTION

Social networks, forums, and blogs, which are basic objects of the modern information society, are becoming an important factor in the formation of an information space. The establishment and development of social relations in an information society are objective factors that are practically independent of a person's individual characteristics. Different types of contacts (spatial, social, and informational) are simultaneously components of social relations and stages of their formation. Global information networks have become both the environment and the instrument for forming the information spaces of individual persons and of stable social groups built on mutual interests.

In the general case, an information space is a product of human intellectual activity that integrates information resources with the technologies for maintaining and applying them, all functioning on unified principles in order to satisfy the information needs of users [1]. At present, an information society is assessed not only by its information but also by efficient communication [2], which is carried out by establishing common information spaces of actors, i.e., subjects (individuals, social groups, organizations, and institutions) that perform actions directed toward other actors.
The establishment of such spaces has real commercial and social value, for example, in developing advertising aimed at a target audience. Because an information community changes continuously, the universality and heterogeneity of its information space are supplemented by continuous dynamism. Therefore, to adequately form the information spaces of social communities, it is necessary to raise the level of automation of text processing, including solving the problems of semantic processing of resources that represent specific information about individual actors [3]. Such text information items include, for example, a person's own information about his or her domains of interest, existing contacts, and the in-demand topics mentioned in blog and forum messages. Determining the equivalence and identity of actors' text data with the help of Natural Language Processing approaches makes it possible to extract common information spaces of specific social groups on the