Filaments of Meaning in Word Space

Abstract. Word space models, in the sense of vector space models built on distributional data taken from texts, are used to model semantic relations between words. We argue that the high dimensionality of typical vector space models leads to unintuitive effects on modeling likeness of meaning and that the local structure of word spaces is where interesting semantic relations reside. We show that the local structure of word spaces has substantially different dimensionality and character than the global space, and that this structure shows potential to be exploited for further semantic analysis using methods for local analysis of vector space structure, rather than the globally scoped methods typically in use today, such as singular value decomposition or principal component analysis.

1 Vector Space Models

Vector space models are frequently used in information access, both for research experiments and as a building block for systems in practical use. There are numerous implementations of methods for modeling topical variation in text using vector spaces. These and related methods are used for information access or knowledge organisation at various levels of abstraction, all more or less based on quasi-geometric interpretations of distributional data of words in documents.

Vector space models in various forms have been implicit in information retrieval practice at least since the early 1970s, and their origin has usually been attributed to the work of Gerard Salton. His 1975 paper titled “A vector space model for automatic indexing” [1], often cited as the first vector space model, does not in fact make heavy use of vector spaces, but in his later publications the processing model was given more prominence as a convenient tool for topical modeling (see e.g. Dubin for a survey [2]). The vector space model has since become a staple in information retrieval experimentation and implementation.

Distributional data collected from observation of linguistic data can be modeled in many ways, yielding probabilistic language models as well as vector space models. Vector space models have attractive qualities: processing vector spaces is a manageable implementational framework, they are mathematically well-defined and understood, and they are intuitively appealing, conforming to everyday metaphors such as “near in meaning”. In this way, vector spaces can be interpreted as a model of meaning, as semantic spaces.

J. Karlgren, A. Holst, and M. Sahlgren. In C. Macdonald et al. (Eds.): ECIR 2008, LNCS 4956, pp. 531–538, 2008. © Springer-Verlag Berlin Heidelberg 2008

In this sense, the term “word space” was first introduced by Hinrich Schütze: “Vector similarity is the only information present in Word Space: semantically related words are close, unrelated words are distant” [3]. While there is some precedent to this definition in linguistic and philosophical literature, none of the classic claims in fact give license to construct spatial models of meaning: to do so, we first must examine how the model we build in fact preserves and represent