Filaments of Meaning in Word Space
Abstract. Word space models, in the sense of vector space models built on distributional data taken from texts, are used to model semantic relations between words. We argue that the high dimensionality of typical vector space models leads to unintuitive effects on modeling likeness of meaning, and that the local structure of word spaces is where interesting semantic relations reside. We show that the local structure of word spaces has substantially different dimensionality and character from the global space, and that this structure shows potential to be exploited for further semantic analysis using methods for local analysis of vector space structure, rather than the globally scoped methods typically in use today, such as singular value decomposition or principal component analysis.
1 Vector Space Models
Vector space models are frequently used in information access, both for research experiments and as a building block for systems in practical use. There are numerous implementations of methods for modeling topical variation in text using vector spaces. These and related methods are used for information access or knowledge organisation at various levels of abstraction, all more or less based on quasi-geometric interpretations of distributional data of words in documents.

Vector space models in various forms have been implicit in information retrieval practice at least since the early 1970s, and their origin has usually been attributed to the work of Gerard Salton. His 1975 paper titled "A vector space model for automatic indexing" [1], often cited as the first vector space model, does not in fact make heavy use of vector spaces, but in his later publications the processing model was given more prominence as a convenient tool for topical modeling (see e.g. Dubin for a survey [2]). The vector space model has since become a staple in information retrieval experimentation and implementation.

Distributional data collected from observation of linguistic data can be modeled in many ways, yielding probabilistic language models as well as vector space models. Vector space models have attractive qualities: vector spaces are manageable to implement and process, they are mathematically well-defined and understood, and they are intuitively appealing, conforming to everyday metaphors such as "near in meaning". In this way, vector spaces can be interpreted as a model of meaning, as semantic spaces. In this sense, the term "word space" was first introduced by Hinrich Schütze: "Vector similarity is the only information present in Word Space: semantically related words are close, unrelated words are distant" [3]. While there is some precedent to this definition
in linguistic and philosophical literature, none of the classic claims in fact gives license to construct spatial models of meaning: to do so, we must first examine how the model we build in fact preserves and represents the distributional data it is built from.
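To make the word-space notion concrete, a word space in its simplest form can be sketched as a word-by-word co-occurrence matrix in which each word is represented by its row of context counts, and likeness of meaning is measured as the cosine between rows. The sketch below is a minimal illustration, assuming a toy corpus and a symmetric context window of two words; the corpus, window size, and names are illustrative choices, not the construction evaluated in this paper.

```python
# A minimal word-space sketch: co-occurrence vectors and cosine similarity.
# Illustrative only; the toy corpus and window size are arbitrary choices.
import numpy as np

corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the cat chased the dog",
]

# Build a vocabulary index.
tokens = [sentence.split() for sentence in corpus]
vocab = sorted({w for sent in tokens for w in sent})
index = {w: i for i, w in enumerate(vocab)}

# Count co-occurrences within a symmetric window of 2 words.
window = 2
counts = np.zeros((len(vocab), len(vocab)))
for sent in tokens:
    for i, w in enumerate(sent):
        for j in range(max(0, i - window), min(len(sent), i + window + 1)):
            if j != i:
                counts[index[w], index[sent[j]]] += 1

def cosine(u, v):
    """Cosine similarity: 'near in meaning' as a small angle between vectors."""
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

# Words that occur in similar contexts receive similar vectors.
print(cosine(counts[index["cat"]], counts[index["dog"]]))  # high
print(cosine(counts[index["cat"]], counts[index["on"]]))   # lower
```

On this toy corpus, "cat" and "dog" occur in near-identical contexts and therefore receive a higher cosine than an unrelated pair, which is precisely the sense in which vector similarity carries semantic information in a word space.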