Multiword Structures in Different Materials, and with Different Goals and Methodologies

Following an overview of early frequency-based research on recurring word combinations and patterns, three current methods within SLA focusing on spoken and written production are presented. Studies within each of these methodological paradigms are compared



1 Introduction

Combinations of words that fulfill specific functions have come to be known by the term ‘formulaic language’. Instantiations of formulaic language have been referred to by different names in the literature, such as formulaic sequences, multiword units, and prefabricated patterns. We have opted for the term multiword structures because these instantiations typically have identifiable structural characteristics, e.g. phrases as in do a jigsaw puzzle and full-length clauses as in Could you give me a hand?, or functional characteristics, e.g. hesitation markers such as sort of and I guess, and discourse markers as in so you’re saying (that). Research on formulaic language has been performed under different conditions and with different goals. Some approaches focus on specific multiword structures, while others use holistic methods, scanning entire texts for multiword structures. Three current methods are presented and compared with respect to qualitative aspects such as size of material, amount of manual work involved, and control of task, topic and discipline. This is followed by a presentation of a small-scale study in which two of these methods, one automatic and one manual, are applied to the same material of L1 and L2 English and Spanish. Both methods have been developed for the analysis of entire texts. In the review of the literature we will refer to all instantiations of formulaic language as multiword structures (henceforth MWSs), except where otherwise indicated.

B. Erman (*) • M. Lewis
Department of English, Stockholm University, Stockholm, Sweden
e-mail: [email protected]

L. Fant
Department of Spanish and Portuguese, Stockholm University, Stockholm 10691, Sweden

J. Romero-Trillo (ed.), Yearbook of Corpus Linguistics and Pragmatics 2013: New Domains and Methodologies, Yearbook of Corpus Linguistics and Pragmatics 1, DOI 10.1007/978-94-007-6250-3_5, © Springer Science+Business Media Dordrecht 2013


Corpora have been collected for different purposes and serve different functions. The review of the literature below will mainly concern studies of native and non-native spoken and written production within SLA research. We discuss the studies according to (1) the size of the corpus, (2) the selection of MWSs, and (3) the methodology used. Following a brief overview of the methodology of early frequency-based research on large corpora, studies representing three current methods within SLA are presented. The first of these, the phraseological method, involves studies of specifically selected MWSs or types of MWSs in speech and writing, based on smaller specialized corpora of native and non-native speaker production. We then present two methods which involve no pre-selection of items to be studied but are applied to entire texts, namely the lexical bundle method and the comprehensive method. Finally, we describe a small-scale empirical study in which these two methods are applied to the same material, i.e. the spoken production of advanced non-native Swedish speakers of English and Spanish