Generating data as a proxy for unavailable corpus data: the contextualized sentence completion task
There is much interest in using large corpora to explore predictors of the probability of higher-level linguistic structures, but suitable corpora are not available for all languages and their varieties. We explore a task that uses discourse contexts from an existing corpus as prompts for sentence completion, in order to investigate how useful the method is for generating data as a proxy for unavailable corpus data. Mini-databases of dative and genitive structures were obtained with the method from American and Australian participants. It is shown that the databases are indeed a good proxy for corpus data.
Corpus Linguistics and Linguistic Theory
Copyright 2015 Walter de Gruyter & Co. KG Publishers. The attached file is reproduced here in accordance with the copyright policy of the publisher. Please refer to the journal's website for access to the definitive, published version.
Linguistics not elsewhere classified