Generating data as a proxy for unavailable corpus data: the contextualized sentence completion task

File version
Version of Record (VoR)
Author(s)
Ford, Marilyn
Bresnan, Joan
Year published
2015
Abstract
There is much interest in using large corpora to explore predictors of the probability of higher level linguistic structures, but suitable corpora are not available for all languages and their varieties. We explore a task that uses discourse contexts from an existing corpus as prompts for sentence completion to investigate the usefulness of the method for generating data as a proxy for unavailable corpus data. Mini databases of dative and genitive structures were obtained with the method using American and Australian participants. It is shown that the databases are indeed a good proxy for corpus data.
Journal Title
Corpus Linguistics and Linguistic Theory
Volume
11
Issue
1
Copyright Statement
© 2015 Walter de Gruyter & Co. KG Publishers. The attached file is reproduced here in accordance with the copyright policy of the publisher. Please refer to the journal's website for access to the definitive, published version.
Subject
Linguistics not elsewhere classified
Artificial Intelligence and Image Processing
Cognitive Sciences
Linguistics