Affiliation:
1. Bilingual Cognition and Development Lab, Center for Linguistics and Applied Linguistics, Guangdong University of Foreign Studies, Guangzhou, China
Abstract
A central task in empirical and quantitative language studies is the extraction of linguistic constructions important to linguistic theory and application. The great number and variety of such constructions increasingly necessitates computer-assisted extraction, which often proves challenging as it entails a simultaneous analysis of multiple layers of linguistic information latent in large-scale corpora. To address this, we present Constraction, an open-source tool for the automatic extraction and interactive exploration of linguistic constructions from arbitrary textual corpora. Constraction features a generic algorithm that integrates customizable layers of linguistic annotation (e.g., lexical, syntactic, and semantic) to identify constructional patterns of varying sizes and abstraction levels. Its browser-based interface allows users to configure various extraction parameters and enables visual, interactive exploration of the extracted patterns. We demonstrate the utility of Constraction through case studies and discuss its potential applications in language research and pedagogy.
Subject
Linguistics and Language, Language and Linguistics