Applying Collocation Analysis to Chinese Discourse: A Case Study of Causal Connectives

Author:

Wei Yipu (1), Speelman Dirk (2), Evers-Vermeul Jacqueline (3)

Affiliation:

1. School of Chinese as a Second Language, Peking University

2. Research Unit of Quantitative Lexicology and Variational Linguistics, University of Leuven

3. Utrecht Institute of Linguistics OTS, Utrecht University

Abstract

Collocation analysis can be used to extract meaningful linguistic information from large-scale corpus data. This paper reviews the methodological issues one may encounter when performing collocation analysis for discourse studies on Chinese. We propose four crucial aspects to consider in such analyses: (i) the definition of collocates according to various parameters; (ii) the choice of analysis and association measures; (iii) the definition of the search span; and (iv) the selection of corpora for analysis. To illustrate how these aspects can be addressed when applying collocation analysis to Chinese, we conducted a case study of two Chinese causal connectives: yushi ‘that is why’ and yin’er ‘as a result’. The distinctive collocation analysis shows how these two connectives differ in volitionality, an important dimension of discourse relations. The study also demonstrates that collocation analysis, as an explorative approach based on large-scale data, can provide valuable converging evidence for corpus-based studies that have been conducted with laborious manual analysis on limited datasets.
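To make aspects (ii) and (iii) concrete, the following is a minimal sketch of a distinctive collocate comparison for two node words. It is not the authors' pipeline. It assumes the text has already been word-segmented into a token list, uses a symmetric search span, and uses the log-likelihood ratio (G2) as the association measure; the abstract does not say which measure the study adopted. The function names (`collocates`, `g2`, `distinctive_collocates`) and the romanized toy tokens are hypothetical and invented purely for illustration.

```python
import math
from collections import Counter


def collocates(tokens, node, span=3):
    """Count tokens within `span` positions to the left or right of each `node` occurrence."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        lo = max(0, i - span)
        hi = min(len(tokens), i + span + 1)
        for j in range(lo, hi):
            if j != i:
                counts[tokens[j]] += 1
    return counts


def g2(a, b, c, d):
    """Log-likelihood ratio (G2) for the 2x2 contingency table [[a, b], [c, d]]."""
    n = a + b + c + d
    cells = (
        (a, a + b, a + c),
        (b, a + b, b + d),
        (c, c + d, a + c),
        (d, c + d, b + d),
    )
    total = 0.0
    for observed, row_sum, col_sum in cells:
        if observed > 0:  # empty cells contribute nothing to the sum
            total += observed * math.log(observed * n / (row_sum * col_sum))
    return 2 * total


def distinctive_collocates(tokens, node_a, node_b, span=3):
    """Map each collocate to (preferred node, G2 score) when comparing two node words."""
    counts_a = collocates(tokens, node_a, span)
    counts_b = collocates(tokens, node_b, span)
    total_a, total_b = sum(counts_a.values()), sum(counts_b.values())
    results = {}
    for word in set(counts_a) | set(counts_b):
        a, c = counts_a[word], counts_b[word]
        score = g2(a, total_a - a, c, total_b - c)
        # Preference is decided by comparing relative frequencies a/total_a and c/total_b.
        preferred = node_a if a * total_b > c * total_a else node_b
        results[word] = (preferred, score)
    return results


# Invented toy data: romanized placeholders standing in for segmented Chinese text.
toy_tokens = ["he", "decide", "yushi", "leave", ".",
              "heat", "rise", "yin'er", "melt", "."]
toy_result = distinctive_collocates(toy_tokens, "yushi", "yin'er", span=1)
```

On real data the scores are only meaningful with large counts, and the choice of span and measure should follow the considerations the paper discusses rather than the defaults used here.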

Publisher

Walter de Gruyter GmbH

Subject

Ocean Engineering

