Getting rid of the Chi-square and Log-likelihood tests for analysing vocabulary differences between corpora

Published:2018-01-07 Issue:22 Volume:22 Page:33
ISSN:2444-1449
Container-title:Quaderns de Filologia - Estudis Lingüístics
Short-container-title:QF ELING

Author:

Bestgen Yves

Abstract

Log-likelihood and Chi-square tests are probably the most popular statistical tests used in corpus linguistics, especially when the research aims to describe the lexical variations between corpora. However, because this specific use of the Chi-square test is not valid, it produces far too many significant results. This paper explains the source of the problem (i.e., the non-independence of the observations), the reasons why the usual solutions are not acceptable, and which kinds of statistical test should be used instead. A corpus analysis conducted on the lexical differences between American and British English is then reported, in order to demonstrate the problem and to confirm the adequacy of the proposed solution. The last section presents the commands that can be used with WordSmith Tools, a very popular software package for corpus processing, to obtain the necessary data for the adequate tests, as well as a very easy-to-use procedure in R, a free and easy-to-install statistical software package, that performs these tests.

Publisher

Universitat de Valencia

Subject

Linguistics and Language, Language and Linguistics

Cited by 23 articles.

1. Insights from lexical and syntactic analyses of a French for academic purposes assessment;Assessing Writing;2023-10

2. Using corpus methods to analyze modal verbs in government science communication on Twitter;Research Methods in Applied Linguistics;2023-04

3. LBiaP;International Journal of Corpus Linguistics;2023-02-23

4. Science advocacy in political rhetoric and actions;Environment Systems and Decisions;2022-08-19

5. A Corpus Study of Lexical Bundles Used Differently in Dissertations Abstracts Produced by Chinese and American PhD Students of Linguistics;Frontiers in Psychology;2022-06-30