Abstract
This study applies relative entropy in naturalistic large-scale corpus to calculate the difference among L2 (second language) learners at different levels. We chose lemma, token, POS-trigram, conjunction to represent lexicon and grammar to detect the patterns of language proficiency development among different L2 groups using relative entropy. The results show that information distribution discrimination regarding lexical and grammatical differences continues to increase from L2 learners at a lower level to those at a higher level. This result is consistent with the assumption that in the course of second language acquisition, L2 learners develop towards a more complex and diverse use of language. Meanwhile, this study uses the statistics method of time series to process the data on L2 differences yielded by traditional frequency-based methods processing the same L2 corpus to compare with the results of relative entropy. However, the results from the traditional methods rarely show regularity. As compared to the algorithms in traditional approaches, relative entropy performs much better in detecting L2 proficiency development. In this sense, we have developed an effective and practical algorithm for stably detecting and predicting the developments in L2 learners’ language proficiency.
Funder
H2020 European Research Council
Subject
General Physics and Astronomy
Cited by
2 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献