The Database of Cross-Linguistic Colexifications, reproducible analysis of cross-linguistic polysemies
Published:2020-01-13
Issue:1
Volume:7
Page:
ISSN:2052-4463
Container-title:Scientific Data
language:en
Short-container-title:Sci Data
Author:
Rzymski Christoph, Tresoldi Tiago, Greenhill Simon J., Wu Mei-Shin, Schweikhard Nathanael E., Koptjevskaja-Tamm Maria, Gast Volker, Bodt Timotheus A., Hantgan Abbie, Kaiping Gereon A., Chang Sophie, Lai Yunfan, Morozova Natalia, Arjava Heini, Hübler Nataliia, Koile Ezequiel, Pepper Steve, Proos Mariann, Van Epps Briana, Blanco Ingrid, Hundt Carolin, Monakhov Sergei, Pianykh Kristina, Ramesh Sallona, Gray Russell D., Forkel Robert, List Johann-Mattis
Abstract
Advances in computer-assisted linguistic research have been greatly influential in reshaping linguistic research. With the increasing availability of interconnected datasets created and curated by researchers, more and more interwoven questions can now be investigated. Such advances, however, place high demands on rigor in preparing and curating datasets. Here we present the Database of Cross-Linguistic Colexifications (CLICS). CLICS tackles interconnected interdisciplinary research questions about the colexification of words across semantic categories in the world’s languages, and showcases best practices for preparing data for cross-linguistic research. This is done by addressing shortcomings of an earlier version of the database, CLICS2, and by supplying an updated version, CLICS3, which massively increases the size and scope of the project. We provide tools and guidelines for this purpose and discuss insights resulting from organizing student tasks for database updates.
Publisher
Springer Science and Business Media LLC
Subject
Library and Information Sciences; Statistics, Probability and Uncertainty; Computer Science Applications; Education; Information Systems; Statistics and Probability
Cited by
72 articles.