Affiliation:
1. Zhengzhou University
2. Chinese Academy of Sciences
Abstract
Semantic folding theory (SFT) is an emerging cognitive science theory that aims to explain how the human brain processes and organizes semantic information; distributing text across semantic grids is central to SFT. We propose the Sentence-Level Semantic Division Baseline with 100 grids (SSDB-100), the only dataset we are currently aware of that validates sentence-level semantic folding algorithms, to test the validity of text distribution in semantic grids. In this article, we describe the construction of SSDB-100. First, a semantic division questionnaire with broad coverage was generated by limiting the uncertainty range of the topics and the corpus. Then, through an expert survey, 11 human experts gave us feedback. Finally, we analyzed and processed the feedback; after eliminating invalid feedback, the average consistency index of the retained feedback was 0.856. With 100 semantic grids and 3215 sentences, SSDB-100 is suitable not only for verifying semantic folding algorithms but also for text clustering tasks.
Publisher
Research Square Platform LLC