Abstract
Named entity recognition is a fundamental task that underpins many tasks and applications in natural language processing. In the age of Internet information, a huge proportion of the information handled by computer applications is stored in structured and unstructured forms and used for language and text processing. Before neural networks were widely adopted for natural language processing, research on named entity recognition usually focused on leveraging lexical and syntactic knowledge to improve the performance of models or methods. To promote the development of the field, researchers have for many years created named entity recognition datasets through conferences, projects, and competitions, each driven by its own research goals, and have trained increasingly accurate entity recognition models on them. However, the datasets themselves have received little study. In particular, many datasets have appeared since the named entity recognition task was introduced, but there is no clear framework that summarizes the development of these seemingly independent datasets. A closer look at the context in which each dataset was developed and at the features it contains reveals that these datasets share common features to varying degrees. In this thesis, we review the development of named entity recognition datasets over the years and describe them in terms of language, research domain, entity type, entity granularity, and entity annotation. Finally, we propose an idea for the creation of subsequent named entity recognition datasets.
Funder
National Key Laboratory for Complex Systems Simulation Foundation
Publisher
Springer Science and Business Media LLC