Semi-supervised deep learning based named entity recognition model to parse education section of resumes

Published:2020-09-18 Issue:11 Volume:33 Page:5705-5718
ISSN:0941-0643
Container-title:Neural Computing and Applications
language:en
Short-container-title:Neural Comput & Applic

Author:

Gaur Bodhvi, Saluja Gurpreet Singh, Sivakumar Hamsa Bharathi, Singh Sanjay

Abstract

A job seeker’s resume contains several sections, including educational qualifications. Educational qualifications capture the knowledge and skills relevant to the job. Machine processing of the education sections of resumes has been a difficult task. In this paper, we attempt to identify the names of educational institutions and degrees from a resume’s education section. Neural network-based named entity recognition techniques usually require a significant amount of annotated data. We use a semi-supervised approach to overcome the lack of a large annotated dataset. We trained a deep neural network model on an initial (seed) set of resume education sections. This model is used to predict entities in unlabeled education sections, and its predictions are rectified using a correction module. The education sections containing the rectified entities are added to the seed set. The updated seed set is used for retraining, leading to better accuracy than the previously trained model. In this way, the approach can provide high overall accuracy without the need for a large annotated dataset. Our model has achieved an accuracy of 92.06% on the named entity recognition task.
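
The train, predict, correct, augment, retrain cycle described in the abstract can be sketched roughly as below. This is a minimal illustration only, not the authors' implementation: the function `self_train`, the lookup-based `toy_train` tagger (standing in for the deep neural network), the lexicon-based `toy_correct` rule (standing in for the correction module), the `DEGREE`/`INST`/`O` tag names, and the sample sentences are all hypothetical.

```python
from collections import Counter
from typing import Callable, List, Tuple

Tokens = List[str]
Tags = List[str]
Example = Tuple[Tokens, Tags]
Tagger = Callable[[Tokens], Tags]


def self_train(
    seed: List[Example],
    unlabeled: List[Tokens],
    train: Callable[[List[Example]], Tagger],
    correct: Callable[[Tokens, Tags], Tags],
    rounds: int = 3,
    batch_size: int = 2,
) -> Tagger:
    """Grow the seed set with corrected predictions and retrain each round."""
    labeled = list(seed)
    pool = list(unlabeled)
    model = train(labeled)
    for _ in range(rounds):
        if not pool:
            break
        batch, pool = pool[:batch_size], pool[batch_size:]
        for tokens in batch:
            predicted = model(tokens)               # predict on unlabeled text
            fixed = correct(tokens, predicted)      # rectify the predictions
            labeled.append((tokens, fixed))         # augment the seed set
        model = train(labeled)                      # retrain on the larger set
    return model


def toy_train(examples: List[Example]) -> Tagger:
    """Hypothetical stand-in model: remember the most frequent tag per token."""
    counts = {}
    for tokens, tags in examples:
        for tok, tag in zip(tokens, tags):
            counts.setdefault(tok.lower(), Counter())[tag] += 1
    lookup = {tok: c.most_common(1)[0][0] for tok, c in counts.items()}

    def tagger(tokens: Tokens) -> Tags:
        return [lookup.get(t.lower(), "O") for t in tokens]

    return tagger


# Hypothetical correction rule: force known degree strings to the DEGREE tag.
DEGREES = {"b.tech", "m.tech", "phd"}


def toy_correct(tokens: Tokens, tags: Tags) -> Tags:
    return ["DEGREE" if t.lower() in DEGREES else tag
            for t, tag in zip(tokens, tags)]


seed = [(["B.Tech", "from", "Manipal", "University"],
         ["DEGREE", "O", "INST", "INST"])]
unlabeled = [["M.Tech", "from", "Delhi", "University"]]

seed_only = toy_train(seed)
model = self_train(seed, unlabeled, toy_train, toy_correct)
print(seed_only(["M.Tech", "University"]))
print(model(["M.Tech", "University"]))
```

In this toy run, the seed-only tagger has never seen "M.Tech" and leaves it untagged, while the retrained tagger picks it up from the corrected example. The real system would replace both stand-ins with a trained network and a more careful correction step.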

Funder

Manipal Academy of Higher Education, Manipal

Publisher

Springer Science and Business Media LLC

Subject

Artificial Intelligence,Software

Link

https://link.springer.com/content/pdf/10.1007/s00521-020-05351-2.pdf

Reference: 39 articles.

1. Ayishathahira C, Sreejith C, Raseek C (2018) Combination of neural networks and conditional random fields for efficient resume parsing. In: 2018 International CET conference on control, communication, and computing, IC4 2018, pp 388–393. Kerala, India. https://doi.org/10.1109/CETIC4.2018.8530883

2. Babar N (2017) The Levenshtein algorithm–Cuelogic. Available: January 25, 2017. https://www.cuelogic.com/blog/the-levenshtein-algorithm. (Online; accessed 7-August-2019)

3. Bird S (2018) Natural language toolkit (nltk). https://github.com/nltk/nltk. Commit: 199c30c5cb5dbb46f5931c9d5b926617bc6a588f

4. Bird S, Klein E, Loper E (2009) Natural language processing with python: analyzing text with the natural language toolkit. O’Reilly Media, Sebastopol

5. Burr T, Skurikhin A (2015) Conditional random fields for pattern recognition applied to structured data. Algorithms 8(3):466–483. https://doi.org/10.3390/a8030466

Cited by 25 articles.

1. Deep learning perspective on the construction of SPOC teaching model of music and dance in colleges and universities;Systems and Soft Computing;2024-12

2. Enhancing the hiring process: A predictive system for soft skills assessment;Data and Metadata;2024-09-02

3. Extracting section structure from resumes in Brazilian Portuguese;Expert Systems with Applications;2024-05

4. The power of a name: Exploring the relationship between ICO name fluency and investor decision making;International Review of Financial Analysis;2024-05

5. BiasEye: A Bias-Aware Real-time Interactive Material Screening System for Impartial Candidate Assessment;Proceedings of the 29th International Conference on Intelligent User Interfaces;2024-03-18