Knowledge and skills extraction from the job requirements texts

Author:

Nikolaev Ivan E.ORCID

Abstract

The analysis of the requirements for vacancies in the labor market shows that they are multi-level language constructions of several words with complex semantic relationships. The aim of the research is to develop a method for extracting short texts of knowledge and skills from texts of job requirements that have a complex organizational structure. The method consists in supplementing the structure of complex sentences with new relationships by means of a BERT neural network model trained on the texts of online vacancies and moving from a complex text to a set of simple word combinations. The process of additional training (finetunig) of BERT neural network models from the Sberbank AI laboratory on the texts of online vacancies is shown. Two mechanisms for adding new links between requirements words, taking into account knowledge from the subject area, are implemented: linear and through the addition of the parsing tree. In the course of the experiment, a comparative analysis was carried out for several combinations of the listed tools. The combination that showed the best result was 'the BERT + deeppavlov_syntax_parser model + a linear method of adding links'. The applicability of the method was demonstrated on the text corpus of online job requirements. The proposed method has shown higher efficiency than the rule-based approach, which involves the use of formal rules and grammar rules for natural language analysis. Using the method will allow you to quickly identify the key changes in the needs of the labor market at the level of requirements texts of individual knowledge and skills.

Publisher

Samara National Research University

Subject

General Medicine

Reference20 articles.

1. ESCO is a multilingual classification of European skills, competencies and occupations. https://esco.ec.europa.eu/en.

2. Burtsev M, Ahn L. Deep neural network model for the problem of named objects recognition. International Journal of Machine Learning and Computing. 2019.

3. Maslova MA, Dmitriev AS, Kholkin DO. Named Entity Recognition Methods in Russian language: Don Engineering Bulletin, 2021. No. 7(79).

4. Khakimova EM. Compound sentences in modern Russian language: the orthological aspect. Bulletin of SUSU. Series: Linguistics. 2013.

5. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez A, Polosukhin I. Attention is all you need. In: Advances in neural information processing systems. 2017. 5998-6008.

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3