Knowledge and skills extraction from the job requirements texts


Nikolaev Ivan E.


An analysis of vacancy requirements in the labor market shows that they are multi-level language constructions of several words with complex semantic relationships. The aim of this research is to develop a method for extracting short knowledge and skill phrases from job requirement texts that have a complex organizational structure. The method supplements the structure of complex sentences with new relationships, using a BERT neural network model trained on the texts of online vacancies, and then moves from the complex text to a set of simple word combinations. The additional training (fine-tuning) of BERT neural network models from the Sberbank AI laboratory on the texts of online vacancies is described. Two mechanisms for adding new links between requirement words, taking into account knowledge from the subject area, are implemented: a linear one and one that augments the parse tree. In the experiment, several combinations of the listed tools were compared. The best result was shown by the combination 'the BERT + deeppavlov_syntax_parser model + a linear method of adding links'. The applicability of the method was demonstrated on a text corpus of online job requirements. The proposed method proved more effective than the rule-based approach, which relies on formal rules and grammar rules for natural language analysis. The method makes it possible to quickly identify key changes in the needs of the labor market at the level of individual knowledge and skills in requirement texts.
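The step of moving from a complex requirement to simple word combinations can be illustrated with a small sketch. The abstract does not specify how the links are added, so everything below is an assumption made for illustration: the `Token` structure, the UD-style relation labels, the hand-written parse of an English example, and the rule that a conjunct inherits the head of the first conjunct. It does not use BERT or deeppavlov_syntax_parser and should not be read as the author's method.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Token:
    idx: int      # 1-based position in the sentence
    text: str
    head: int     # index of the syntactic head, 0 for the root
    deprel: str   # UD-style relation label (assumed labelling scheme)

# Function words that do not form a knowledge/skill combination on their own.
SKIP = {"cc", "case", "punct"}

def add_conj_links(tokens):
    """Return (head_idx, dependent_idx) pairs.

    Ordinary dependencies are kept as they are. A token attached as `conj`
    gets a new link to the head of the first conjunct in its chain, so that
    each coordinated item is tied directly to the governing word.
    """
    by_idx = {t.idx: t for t in tokens}
    links = []
    for t in tokens:
        if t.head == 0 or t.deprel in SKIP:
            continue
        if t.deprel == "conj":
            first = by_idx[t.head]
            # Walk up in case the conjunct is attached to another conjunct.
            while first.deprel == "conj" and first.head != 0:
                first = by_idx[first.head]
            if first.head != 0:
                links.append((first.head, t.idx))
        else:
            links.append((t.head, t.idx))
    return links

def simple_combinations(tokens):
    """Turn the augmented links into short 'head dependent' phrases."""
    by_idx = {t.idx: t for t in tokens}
    links = sorted(add_conj_links(tokens), key=lambda pair: pair[1])
    return [f"{by_idx[h].text} {by_idx[d].text}" for h, d in links]

# Hand-written parse of "knowledge of Python and SQL" (hypothetical input).
example = [
    Token(1, "knowledge", 0, "root"),
    Token(2, "of", 3, "case"),
    Token(3, "Python", 1, "nmod"),
    Token(4, "and", 5, "cc"),
    Token(5, "SQL", 3, "conj"),
]

print(simple_combinations(example))
```

In this toy parse, "SQL" is attached only to "Python"; the added link ties it to "knowledge" as well, so the compound requirement splits into one short phrase per skill. In the setting the abstract describes, the decision about which links to add would presumably come from the fine-tuned model rather than from a fixed rule like this one.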


Samara National Research University


General Medicine







