Abstract
AbstractOccupational data mining and analysis is an important task in understanding today’s industry and job market. Various machine learning techniques are proposed and gradually deployed to improve companies’ operations for upstream tasks, such as employee churn prediction, career trajectory modelling and automated interview. Job titles analysis and embedding, as the fundamental building blocks, are crucial upstream tasks to address these occupational data mining and analysis problems. A relevant occupational job title dataset is required to accomplish these tasks and towards that effort, we present the Industrial and Professional Occupations Dataset (IPOD). The IPOD dataset contains over 475,073 job titles based on 192,295 user profiles from a major professional networking site. To further facilitate these applications of occupational data mining and analysis, we proposeTitle2vec, a contextual job title vector representation using a bidirectional Language Model approach. To demonstrate the effectiveness ofTitle2vec, we also define an occupational Named Entity Recognition (NER) task and proposed two methods based on Conditional Random Fields (CRF) and bidirectional Long Short-Term Memory with CRF (LSTM-CRF). Using a large occupational job title dataset, experimental results show that both CRF and LSTM-CRF outperform human and baselines in both exact-match accuracy and F1 scores. The dataset and pre-trained embeddings have been made publicly available athttps://www.github.com/junhua/ipod.
Funder
Singapore University of Technology and Design
Publisher
Springer Science and Business Media LLC
Subject
Information Systems and Management,Computer Networks and Communications,Hardware and Architecture,Information Systems
Reference68 articles.
1. James C, Pappalardo L, Sîrbu A, Simini F. Prediction of next career moves from scientific profiles. arXiv preprint. 2018. arXiv:1802.04830.
2. Yang Y, Zhan D-C, Jiang Y. Which one will be next? An analysis of talent demission. 2018.
3. Zhao Y, Hryniewicki MK, Cheng F, Fu B, Zhu X. Employee turnover prediction with machine learning: a reliable approach. In: Proceedings of SAI intelligent systems conference. Springer;2018. p. 737–58.
4. Liu Y, Zhang L, Nie L, Yan Y, Rosenblum DS. Fortune teller: predicting your career path. In: Thirtieth AAAI conference on artificial intelligence. 2016.
5. Mimno D, McCallum A. Modeling career path trajectories. Citeseer; 2008.
Cited by
5 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献