ScholarLens: extracting competences from research publications for the automatic generation of semantic user profiles

Author:

Sateli Bahar (1), Löffler Felicitas (2), König-Ries Birgitta (2), Witte René (1)

Affiliation:

1. Semantic Software Lab, Department of Computer Science and Software Engineering, Concordia University, Montreal, Quebec, Canada

2. Heinz-Nixdorf-Chair for Distributed Information Systems, Department of Mathematics and Computer Science, Friedrich Schiller University Jena, Jena, Germany

Abstract

Motivation: Scientists increasingly rely on intelligent information systems to help them in their daily tasks, in particular for managing research objects such as publications or datasets. The relatively young research field of Semantic Publishing addresses the question of how scientific applications can be improved through semantically rich representations of research objects, in order to facilitate their discovery and re-use. To complement the efforts in this area, we propose an automatic workflow to construct semantic user profiles of scholars, so that scholarly applications, such as digital libraries or data repositories, can better understand their users’ interests, tasks, and competences by incorporating these user profiles in their design. To make the user profiles sharable across applications, we propose to build them on standard semantic web technologies, in particular the Resource Description Framework (RDF) for representing user profiles and Linked Open Data (LOD) sources for representing competence topics. To avoid the cold start problem, we suggest populating these profiles automatically by analyzing the publications (co-)authored by users, which we hypothesize reflect their research competences.

Results: We developed a novel approach, ScholarLens, which can automatically generate semantic user profiles for authors of scholarly literature. For modeling the competences of scholarly users and groups, we surveyed a number of existing linked open data vocabularies. In accordance with LOD best practices, we propose an RDF Schema (RDFS) based model for competence records that reuses existing vocabularies where appropriate. To automate the creation of semantic user profiles, we developed a complete workflow that generates them by analyzing full-text research articles through various natural language processing (NLP) techniques. In our method, we start by processing a set of research articles for a given user. Competences are derived by text mining the articles, including syntactic, semantic, and LOD entity linking steps. We then populate a knowledge base in RDF format with user profiles containing the extracted competences. We implemented our approach as an open source library and evaluated our system through two user studies, resulting in a mean average precision (MAP) of up to 95%. As part of the evaluation, we also analyze the impact of semantic zoning of research articles on the accuracy of the resulting profiles. Finally, we demonstrate how these semantic user profiles can be applied in a number of use cases, including article ranking for personalized search and finding scientists competent in a topic, e.g., to find reviewers for a paper.

Availability: All software and datasets presented in this paper are available under open source licenses in the supplements and documented at http://www.semanticsoftware.info/semantic-user-profiling-peerj-2016-supplements. Additionally, development releases of ScholarLens are available on our GitHub page: https://github.com/SemanticSoftwareLab/ScholarLens.
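The mean average precision (MAP) figure quoted in the abstract can be illustrated with a small sketch. This is not the authors' evaluation code. It assumes each generated profile is a ranked list of competence topics, that a study participant has judged each topic as relevant or not, and that average precision is normalized by the number of relevant topics found in the list. The function names and the sample judgments are hypothetical.

```python
from typing import Sequence


def average_precision(relevance: Sequence[bool]) -> float:
    """Average precision for one ranked list.

    relevance[i] is True when the topic at rank i + 1 was judged relevant.
    Assumption: we normalize by the number of relevant topics in the list.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            # Precision at this rank, counted only at relevant positions.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings: Sequence[Sequence[bool]]) -> float:
    """Mean of the per-profile average precision values."""
    if not rankings:
        return 0.0
    return sum(average_precision(r) for r in rankings) / len(rankings)


# Hypothetical relevance judgments for two user profiles (not study data).
judgments = [
    [True, True, False, True],
    [False, True, True],
]
map_score = mean_average_precision(judgments)
print(f"MAP = {map_score:.2f}")
```

A score near 1 means relevant topics are concentrated at the top of each profile, so the metric rewards good ranking and not only good extraction.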

Publisher

PeerJ

Subject

General Computer Science

Cited by 13 articles.

1. Future Horizons;Advances in Educational Technologies and Instructional Design;2024-04-19

2. Reinforcement Learning for Expert Finding from Web Search Results;Studies in Computational Intelligence;2024

3. Conceptual model of knowledge management system for scholarly publication cycle in academic institution;VINE Journal of Information and Knowledge Management Systems;2022-12-08

4. Context injection in expert finding;Brazilian Symposium on Multimedia and Web;2022-11-07

5. Ontology-Based Linked Data to Support Decision-Making within Universities;Mathematics;2022-09-02
