Affiliation:
1. Hangzhou Dianzi University, Zhejiang Province, China
2. Zhejiang University, Hangzhou, Zhejiang Province, China
3. Southeast University, Nanjing, Jiangsu Province, China
Abstract
Expert finding is crucial for a wealth of applications in both academia and industry. Given a user query and a trove of academic papers, expert finding aims at retrieving the most relevant experts for the query from the academic papers. Existing studies focus on embedding-based solutions that consider academic papers’ textual semantic similarities to a query via document representation and extract the top-n experts from the most similar papers. Beyond implicit textual semantics, however, papers’ explicit relationships (e.g., co-authorship) in a heterogeneous graph (e.g., DBLP) are critical for expert finding, because they help improve the representation quality. Despite their importance, the explicit relationships of papers generally have been ignored in the literature. In this article, we study expert finding on heterogeneous graphs by considering both the explicit relationships and implicit textual semantics of papers in one model. Specifically, we define the cohesive (k, 𝒫)-core community of papers w.r.t. a meta-path 𝒫 (i.e., relationship) and propose a (k, 𝒫)-core based document embedding model to enhance the representation quality. Based on this, we design a proximity graph-based index (PG-Index) of papers and present a threshold algorithm (TA)-based method to efficiently extract top-n experts from papers returned by PG-Index. We further optimize our approach in two ways: (1) we boost effectiveness by considering the (k, 𝒫)-core community of experts and the diversity of experts’ research interests, to achieve high-quality expert representation from paper representation; and (2) we streamline expert finding, going from “extract top-n experts from top-m (m > n) semantically similar papers” to “directly return top-n experts”. The process of returning a large number of top-m papers as intermediate data is avoided, thereby improving the efficiency. Extensive experiments using real-world datasets demonstrate our approach’s superiority.
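The abstract names the (k, 𝒫)-core without defining it, so the following is only an illustrative sketch under stated assumptions, not the article's algorithm. It assumes one common reading from the heterogeneous-graph literature: two papers are 𝒫-neighbors when an instance of the meta-path connects them (here Paper-Author-Paper, i.e., a shared author), and the (k, 𝒫)-core keeps every paper that has at least k 𝒫-neighbors inside the kept set. The function names and the toy data are hypothetical.

```python
from collections import defaultdict


def meta_path_neighbors(paper_authors):
    """Link two papers when they share an author (assumed meta-path: Paper-Author-Paper)."""
    author_papers = defaultdict(set)
    for paper, authors in paper_authors.items():
        for author in authors:
            author_papers[author].add(paper)
    neighbors = {paper: set() for paper in paper_authors}
    for papers in author_papers.values():
        for paper in papers:
            neighbors[paper] |= papers - {paper}
    return neighbors


def k_p_core(paper_authors, k):
    """Return all papers that keep at least k meta-path neighbors after peeling.

    Assumption: this returns the union of all cores and does not split the
    result into connected communities.
    """
    neighbors = meta_path_neighbors(paper_authors)
    alive = set(neighbors)
    queue = [p for p in alive if len(neighbors[p]) < k]
    while queue:
        paper = queue.pop()
        if paper not in alive:
            continue  # already peeled via another queue entry
        alive.discard(paper)
        for other in neighbors[paper]:
            if other in alive:
                neighbors[other].discard(paper)
                if len(neighbors[other]) < k:
                    queue.append(other)
    return alive


# Hypothetical toy corpus: paper id -> set of author ids.
papers = {
    "p1": {"a", "b"},
    "p2": {"a", "c"},
    "p3": {"b", "c", "d"},
    "p4": {"d"},
    "p5": {"e"},
}

print(sorted(k_p_core(papers, 2)))  # ['p1', 'p2', 'p3']
```

With k = 2, p5 (no co-authored neighbors) and p4 (a single neighbor, p3) are peeled away, leaving the tightly co-authored triangle p1, p2, p3. A document embedding model could then treat papers in the same core as related, which is the role the abstract assigns to the community.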
Funder
National NSF of China
Primary R&D Plan of Zhejiang
Center-initiated Research Project of Zhejiang Lab
Fundamental Research Funds for the Provincial Universities of Zhejiang
Project for the Doctor of Entrepreneurship and Innovation in Jiangsu Province
Fundamental Research Funds for the Central Universities, and ZhiShan Young Scholar Program of Southeast University
Key Laboratory of Brain Machine Collaborative Intelligence of Zhejiang Province
Publisher
Association for Computing Machinery (ACM)
Cited by
6 articles.