Efficient and Effective Academic Expert Finding on Heterogeneous Graphs through (k, 𝒫)-Core based Embedding

Author:

Wang Yuxiang (1), Liu Jun (1), Xu Xiaoliang (1), Ke Xiangyu (2), Wu Tianxing (3), Gou Xiaoxuan (1)

Affiliation:

1. Hangzhou Dianzi University, Zhejiang Province, China

2. Zhejiang University, Hangzhou, Zhejiang Province, China

3. Southeast University, Nanjing, Jiangsu Province, China

Abstract

Expert finding is crucial for a wealth of applications in both academia and industry. Given a user query and a trove of academic papers, expert finding aims to retrieve the experts most relevant to the query from those papers. Existing studies focus on embedding-based solutions that measure the textual semantic similarity between academic papers and a query via document representation, then extract the top-n experts from the most similar papers. Beyond implicit textual semantics, however, papers' explicit relationships (e.g., co-authorship) in a heterogeneous graph (e.g., DBLP) are critical for expert finding, because they help improve the representation quality. Despite their importance, the explicit relationships of papers have generally been ignored in the literature. In this article, we study expert finding on heterogeneous graphs by considering both the explicit relationships and the implicit textual semantics of papers in one model. Specifically, we define the cohesive (k, 𝒫)-core community of papers w.r.t. a meta-path 𝒫 (i.e., a relationship) and propose a (k, 𝒫)-core based document embedding model to enhance the representation quality. Based on this, we design a proximity graph-based index (PG-Index) of papers and present a threshold algorithm (TA)-based method to efficiently extract the top-n experts from the papers returned by PG-Index. We further optimize our approach in two ways: (1) we boost effectiveness by considering the (k, 𝒫)-core community of experts and the diversity of experts' research interests, to achieve high-quality expert representation from paper representation; and (2) we streamline expert finding, going from "extract top-n experts from top-m (m > n) semantically similar papers" to "directly return top-n experts". This avoids returning a large number of top-m papers as intermediate data, thereby improving efficiency. Extensive experiments on real-world datasets demonstrate our approach's superiority.
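The abstract does not spell out the (k, 𝒫)-core definition, so the following is only a minimal sketch under an assumed, commonly used reading: two papers are 𝒫-neighbors if the meta-path Paper-Author-Paper connects them (i.e., they share an author), and a paper stays in the core only while it has at least k surviving 𝒫-neighbors. The function names, the input layout, and the toy data are hypothetical and are not taken from the article. The sketch returns the union of all such cores and does not split it into connected communities.

```python
from collections import defaultdict


def meta_path_neighbors(paper_authors):
    """Map each paper to the papers it reaches via Paper-Author-Paper.

    Assumption: the meta-path is co-authorship; the article may use others.
    """
    author_papers = defaultdict(set)
    for paper, authors in paper_authors.items():
        for author in authors:
            author_papers[author].add(paper)

    neighbors = {paper: set() for paper in paper_authors}
    for papers in author_papers.values():
        for paper in papers:
            neighbors[paper] |= papers - {paper}
    return neighbors


def k_p_core(paper_authors, k):
    """Return the papers that keep at least k meta-path neighbors.

    Standard k-core peeling: repeatedly drop any paper whose number of
    surviving neighbors falls below k.
    """
    neighbors = meta_path_neighbors(paper_authors)
    alive = set(neighbors)
    queue = [p for p in alive if len(neighbors[p]) < k]
    while queue:
        paper = queue.pop()
        if paper not in alive:
            continue  # already peeled via an earlier queue entry
        alive.discard(paper)
        for other in neighbors[paper]:
            if other in alive:
                neighbors[other].discard(paper)
                if len(neighbors[other]) < k:
                    queue.append(other)
    return alive


# Hypothetical toy data: paper id -> set of author names.
papers = {
    "p1": {"alice", "bob"},
    "p2": {"alice", "bob"},
    "p3": {"bob", "carol"},
    "p4": {"carol"},
    "p5": {"dave"},
}
print(sorted(k_p_core(papers, 2)))  # ['p1', 'p2', 'p3']
```

In this toy graph, p4 and p5 are peeled first because each has fewer than two co-authorship neighbors, leaving p1, p2, and p3, which all share the author "bob". With k = 3 the peeling cascades and nothing survives.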

Funder

National NSF of China

Primary R&D Plan of Zhejiang

Center-initiated Research Project of Zhejiang Lab

Fundamental Research Funds for the Provincial Universities of Zhejiang

Project for the Doctor of Entrepreneurship and Innovation in Jiangsu Province

Fundamental Research Funds for the Central Universities, and ZhiShan Young Scholar Program of Southeast University

Key Laboratory of Brain Machine Collaborative Intelligence of Zhejiang Province

Publisher

Association for Computing Machinery (ACM)

Subject

General Computer Science


Cited by 6 articles.

1. Scalable Community Search over Large-scale Graphs based on Graph Transformer;Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval;2024-07-10

2. Routing-Guided Learned Product Quantization for Graph-Based Approximate Nearest Neighbor Search;2024 IEEE 40th International Conference on Data Engineering (ICDE);2024-05-13

3. Scalable Community Search with Accuracy Guarantee on Attributed Graphs;2024 IEEE 40th International Conference on Data Engineering (ICDE);2024-05-13

4. Efficient and effective (k,P)-core-based community search over attributed heterogeneous information networks;Information Sciences;2024-03

5. Random Walk-Based Community Key-Members Search Over Large Graphs;2023
