Learning Semantic Representations from Directed Social Links to Tag Microblog Users at Scale
Published:2020-03-18
Issue:2
Volume:38
Page:1-30
ISSN:1046-8188
Container-title:ACM Transactions on Information Systems
language:en
Short-container-title:ACM Trans. Inf. Syst.
Author:
Zhao Wayne Xin (1),
Hou Yupeng (1),
Chen Junhua (1),
Zhu Jonathan J. H. (2),
Yin Eddy Jing (3),
Su Hanting (4),
Wen Ji-Rong (4)
Affiliation:
1. Renmin University of China, Beijing, China
2. City University of Hong Kong, Hong Kong, China
3. Microsoft, China
4. Renmin University of China, China
Abstract
This article presents a network embedding approach to automatically generate tags for microblog users. Instead of using text data, we aim to annotate microblog users with meaningful tags by leveraging rich social link data. To utilize directed social links, we use two kinds of node representations for modeling user interest in terms of their followers and followees, respectively. To alleviate the sparsity problem, we propose a novel method based on two transformation functions for capturing implicit interest similarity. Unlike previous work on capturing high-order proximity, our model is able to directly characterize the effect of the context user on the proximity of node pairs. Another novelty of our model is that the importance scores of users learned from the classic PageRank algorithm are utilized to set the link weights. By using such weights, our model can better disentangle the interest similarity evidence of a link. We jointly consider the above factors when designing the final objective function.
We construct a very large evaluation set consisting of 2.6M users, 0.5M tags, and 0.8B following links. To our knowledge, it is the largest reported dataset for microblog user tagging in the literature. Extensive experiments on this dataset demonstrate the effectiveness of the proposed approach. We implement this approach with several optimization techniques, which makes our model easy to scale to very large social networks. Ubiquitous social links provide important data resources to understand user interests. Our work provides an effective and efficient solution to annotate user interests solely using the link data, which has important practical value in industry. To illustrate the use of our models, we implement a demonstration system for visualizing, navigating, and searching microblog users.
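The abstract mentions that user importance scores from the classic PageRank algorithm are used to set link weights. The following is a minimal sketch of that general idea on a toy follow graph, not the authors' implementation. The function names `pagerank` and `link_weights`, the toy edge list, and in particular the choice to weight each link by the followee's PageRank score are illustrative assumptions; the abstract does not state how the weights are actually derived from the scores.

```python
from collections import defaultdict


def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over directed (follower, followee) pairs."""
    nodes = sorted({u for edge in edges for u in edge})
    out_links = defaultdict(list)
    for follower, followee in edges:
        out_links[follower].append(followee)

    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        # Rank held by users who follow nobody is spread uniformly.
        dangling = sum(rank[u] for u in nodes if not out_links[u])
        new_rank = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}
        for u in nodes:
            for v in out_links[u]:
                # Each user splits their rank evenly among their followees.
                new_rank[v] += damping * rank[u] / len(out_links[u])
        rank = new_rank
    return rank


def link_weights(edges, rank):
    """ASSUMPTION (hypothetical): weight a link by the followee's PageRank.

    The paper's actual weighting formula is not given in the abstract.
    """
    return {(follower, followee): rank[followee] for follower, followee in edges}


# Toy graph: a and b both follow c, and c follows a.
edges = [("a", "c"), ("b", "c"), ("c", "a")]
rank = pagerank(edges)
weights = link_weights(edges, rank)
```

In this toy graph, user c is followed by two users and ends up with the highest score, so under the assumed rule the links pointing at c carry the largest weights. An embedding objective could then scale each link's contribution by its weight.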
Funder
Research Funds of Renmin University of China
National Natural Science Foundation of China
Beijing Outstanding Young Scientist Program
Fundamental Research Funds for the Central Universities
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Science Applications; General Business, Management and Accounting; Information Systems
Cited by
1 article.