Efficient Algebraic Multigrid Methods for Multilevel Overlapping Coclustering of User-Item Relationships

Author:

Xu Haifeng (1), Kashef Rasha F. (2), De Sterck Hans (3), Sanders Geoffrey (4)

Affiliation:

1. Department of Computer Science, University of Virginia, Charlottesville, Virginia 22903

2. Electrical, Computer, and Biomedical Engineering Department, Ryerson University, Toronto, Ontario M5B 2K3, Canada

3. Department of Applied Mathematics, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada

4. Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, Livermore, California 94550

Abstract

Many digital data sets that encode user-item relationships contain a multilevel overlapping cluster structure. The user-item relation can be encoded in a weighted bipartite graph, and uncovering the overlapping coclusters of users and items at multiple levels in this graph can play an important role in analyzing user-item data in many applications. For example, in online marketing, such as placing online ads or deploying smart online marketing strategies, identifying co-occurring clusters of users and items can lead to accurately targeted advertisements and better marketing outcomes. In this paper, we propose fast algorithms inspired by algebraic multigrid methods for finding multilevel overlapping cocluster structures of feature matrices that encode user-item relations. Starting from the weighted bipartite graph structure of the feature matrix, the algorithms use agglomeration procedures to recursively coarsen the bipartite graphs that represent the relations between the coclusters on increasingly coarse levels. New fast coarsening routines are described that circumvent the bottleneck of all-to-all similarity computations by exploiting measures of direct connection strength between row and column variables in the feature matrix. Providing accurate coclusters at multiple levels in a manner that scales to large data sets is a challenging task. Our heuristic algorithms approximately and recursively minimize normalized cuts to obtain coclusters in the aggregated bipartite graphs on multiple levels of resolution.
Whereas the main novelty and focus of the paper lie in algorithmic aspects of reducing computational complexity to obtain scalable methods specifically for large rectangular user-item matrices, the algorithmic variants also define several new models for determining multilevel coclusters, which we justify intuitively by relating them to principles that underlie collaborative filtering methods for user-item relationships. Experimental results show that the proposed algorithms successfully uncover the multilevel overlapping cluster structure for artificial and real data sets.

Summary of Contribution: This paper develops new and efficient computational methods for finding the multilevel overlapping cocluster structure of feature matrices that encode user-item relationships. We base our approach on pairwise similarity measures between features, seeking clusters of points that are similar to each other and dissimilar from the points outside the cluster. We approximately solve the problem of finding optimal overlapping coclusters on multiple levels by employing a framework based on efficient multilevel methods that have been used previously to solve sparse linear systems and to cluster graphs. Our main contribution is to extend these methods efficiently to find coclusters in the bipartite graphs that encode common and important user-item relationships or social network relations. The novel methods that we propose are inherently scalable to large problem sizes and are naturally able to uncover overlapping coclusters at multiple levels, whereas existing methods generally find coclusters only at the fine level. We illustrate the algorithm and its performance on some standard test problems from the literature and on a proof-of-concept real-world data set that relates LinkedIn users to their skills and expertise.

Publisher

Institute for Operations Research and the Management Sciences (INFORMS)

Subject

General Engineering

Cited by 1 article.

1. Uncovering block structures in large rectangular matrices;Journal of Multivariate Analysis;2023-11
