Affiliation:
1. University of Szeged, Hungary
2. Institute for Computer Science and Control, Hungarian Academy of Sciences (MTA SZTAKI)
3. University of Szeged, and MTA-SZTE Research Group on AI, Hungary
Abstract
Low-rank matrix approximation is an important tool in data mining with a wide range of applications, including recommender systems, clustering, and identifying topics in documents. When the matrix to be approximated originates from a large distributed system, such as a network of mobile phones or smart meters, a challenging problem arises due to the strongly conflicting yet essential requirements of efficiency, robustness, and privacy preservation. We argue that although collecting sensitive data in a centralized fashion may be efficient, it is not an option when considering privacy and efficiency at the same time. Thus, we do not allow any sensitive data to leave the nodes of the network. The local information at each node (personal attributes, documents, media ratings, etc.) defines one row in the matrix. This means that all computations have to be performed at the edge of the network. Known parallel methods that respect the locality constraint, such as synchronized parallel gradient search or distributed iterative methods, require synchronized rounds or have inherent issues with load balancing, and thus they are not robust to failure. Our distributed stochastic gradient descent algorithm overcomes these limitations. During the execution, any sensitive information remains local, whereas the global features (e.g., the factor model of movies) converge to the correct value at all nodes. We present a theoretical derivation and a thorough experimental evaluation of our algorithm. We demonstrate that the convergence speed of our method is competitive while not relying on synchronization and being robust to extreme and realistic failure scenarios. To demonstrate the feasibility of our approach, we present trace-based simulations, real smartphone user behavior analysis, and tests over real movie recommender system data.
Funder
“Momentum—Big Data” grant of the Hungarian Academy of Sciences
OTKA NK
Publisher
Association for Computing Machinery (ACM)
Subject
Artificial Intelligence,Theoretical Computer Science
Cited by
16 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献