Targeted Training for Multi-organization Recommendation

Published: 2023-07-14
Volume: 1 Issue: 3 Page: 1-18
ISSN: 2770-6699
Container-title: ACM Transactions on Recommender Systems
Short-container-title: ACM Trans. Recomm. Syst.
Language: en

Author:

Tomlinson Kiran¹, Wan Mengting², Lu Cao², Hecht Brent², Teevan Jaime², Yang Longqi²

Affiliation:

1. Cornell University, USA

2. Microsoft, USA

Abstract

Making recommendations for users in diverse organizations (orgs) is a challenging task for workplace social platforms such as Microsoft Teams and Slack. The current industry-standard model training approaches either use data from all organizations to maximize information or train organization-specific models to minimize noise. Our real-world experiments show that both approaches are poorly suited for the multi-org recommendation setting where different organizations’ interaction patterns vary in their generalizability. We introduce targeted training, which improves on standard practices by automatically selecting a subset of orgs for model development whose data are cleanest and best represent global trends. We demonstrate how and when targeted training improves over global training through theoretical analysis and simulation. Our experiments on large-scale datasets from Microsoft Teams, SharePoint, Stack Exchange, DBLP, and Reddit show that in many cases targeted training can improve mean average precision (MAP) across orgs by 10–15% over global training, is more robust to orgs with lower data quality, and generalizes better to unseen orgs. Our training framework is applicable to a wide range of inductive recommendation models, from simple regression models to graph neural networks (GNNs).
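The abstract does not spell out how the org subset is chosen, so the following is only a minimal sketch under stated assumptions, not the authors' procedure. It assumes one plausible reading: train a model on each single org, score it by MAP averaged across validation queries from every org, and keep the org whose model generalizes best. The names `select_target_org`, `train_fn`, and `rank_fn` are hypothetical and introduced for illustration only.

```python
from typing import Any, Callable, Dict, List, Sequence, Set, Tuple

# One validation query: (candidate items, set of relevant items).
Query = Tuple[List[str], Set[str]]


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Average precision of one ranked list.

    Sums precision@k over every rank k that holds a relevant item,
    then divides by the number of relevant items.
    """
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for k, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / k
    return total / len(relevant)


def mean_average_precision(rankings: Sequence[Tuple[Sequence[str], Set[str]]]) -> float:
    """MAP over (ranked list, relevant set) pairs; 0.0 for an empty input."""
    if not rankings:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in rankings) / len(rankings)


def select_target_org(
    train_fn: Callable[[Any], Any],               # hypothetical: org data -> model
    rank_fn: Callable[[Any, List[str]], List[str]],  # hypothetical: model, candidates -> ranking
    train_data: Dict[str, Any],
    val_data: Dict[str, List[Query]],
) -> Tuple[str, float]:
    """Assumed selection rule: pick the org whose single-org model attains
    the highest MAP averaged across all orgs' validation queries."""
    best_org, best_score = "", -1.0
    for org, data in train_data.items():
        model = train_fn(data)
        per_org_map = [
            mean_average_precision(
                [(rank_fn(model, cands), rel) for cands, rel in queries]
            )
            for queries in val_data.values()
        ]
        score = sum(per_org_map) / len(per_org_map)
        if score > best_score:
            best_org, best_score = org, score
    return best_org, best_score
```

Averaging MAP per org before averaging across orgs gives every org equal weight regardless of size, which matches the abstract's phrasing "MAP across orgs"; a query-weighted average would be an equally reasonable alternative assumption.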

Publisher

Association for Computing Machinery (ACM)

Link

https://dl.acm.org/doi/pdf/10.1145/3603508
