Affiliation:
1. University of California
2. Saudi Data And Artificial Intelligence Authority
Abstract
Social media analysis over blogs (such as tweets) often requires determining the top-k mentions of a certain category (e.g., movies) in a collection (e.g., tweets collected over a given day). Such queries require executing an entity linking (EL) function, which is often expensive. We propose TQEL, a framework that minimizes the joint cost of EL calls and top-k query processing. The paper presents two variants, TQEL-exact and TQEL-approximate, which retrieve the exact and approximate top-k results, respectively. TQEL-approximate uses a weaker stopping condition and achieves significantly better performance, at a fraction of the cost of TQEL-exact, while providing strong probabilistic guarantees: with a 95% confidence threshold, it makes over two orders of magnitude fewer EL calls than TQEL-exact. TQEL-exact is itself orders of magnitude better than a naive approach that calls EL functions on the entire dataset.
Subject
General Earth and Planetary Sciences; Water Science and Technology; Geography, Planning and Development
Cited by
3 articles.