Affiliation:
1. Lehigh University, Redmond, WA, USA
2. Microsoft Bing, Bothell, WA, USA
3. Lehigh University, Bethlehem, PA, USA
Abstract
IP geolocation databases map IP addresses to their physical locations. They are used to determine the location of online users when their precise location is unavailable. These databases are vital for a number of online services, including search engine personalization, content delivery, local ads, and fraud detection. However, IP geolocation databases are often inaccurate. In this work we present two novel approaches to improving IP geolocation by mining search engine click logs. First, we show that we can derive which URLs have local affinity by clustering clicks from IPs with known locations. We demonstrate that we can further propagate these URL locations to IP addresses with unknown locations. Our approach significantly outperforms two state-of-the-art commercial IP geolocation databases by 25 and 36 percentage points at a distance error of 10 kilometers, respectively. Second, we present an alternative method of assigning locations to URLs when IP location training data is not available, by instead extracting locations from the body of web documents. This second approach also outperforms the baselines by 7 and 17 percentage points, respectively, and has higher coverage than the first method. Finally, we also demonstrate that our two approaches outperform the academic state of the art based on mining query logs.
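The first approach in the abstract (derive URL locations from clicks by IPs with known locations, then propagate those URL locations to IPs with unknown locations) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the medoid-based clustering, and the thresholds `radius_km`, `min_share`, and `min_clicks` are all assumptions chosen for illustration.

```python
import math
from collections import defaultdict

def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lon) pairs."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def medoid(points):
    """Return the point with the smallest total distance to all others."""
    return min(points, key=lambda p: sum(haversine_km(p, q) for q in points))

def locate_urls(clicks, ip_location, radius_km=50.0, min_share=0.8, min_clicks=3):
    """Assign a location to each URL whose clicks concentrate in one area.

    clicks: iterable of (ip, url) pairs; ip_location: {ip: (lat, lon)}.
    All thresholds are hypothetical, not values from the paper.
    """
    by_url = defaultdict(list)
    for ip, url in clicks:
        if ip in ip_location:
            by_url[url].append(ip_location[ip])
    url_location = {}
    for url, points in by_url.items():
        if len(points) < min_clicks:
            continue  # too few clicks to judge local affinity
        center = medoid(points)
        near = sum(haversine_km(center, p) <= radius_km for p in points)
        if near / len(points) >= min_share:
            url_location[url] = center
    return url_location

def locate_ips(clicks, ip_location, url_location):
    """Propagate URL locations to IPs that have no known location."""
    by_ip = defaultdict(list)
    for ip, url in clicks:
        if ip not in ip_location and url in url_location:
            by_ip[ip].append(url_location[url])
    return {ip: medoid(points) for ip, points in by_ip.items()}
```

In this sketch a URL counts as local only when most of its located clicks fall within a fixed radius of their medoid, so a site visited from many regions never receives a location and cannot mislead the propagation step.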
Publisher
Association for Computing Machinery (ACM)
Subject
Discrete Mathematics and Combinatorics, Geometry and Topology, Computer Science Applications, Modeling and Simulation, Information Systems, Signal Processing
Cited by
2 articles.