Affiliation:
1. IBM Research -- Almaden
Abstract
We present a new algorithm for inferring the home location of Twitter users at different granularities, including city, state, time zone, or geographic region, using the content of users’ tweets and their tweeting behavior. Unlike existing approaches, our algorithm uses an ensemble of statistical and heuristic classifiers to predict locations and makes use of a geographic gazetteer dictionary to identify place-name entities. We find that a hierarchical classification approach, where time zone, state, or geographic region is predicted first and city is predicted next, can improve prediction accuracy. We have also analyzed movement variations of Twitter users, built a classifier to predict whether a user was travelling in a certain period of time, and use that to further improve the location detection accuracy. Experimental evidence suggests that our algorithm works well in practice and outperforms the best existing algorithms for predicting the home location of Twitter users.
Publisher
Association for Computing Machinery (ACM)
Subject
Artificial Intelligence,Theoretical Computer Science
Cited by
112 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献