1. Language identification from text using n-gram based cumulative frequency addition;Ahmed,2004
2. Carter Simon, Tsagkias Manos, Weerkamp Wouter. Semi-supervised priors for microblog language identification. In: Proceedings of the Dutch–Belgian information retrieval workshop (DIR-2011). Amsterdam; February 2011. http://wouter.weerkamp.com/downloads/poster-dir2011-lid.pdf.
3. N-gram-based text categorization;Cavnar,1994
4. Gauging similarity with n-grams: language independent categorization of text;Damashek;Science,1995
5. Statistical identification of language;Dunning,1994