1. Halevy A, Norvig P, Pereira F. The Unreasonable Effectiveness of Data. IEEE Intelligent Systems; 2009.
2. Brants T, Franz A. Web 1T 5-gram Version 1. Available at http://www.ldc.upenn.edu/Catalog/catalogEntry.jsp?catalogId=LDC2006T13, Linguistic Data Consortium, Philadelphia, PA, USA; 2006.
3. Brants T, Franz A. Web 1T 5-gram, 10 European Languages, Version 1. Available at http://www.ldc.upenn.edu/Catalog/catalogEntry.jsp?catalogId=LDC2009T25, Linguistic Data Consortium, Philadelphia, PA, USA; 2009.
4. Liu F, Yang M, Lin D. Chinese Web 5-gram Version 1. Available at http://www.ldc.upenn.edu/Catalog/catalogEntry.jsp?catalogId=LDC2010T06, Linguistic Data Consortium, Philadelphia, PA, USA; 2010.
5. Kudo T, Kazawa H. Japanese Web N-gram Version 1. Available at http://www.ldc.upenn.edu/Catalog/catalogEntry.jsp?catalogId=LDC2009T08, Linguistic Data Consortium, Philadelphia, PA, USA; 2009.