1. [n.d.]. Common Crawl. http://commoncrawl.org/. (Accessed on 10/22/2021).
2. [n.d.]. Download | OpenWebTextCorpus. https://skylion007.github.io/OpenWebTextCorpus/. (Accessed on 10/22/2021).
3. [n.d.]. News Dataset Available - Common Crawl. https://commoncrawl.org/2016/10/news-dataset-available/. (Accessed on 10/22/2021).
4. Observatory of trends in software related microblogs
5. Peter F Brown, Vincent J Della Pietra, Peter V Desouza, Jennifer C Lai, and Robert L Mercer. 1992. Class-based n-gram models of natural language. Computational linguistics 18, 4 (1992), 467--480.