Affiliation:
1. Eötvös Loránd University, Budapest, Hungary
2. J. Selye University, Komárno, Slovakia
Abstract
Software applications and services that recommend words, or even word patterns, have become part of our lives. They appear in many forms in daily use: smartphone keyboards and Google search suggestions are just two examples. With the help of these tools, we can not only find a suitable word that fits our sentence, but also express ourselves in a more nuanced, diverse way. To provide this kind of recommendation service, we use an algorithm that recommends words in response to word pattern queries. A word pattern query can be expressed as a combination of words, part-of-speech (POS) tags, and wildcard words. Since there are a great many possible patterns and sentences, we use Big Data frameworks to handle this large amount of data. In this paper, we compare two popular frameworks, Hadoop and Spark, using the proposed algorithm, and recommend some enhancements to achieve faster word pattern generation.
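To make the idea of a word pattern query concrete, the following is a minimal single-machine sketch in plain Python. It is not the algorithm evaluated in the paper. The query syntax (`*` for a wildcard word, `<TAG>` for a POS tag), the function names `token_matches` and `recommend`, and the tiny pre-tagged corpus are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical query syntax (an assumption, not taken from the paper):
#   "*"      matches any single word (wildcard)
#   "<TAG>"  matches any word carrying that POS tag, e.g. "<JJ>"
#   other    matches that literal word, case-insensitively
def token_matches(query_token, word, tag):
    if query_token == "*":
        return True
    if query_token.startswith("<") and query_token.endswith(">"):
        return query_token[1:-1] == tag
    return query_token.lower() == word.lower()

def recommend(query, tagged_sentences):
    """Return the n-grams matching the query, most frequent first."""
    pattern = query.split()
    n = len(pattern)
    if n == 0:
        return []
    counts = Counter()
    for sentence in tagged_sentences:
        # Slide a window of the query's length over each sentence.
        for i in range(len(sentence) - n + 1):
            window = sentence[i:i + n]
            if all(token_matches(q, w, t) for q, (w, t) in zip(pattern, window)):
                counts[" ".join(w.lower() for w, _ in window)] += 1
    return counts.most_common()

# Illustrative corpus of (word, POS tag) pairs; the tags are assigned by hand.
corpus = [
    [("a", "DT"), ("strong", "JJ"), ("coffee", "NN")],
    [("a", "DT"), ("strong", "JJ"), ("argument", "NN")],
    [("a", "DT"), ("hot", "JJ"), ("coffee", "NN")],
    [("A", "DT"), ("strong", "JJ"), ("coffee", "NN")],
]

print(recommend("a <JJ> coffee", corpus))
print(recommend("a strong *", corpus))
```

On a large corpus, the per-sentence matching and counting would typically be distributed as a map step followed by a reduce step, which is the kind of workload Hadoop and Spark are designed for.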