Authors:
Azmi Salman Dziyaul, Kusumaningrum Retno
Abstract
Background: The rapid growth of technological development in Indonesia has resulted in a growing amount of information. A new information retrieval environment is therefore needed to find documents that match the user's information needs.
Objective: This study examines the differences between a Relevance Feedback (RF) system based on a genetic algorithm and a standard information retrieval system without relevance feedback, for Indonesian-language documents.
Methods: The standard Information Retrieval (IR) system uses the Sastrawi stemmer and the Vector Space Model, while the Genetic Algorithm-based (GA-based) relevance feedback uses roulette-wheel selection and crossover recombination. The evaluation metrics are Mean Average Precision (MAP) and average recall, both based on user judgments.
Results: On two Indonesian-language document datasets, a thesis abstract dataset and a news dataset, MAP increased by 15.2% and 28.6%, respectively, compared with the standard IR system. Recall at the 10th position improved by 7.1% and 10.5%, respectively. For the thesis abstract dataset, the best genetic algorithm parameters were a population size of 20, a crossover probability of 0.7, and a mutation probability of 0.2. For the news dataset, the best parameters were a population size of 10, a crossover probability of 0.5, and a mutation probability of 0.2.
Conclusion: GA-based relevance feedback increases both MAP and average recall at the 10th position of the retrieved documents. In general, the best mutation probability is 0.2, whereas the best population size and crossover probability depend on the size of the dataset and the length of the query.
Keywords: Genetic Algorithm, Information Retrieval, Indonesian language document, Mean Average Precision, Relevance Feedback
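The metrics and genetic operators named in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not state the chromosome encoding, the fitness function, or the crossover variant, so the sketch assumes a chromosome is a list of query term weights in [0, 1], that fitness values are supplied by the caller, and that crossover is one-point. All function names are hypothetical. Average precision is normalised by the number of relevant documents, which is one common convention.

```python
import random


def average_precision(ranked_ids, relevant_ids):
    """Average precision of one ranked list against the judged-relevant ids."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits, total = 0, 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank  # precision at this relevant document
    return total / len(relevant)


def mean_average_precision(runs):
    """MAP over (ranked_ids, relevant_ids) pairs, one pair per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


def recall_at_k(ranked_ids, relevant_ids, k=10):
    """Fraction of relevant documents found in the top k results."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    return len(relevant & set(ranked_ids[:k])) / len(relevant)


def roulette_wheel_select(population, fitnesses, rng):
    """Pick one individual with probability proportional to its fitness."""
    total = sum(fitnesses)
    if total <= 0:
        return rng.choice(population)  # no usable fitness: pick uniformly
    pick = rng.random() * total
    running = 0.0
    for individual, fitness in zip(population, fitnesses):
        running += fitness
        if pick < running:
            return individual
    return population[-1]  # guard against floating-point rounding


def one_point_crossover(parent_a, parent_b, rng):
    """Swap the tails of two equal-length parents at a random cut point (assumed variant)."""
    if len(parent_a) < 2:
        return parent_a[:], parent_b[:]
    point = rng.randint(1, len(parent_a) - 1)
    return (parent_a[:point] + parent_b[point:],
            parent_b[:point] + parent_a[point:])


def mutate(weights, p_mutation, rng):
    """Replace each weight with a fresh random value with probability p_mutation."""
    return [rng.random() if rng.random() < p_mutation else w for w in weights]
```

With the thesis abstract settings reported in the Results, a generation would draw parents with `roulette_wheel_select` from a population of 20, apply `one_point_crossover` with probability 0.7, and call `mutate` with `p_mutation=0.2`. Candidate queries could then be compared by `mean_average_precision` and `recall_at_k` with `k=10`.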
Cited by
2 articles.