Abstract
This paper presents an incremental k-most similar neighbor classifier for mixed data and for similarity functions that are not necessarily distance metrics. The algorithm is suitable for processing large data sets because it traverses the training set only once and keeps in main memory only the k most similar neighbors found up to step t. Several experiments on synthetic and real data are reported.
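The core idea described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function and parameter names are assumptions, and the similarity function is left abstract so it can handle mixed data and need not be a distance. A min-heap keeps only the k most similar neighbors seen so far, so memory use stays O(k) over a single pass of the training data.

```python
import heapq
from collections import Counter

def incremental_kmsn_classify(query, training_stream, similarity, k):
    """Single-pass k-most-similar-neighbor classification (illustrative sketch).

    training_stream yields (object, label) pairs; similarity(query, obj)
    returns a similarity score (higher means more similar) and need not
    satisfy the metric axioms.
    """
    heap = []  # min-heap of (similarity, tiebreaker, label); root is the worst kept neighbor
    for i, (obj, label) in enumerate(training_stream):
        s = similarity(query, obj)
        if len(heap) < k:
            heapq.heappush(heap, (s, i, label))
        elif s > heap[0][0]:
            # new object is more similar than the worst retained neighbor
            heapq.heapreplace(heap, (s, i, label))
    # classify by majority vote among the k retained neighbors
    votes = Counter(label for _, _, label in heap)
    return votes.most_common(1)[0][0]
```

Because only the heap of k entries survives each step, the training set can be read from disk or a stream without ever being fully loaded, which is what makes the approach suitable for large data sets.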