1. Almuallim, H., and Dietterich, T. G. 1991. Learning with many irrelevant features. In Ninth National Conference on Artificial Intelligence, 547–552. MIT Press.
2. Ben-Bassat. 1982. Use of distance measures, information measures and error bounds in feature evaluation.
3. Blum. 1992. Training a 3-node neural network is NP-complete. Neural Networks.
4. Blumer. 1987. Occam's razor. Information Processing Letters.
5. Boyce. 1974. Optimal Subset Selection.