1. De Poorter, E., Troubleyn, E., Moerman, I., & Demeester, P. (2011). IDRA: A flexible system architecture for next generation wireless sensor networks. Wireless Networks, 17(6), 1423–1440.
2. Rovcanin, M., De Poorter, E., Moerman, I., & Demeester, P. A reinforcement learning based solution for cognitive network cooperation between co-located, heterogeneous wireless sensor networks. ADHoc Journal.
3. Watkins, C. J. C. H., & Dayan, P. (1992). Technical note: Q-learning. Machine Learning, 8(3–4), 279–292.
4. Sutton, R. S. (1988). Learning to predict by the methods of temporal differences. Machine Learning, 3(1), 9–44.
5. Bertsekas, D. P. (2010). Approximate policy iteration: A survey and some new methods. Journal of Control Theory and Applications, 9, 310–335. Report LIDS-2833.