Affiliation:
1. University of Turku, Finland
Abstract
This chapter considers a parallel implementation of the online multi-label regularized least-squares machine-learning algorithm for embedded hardware platforms. The authors focus on three properties required in real-time adaptive systems: the model learns in an online fashion, that is, it improves with new data without needing to store that data; the method can fully utilize the computational capabilities of modern embedded multi-core architectures; and the system efficiently learns to predict several labels simultaneously. On a hand-written digit recognition task, the authors demonstrate that the online algorithm converges to an accurate solution faster, with respect to the amount of training data processed, than a baseline based on stochastic gradient descent. They further show that their parallelization of the method scales well on a quad-core platform. Moreover, since Network-on-Chip (NoC) has been proposed as a promising candidate for future multi-core architectures, they implement a NoC system consisting of 16 cores and evaluate the proposed machine-learning algorithm on it. Experimental results show that optimizing the cache behaviour of the program significantly improves cache/memory efficiency. The results of the chapter provide guidelines for designing future embedded multi-core machine-learning devices.
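For readers unfamiliar with the learning rule the abstract refers to, the following is a minimal sketch of an online (recursive) regularized least-squares update extended to several labels, written in Python/NumPy. The class name, parameters, and the random-data usage at the end are illustrative assumptions, not code from the chapter; the chapter's parallel and NoC implementations are not reproduced here.

```python
import numpy as np

class OnlineMultiLabelRLS:
    """Online (recursive) regularized least-squares with multiple labels.

    Maintains a d x m weight matrix W and P, the inverse of the regularized
    covariance matrix (X^T X + lam * I)^{-1}, updated with a rank-one
    Sherman-Morrison step per example, so no training data needs to be stored.
    """

    def __init__(self, n_features, n_labels, lam=1.0):
        self.W = np.zeros((n_features, n_labels))  # weight matrix
        self.P = np.eye(n_features) / lam          # inverse regularized covariance

    def predict(self, x):
        # x: feature vector of shape (d,) -> predictions of shape (m,)
        return self.W.T @ x

    def partial_fit(self, x, y):
        # x: (d,) feature vector, y: (m,) label vector for one example
        Px = self.P @ x
        k = Px / (1.0 + x @ Px)      # gain vector, shape (d,)
        err = y - self.W.T @ x       # per-label prediction error, shape (m,)
        self.W += np.outer(k, err)   # rank-one weight update for all labels
        self.P -= np.outer(k, Px)    # Sherman-Morrison update of the inverse

# Illustrative usage on random data (placeholders, not the chapter's dataset)
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    model = OnlineMultiLabelRLS(n_features=64, n_labels=10, lam=1.0)
    for _ in range(1000):
        x = rng.standard_normal(64)
        y = rng.standard_normal(10)
        model.partial_fit(x, y)
    print(model.predict(rng.standard_normal(64)).shape)  # (10,)
```

Each update costs O(d^2) regardless of how many examples have been seen, and all labels share the same gain vector, which reflects the online and multi-label properties highlighted in the abstract.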