Parallel Generalized Hebbian Algorithm for Large Scale Data Analytics
Published: 2021-03-22
Page: 14-21
Container-title: Mesopotamian Journal of Big Data
Short-container-title: MJBD
Author:
Yaseen Mohanad G. (1), Naeemullah Mohammad (2), Mansoor Ibarhim Adeb (3)
Affiliation:
1. Department of Computer, College of Education, AL-Iraqia University, Baghdad, Iraq.
2. Department of Computer Science, Maulana Azad College, Rauza Bagh, Aurangabad, Maharashtra, India.
3. Faculty of Computer Science and Information Technology, Universiti Tun Hussein Onn Malaysia, Malaysia.
Abstract
Storing and analysing large amounts of data on a parallel cluster with Big Data systems such as Hadoop and DBMSs requires a complex configuration and tuning procedure. This is mostly the result of the static partitioning that occurs whenever data sets are imported or transferred into the file system. Parallel processing is then carried out in a distributed fashion, with the objective of achieving balanced parallel execution among nodes. Such systems are notoriously difficult to configure, particularly with respect to node synchronisation, data redistribution, and distributed caching in main memory. The Generalized Hebbian Algorithm (GHA), also known in the literature as Sanger's rule, is a linear feedforward neural network model for unsupervised learning that is applied mainly to principal component analysis. In formulation and stability it resembles Oja's rule, with the additional feature that it can be applied to networks with more than one output. This paper presents a novel hardware architecture for principal component analysis. The GHA was chosen as the foundation for the design because it is both straightforward and efficient. The architecture consists of three parts: the memory unit, the weight vector updating unit, and the primary computing unit. Within the weight vector updating unit, the different synaptic weight vectors are computed by the same circuit in order to reduce area cost. The GHA architecture incorporates a versatile multi-computer framework based on MPI. GHA can therefore be executed efficiently on platforms that use either sequential or parallel processing.
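As an illustration of the update rule the abstract refers to, the following is a minimal sequential sketch of Sanger's rule in its standard textbook form, delta w_ij = lr * y_i * (x_j - sum over k <= i of y_k * w_kj), where y_i is the output of the i-th linear neuron. It is not the authors' hardware or MPI design. The function names, the learning rate, and the synthetic two-dimensional data are assumptions made for this example only.

```python
import math
import random


def gha_step(W, x, lr):
    """One Sanger's-rule update of the weight rows W for input x (in place)."""
    # Output of each linear neuron: y_i = w_i . x
    y = [sum(w * v for w, v in zip(row, x)) for row in W]
    residual = list(x)
    updated = []
    for y_i, row in zip(y, W):
        # After this line: residual_j = x_j - sum_{k <= i} y_k * w_kj
        residual = [r - y_i * w for r, w in zip(residual, row)]
        # delta w_ij = lr * y_i * residual_j (uses the old weights throughout)
        updated.append([w + lr * y_i * r for w, r in zip(row, residual)])
    W[:] = updated


def train_gha(samples, n_components, lr=0.005, seed=0):
    """Stream zero-mean samples once through the GHA update."""
    rng = random.Random(seed)
    dim = len(samples[0])
    # Small random initial weights (an assumption of this sketch).
    W = [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(n_components)]
    for x in samples:
        gha_step(W, x, lr)
    return W


def make_samples(n, seed=1):
    """Hypothetical 2-D test data: std 2 along (1, 1), std 1 along (1, -1)."""
    rng = random.Random(seed)
    s = 1.0 / math.sqrt(2.0)
    out = []
    for _ in range(n):
        a, b = rng.gauss(0.0, 2.0), rng.gauss(0.0, 1.0)
        out.append([s * (a + b), s * (a - b)])
    return out


if __name__ == "__main__":
    W = train_gha(make_samples(20000), n_components=2)
    for row in W:
        print([round(w, 3) for w in row])
```

With this data the first weight row should align, up to sign, with the direction (1, 1) and the second with (1, -1). Each row only subtracts the contributions of the rows before it, which is what lets the rule extract several ordered components rather than one.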
When the data set is analysed for a short period of time or when a dynamic number of virtual processors is selected at runtime, we expect our architecture to benefit from parallel processing in the cloud. This research presents a parallel implementation of a variety of machine learning algorithms built on top of the MapReduce paradigm, with the aim of improving processing speed and saving time.
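One way to picture how a GHA computation could be expressed in a map and reduce style is a synchronous, batch variant: each data partition computes its summed Sanger's-rule increments against the current weights (map), the partial sums are added together (reduce), and the averaged increment is applied once per round. The sketch below is an assumption-laden illustration, not the implementation described in the paper. The function names, partition sizes, learning rate, and round count are hypothetical, and the built-in `map` runs sequentially here where a process pool or MPI ranks could take its place.

```python
import math
import random
from functools import reduce


def partition_update(W, partition):
    """Map step: summed Sanger's-rule increments over one data partition."""
    total = [[0.0] * len(row) for row in W]
    for x in partition:
        y = [sum(w * v for w, v in zip(row, x)) for row in W]
        residual = list(x)
        for i, row in enumerate(W):
            # residual_j = x_j - sum_{k <= i} y_k * w_kj
            residual = [r - y[i] * w for r, w in zip(residual, row)]
            total[i] = [t + y[i] * r for t, r in zip(total[i], residual)]
    return total, len(partition)


def combine(left, right):
    """Reduce step: add two (summed increments, sample count) pairs."""
    (a, n_a), (b, n_b) = left, right
    summed = [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(a, b)]
    return summed, n_a + n_b


def parallel_gha(partitions, n_components, lr=0.1, rounds=200, seed=0):
    """Batch GHA: one averaged weight update per map/reduce round."""
    rng = random.Random(seed)
    dim = len(partitions[0][0])
    W = [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(n_components)]
    for _ in range(rounds):
        # Sequential stand-in for a parallel map over the partitions.
        mapped = map(lambda part: partition_update(W, part), partitions)
        total, count = reduce(combine, mapped)
        W = [[w + lr * t / count for w, t in zip(row, trow)]
             for row, trow in zip(W, total)]
    return W


def make_partitions(n_parts, per_part, seed=1):
    """Hypothetical 2-D data (std 2 along (1, 1), std 1 along (1, -1)), pre-split."""
    rng = random.Random(seed)
    s = 1.0 / math.sqrt(2.0)
    parts = []
    for _ in range(n_parts):
        part = []
        for _ in range(per_part):
            a, b = rng.gauss(0.0, 2.0), rng.gauss(0.0, 1.0)
            part.append([s * (a + b), s * (a - b)])
        parts.append(part)
    return parts


if __name__ == "__main__":
    W = parallel_gha(make_partitions(4, 500), n_components=2)
    for row in W:
        print([round(w, 3) for w in row])
```

Because the reduce step only adds partial sums and counts, the result does not depend on how the samples are split across partitions, apart from floating-point rounding. The cost of this design is one synchronisation point per round, which is the kind of node coordination the abstract identifies as difficult to tune.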
Publisher
Mesopotamian Academic Press
Subject
Library and Information Sciences, Health Informatics, Education, Medicine (miscellaneous), General Medicine, General Psychology, Biomedical Engineering, Bioengineering, Psychiatry and Mental health, Health Policy
Cited by
4 articles.