Affiliation:
1. Information Engineering College, Yangzhou Polytechnic College, Yangzhou, Jiangsu, China
2. Jiangsu Safety and Environment Technology and Equipment for Planting and Breeding Industry Engineering, Yangzhou, Jiangsu, China
Abstract
To address the low computing efficiency of big data analysis and model construction, this paper examines programming models for big data analysis, including the DAG (Directed Acyclic Graph), and on that basis adopts Octopus, a distributed matrix computing system, for big data analysis. Octopus is a general-purpose matrix programming framework whose programming model is built on matrix operations, which makes large-scale data convenient to analyze and process. With Octopus, users can draw on functions and data from multiple platforms and work through a unified matrix operation interface. The distributed matrix representation and storage layer defines the data storage formats used on distributed file systems. In OctMatrix, each computing platform supplies its own matrix library, and a matrix library written in R is provided for users. SymbolMatrix offers a matrix interface consistent with that of OctMatrix. However, SymbolMatrix also retains the flow graph of the matrix operations and supports logical and physical optimization of that flow graph as a DAG. The DAG computational flow graph generated by SymbolMatrix is optimized in two stages: logical optimization and physical optimization. Matrices are stored in a distributed file system in a line-matrix format, and the corresponding platform matrix is obtained by reading the line-matrix files. In the performance evaluation, the distributed matrix computing system showed high computing efficiency, with average CPU (central processing unit) usage reaching 70%. The system can make full use of computing resources and achieve efficient parallel computing.
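The idea of a symbolic matrix that records its operations as a DAG and rewrites the graph before evaluation can be illustrated with a minimal Python sketch. It is not the Octopus API: the names `Node`, `leaf`, `optimize`, `evaluate`, and `count_ops` are hypothetical, the only logical optimization shown is removal of a double transpose, and physical optimization (choosing a platform-specific execution plan) is not modeled.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Node:
    """One vertex of the operation DAG (hypothetical, not an Octopus class)."""
    op: str                                    # "leaf", "matmul" or "transpose"
    inputs: Tuple["Node", ...] = ()
    value: Tuple[Tuple[float, ...], ...] = ()  # matrix data, leaves only

    def __matmul__(self, other: "Node") -> "Node":
        # Record the multiplication instead of computing it.
        return Node("matmul", (self, other))

    @property
    def T(self) -> "Node":
        # Record the transpose instead of computing it.
        return Node("transpose", (self,))


def leaf(rows: List[List[float]]) -> Node:
    """Wrap concrete matrix data as a leaf of the DAG."""
    return Node("leaf", (), tuple(tuple(r) for r in rows))


def optimize(node: Node) -> Node:
    """Logical rewrite: transpose(transpose(X)) becomes X."""
    if node.op == "leaf":
        return node
    inputs = tuple(optimize(child) for child in node.inputs)
    if node.op == "transpose" and inputs[0].op == "transpose":
        return inputs[0].inputs[0]
    return Node(node.op, inputs)


def count_ops(node: Node) -> int:
    """Number of non-leaf operations reachable from this node."""
    own = 0 if node.op == "leaf" else 1
    return own + sum(count_ops(child) for child in node.inputs)


def evaluate(node: Node) -> List[List[float]]:
    """Execute the DAG with plain Python lists (stand-in for a real backend)."""
    if node.op == "leaf":
        return [list(row) for row in node.value]
    args = [evaluate(child) for child in node.inputs]
    if node.op == "transpose":
        return [list(col) for col in zip(*args[0])]
    if node.op == "matmul":
        left, right = args
        cols = list(zip(*right))
        return [[sum(a * b for a, b in zip(row, col)) for col in cols]
                for row in left]
    raise ValueError(f"unknown op: {node.op}")


# Usage: build the expression lazily, optimize the graph, then evaluate.
a = leaf([[1, 2], [3, 4]])
b = leaf([[0, 1], [1, 0]])
expr = a.T.T @ b              # three recorded operations
plan = optimize(expr)         # one operation after the rewrite
result = evaluate(plan)
```

Because nothing is computed until `evaluate` is called, the whole expression is visible to the optimizer, which is what allows rewrites of this kind to be applied before any work is dispatched.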