Affiliation:
1. Massachusetts Institute of Technology, USA
2. University of Pennsylvania, USA
Abstract
We present a dataflow model for parallel Unix shell pipelines. To accurately capture the semantics of complex Unix pipelines, the dataflow model is order-aware, i.e., the order in which a node in the dataflow graph consumes inputs from different edges plays a central role in the semantics of the computation and therefore in the resulting parallelization. We use this model to capture the semantics of transformations that exploit data parallelism available in Unix shell computations and prove their correctness. We additionally formalize the translations from the Unix shell to the dataflow model and from the dataflow model back to a parallel shell script. We implement our model and transformations as the compiler and optimization passes of a system that parallelizes shell pipelines, and use it to evaluate the speedup achieved on 47 pipelines.
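As a rough illustration of why input order matters, the sketch below splits an input into contiguous chunks, applies a stateless per-line command to each chunk in parallel, and merges the outputs by consuming the edges in their original order. All names (`upper`, `split`, `ordered_cat`, `parallel`) are hypothetical and chosen for this example; this is not the paper's implementation, only a minimal sketch assuming a command that processes each line independently.

```python
from concurrent.futures import ThreadPoolExecutor


def upper(lines):
    """Stateless per-line command, standing in for something like `tr a-z A-Z`."""
    return [line.upper() for line in lines]


def split(lines, n):
    """Split the input into at most n contiguous chunks, preserving order."""
    size = max(1, -(-len(lines) // n))  # ceiling division, at least 1
    return [lines[i:i + size] for i in range(0, len(lines), size)]


def ordered_cat(chunks):
    """Order-aware merge: consume edge 0 fully, then edge 1, and so on."""
    return [line for chunk in chunks for line in chunk]


def parallel(cmd, lines, n=3):
    """Run cmd on each chunk in parallel, then merge in input-edge order."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        # Executor.map returns results in the order of its inputs.
        outputs = list(pool.map(cmd, split(lines, n)))
    return ordered_cat(outputs)


data = ["delta", "alpha", "charlie", "bravo", "echo"]
sequential = upper(data)

# Consuming the edges in their original order matches the sequential run.
assert parallel(upper, data) == sequential

# Consuming the same edges in a different order changes the result.
reordered = ordered_cat(list(reversed([upper(c) for c in split(data, 3)])))
assert reordered != sequential
```

The second assertion is the point of the sketch: the parallel nodes compute the same chunk outputs in both cases, and only the order in which the merge node reads its input edges decides whether the result agrees with the sequential pipeline.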
Funder
Defense Advanced Research Projects Agency
National Science Foundation
Publisher
Association for Computing Machinery (ACM)
Subject
Safety, Risk, Reliability and Quality; Software
Cited by
2 articles.
1. Automatic synthesis of parallel unix commands and pipelines with KumQuat; Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming; 2022-03-28
2. An empirical investigation of command-line customization; Empirical Software Engineering; 2021-12-14