Affiliation:
1. School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania
Abstract
Achieving high-speed network I/O on distributed-memory systems is difficult because their architecture is in general ill-suited for communication processing. Common problems include the inability to do protocol processing, inefficient handling of data distribution, and poor management of the I/O. In this paper we present an I/O architecture that addresses these problems and supports high-speed network I/O on distributed-memory systems. The key to good performance is to partition the work appropriately between the system and the network interface. We perform some communication tasks on the distributed-memory parallel system, since it is more powerful and less likely to become a bottleneck than the network interface. Tasks that do not parallelize well are performed on the network interface, and hardware support is provided for the most time-critical operations. We emphasize simple I/O mechanisms that programming tools, which map applications onto the distributed-memory system, can use to implement efficient I/O for the class of applications they support. This architecture has been implemented for the iWarp distributed-memory system. We describe this implementation and present performance results.
Publisher
Association for Computing Machinery (ACM)
References (26 articles)
1. Design and Evaluation of primitives for Parallel I/O
2. Supporting systolic and memory communication in iWarp
3. Claudson Bornstein and Peter Steenkiste. Data Reshuffling in Support of Fast I/O for Distributed-Memory Machines. In preparation.
4. An analysis of TCP processing overhead