Affiliation:
1. University of Michigan, Ann Arbor
Abstract
Finite state machines (FSMs) are a widely used computation model across many application domains. These embarrassingly sequential applications, with their irregular memory access patterns, perform poorly on conventional von Neumann architectures. The Micron Automata Processor (AP) is an in-situ memory-based computational architecture that accelerates non-deterministic finite automata (NFA) processing in hardware. However, each FSM on the AP is processed sequentially, limiting potential speedups.
In this paper, we explore the FSM parallelization problem in the context of the AP. Extending classical parallelization techniques to NFAs executing on the AP is non-trivial because of high state-transition tracking overheads and exponential computational complexity. We present the associated challenges and propose solutions that leverage both the unique properties of NFAs (connected components, input symbol ranges, convergence, common parent states) and unique features of the AP (support for simultaneous transitions, low-overhead flow switching, state vector cache) to realize parallel NFA execution on the AP.
We evaluate our techniques on several important benchmarks, including NFAs used for network intrusion detection, malware detection, text processing, protein motif searching, DNA sequencing, and data analytics. Our proposed parallelization scheme demonstrates a significant speedup (25.5x on average) over sequential execution on the AP. Prior work has already shown that sequential execution on the AP is at least an order of magnitude better than GPUs, multi-core processors, and the Xeon Phi accelerator.
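The "classical parallelization techniques" the abstract refers to are not spelled out here. As a rough illustration only, the sketch below shows one well-known approach, enumerative parallelization: the input is split into segments, each segment is run from every candidate start state, and the per-segment results are composed in order. This is a software sketch under stated assumptions, not the paper's AP-specific scheme. The NFA encoding (a dictionary from `(state, symbol)` to a set of next states, no epsilon transitions) and all function names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical NFA encoding: (state, symbol) -> set of next states.
# Assumes no epsilon transitions.
Transitions = Dict[Tuple[int, str], Set[int]]


def step(delta: Transitions, active: Set[int], symbol: str) -> Set[int]:
    """Advance every active state by one input symbol."""
    nxt: Set[int] = set()
    for s in active:
        nxt |= delta.get((s, symbol), set())
    return nxt


def run_sequential(delta: Transitions, start: Set[int], text: str) -> Set[int]:
    """Baseline: consume the input one symbol at a time."""
    active = set(start)
    for ch in text:
        active = step(delta, active, ch)
    return active


def segment_map(delta: Transitions, states: Set[int], segment: str) -> Dict[int, Set[int]]:
    """For each candidate start state, the states reached after the segment."""
    return {q: run_sequential(delta, {q}, segment) for q in states}


def run_enumerative(delta: Transitions, states: Set[int], start: Set[int],
                    text: str, n_segments: int) -> Set[int]:
    """Split the input, enumerate start states per segment, then compose."""
    size = max(1, -(-len(text) // n_segments))  # ceiling division
    segments: List[str] = [text[i:i + size] for i in range(0, len(text), size)]
    # Each segment_map call is independent of the others, so these
    # could be computed in parallel; here they run in a plain loop.
    maps = [segment_map(delta, states, seg) for seg in segments]
    active = set(start)
    for m in maps:  # cheap sequential composition of per-segment results
        nxt: Set[int] = set()
        for q in active:
            nxt |= m[q]
        active = nxt
    return active
```

In this sketch the work per segment grows with the number of candidate start states, which gives a sense of why naive enumeration is expensive for large NFAs and why pruning the set of start states matters.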
Funder
National Science Foundation
C-FAR, one of the six SRC STAR-net Centers sponsored by MARCO and DARPA
Publisher
Association for Computing Machinery (ACM)
Cited by
7 articles.
1. MIMDRAM: An End-to-End Processing-Using-DRAM System for High-Throughput, Energy-Efficient and Programmer-Transparent Multiple-Instruction Multiple-Data Computing;2024 IEEE International Symposium on High-Performance Computer Architecture (HPCA);2024-03-02
2. Sunder: Enabling Low-Overhead and Scalable Near-Data Pattern Matching Acceleration;MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture;2021-10-17
3. Scalable FSM parallelization via path fusion and higher-order speculation;Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems;2021-04-17
4. Reliability Analysis for Unreliable FSM Computations;ACM Transactions on Architecture and Code Optimization;2020-06-25
5. Optimus Prime: Accelerating Data Transformation in Servers;Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems;2020-03-09