Parallelisation of equation-based simulation programs on heterogeneous computing systems

Author:

Nikolić Dragan D.ORCID

Abstract

Numerical solutions of equation-based simulations require computationally intensive tasks such as evaluation of model equations, linear algebra operations and solution of systems of linear equations. The focus in this work is on parallel evaluation of model equations on shared memory systems such as general purpose processors (multi-core CPUs and manycore devices), streaming processors (Graphics Processing Units and Field Programmable Gate Arrays) and heterogeneous systems. The current approaches for evaluation of model equations are reviewed and their capabilities and shortcomings analysed. Since stream computing differs from traditional computing in that the system processes a sequential stream of elements, equations must be transformed into a data structure suitable for both types. The postfix notation expression stacks are recognised as a platform and programming language independent method to describe, store in computer memory and evaluate general systems of differential and algebraic equations of any size. Each mathematical operation and its operands are described by a specially designed data structure, and every equation is transformed into an array of these structures (a Compute Stack). Compute Stacks are evaluated by a stack machine using a Last In First Out queue. The stack machine is implemented in the DAE Tools modelling software in the C99 language using two Application Programming Interface (APIs)/frameworks for parallelism. The Open Multi-Processing (OpenMP) API is used for parallelisation on general purpose processors, and the Open Computing Language (OpenCL) framework is used for parallelisation on streaming processors and heterogeneous systems. The performance of the sequential Compute Stack approach is compared to the direct C++ implementation and to the previous approach that uses evaluation trees. The new approach is 45% slower than the C++ implementation and more than five times faster than the previous one. The OpenMP and OpenCL implementations are tested on three medium-scale models using a multi-core CPU, a discrete GPU, an integrated GPU and heterogeneous computing setups. Execution times are compared and analysed and the advantages of the OpenCL implementation running on a discrete GPU and heterogeneous systems are discussed. It is found that the evaluation of model equations using the parallel OpenCL implementation running on a discrete GPU is up to twelve times faster than the sequential version while the overall simulation speed-up gained is more than three times.

Publisher

PeerJ

Subject

General Computer Science

Reference18 articles.

1. Modeling and optimization with Optimica and JModelica.org—languages and tools for solving large-scale dynamic optimization problems;Akesson;Computers & Chemical Engineering,2010

2. PETSc users manual. Technical report ANL-95/11;Balay,2015

3. deal.II—a general purpose object oriented finite element library;Bangerth;ACM Transactions on Mathematical Software,2007

4. gPROMS—a combined discrete/continuous modelling environment for chemical processing systems;Barton;Simulation Series,1993

5. Modeling of combined discrete/continuous processes;Barton;AIChE Journal,1994

Cited by 4 articles. 订阅此论文施引文献 订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3