TPCx-AI - An Industry Standard Benchmark for Artificial Intelligence and Machine Learning Systems

Published:2023-08 Issue:12 Volume:16 Page:3649-3661
ISSN:2150-8097
Container-title:Proceedings of the VLDB Endowment
language:en
Short-container-title:Proc. VLDB Endow.

Author:

Brücke Christoph¹, Härtling Philipp¹, Palacios Rodrigo D Escobar², Patel Hamesh², Rabl Tilmann³

Affiliation:

1. bankmark, Germany

2. Intel, Hillsboro, Oregon

3. Hasso Plattner Institute, University of Potsdam, bankmark, Germany

Abstract

Artificial intelligence (AI) and machine learning (ML) techniques have existed for years, but new hardware trends and advances in model training and inference have radically improved their performance. With an ever-increasing number of algorithms, systems, and hardware solutions, identifying good deployments is challenging even for experts. Researchers and industry experts have observed this challenge and have created several benchmark suites for AI and ML applications and systems. While these suites are helpful in comparing several aspects of AI applications, none of the existing benchmarks measures end-to-end performance of ML deployments. Many have been rigorously developed in collaboration between academia and industry, but no existing benchmark is standardized. In this paper, we introduce the TPC Express Benchmark for Artificial Intelligence (TPCx-AI), the first industry standard benchmark for end-to-end machine learning deployments. TPCx-AI is the first AI benchmark that represents the pipelines typically found in common ML and AI workloads. TPCx-AI provides a full software kit, which includes a data generator, a driver, and two full workload implementations, one based on Python libraries and one based on Apache Spark. We describe the complete benchmark and show benchmark results for various scale factors. TPCx-AI's core contributions are a novel unified data set covering structured and unstructured data; a fully scalable data generator that can generate realistic data from GB up to PB scale; and a diverse and representative workload using different data types and algorithms, covering a wide range of aspects of real ML workloads such as data integration, data processing, training, and inference.

Publisher

Association for Computing Machinery (ACM)

Subject

General Earth and Planetary Sciences; Water Science and Technology; Geography, Planning and Development

Link

https://dl.acm.org/doi/pdf/10.14778/3611540.3611554

Reference: 37 articles.

1. TFX

2. Cody Coleman, Daniel Kang, Deepak Narayanan, Luigi Nardi, Tian Zhao, Jian Zhang, Peter Bailis, Kunle Olukotun, Christopher Ré, and Matei Zaharia. 2018. Analysis of DAWNBench, a Time-to-Accuracy Machine Learning Performance Benchmark. CoRR abs/1806.01427 (2018).

3. Transaction Processing Performance Council. 2022. TPCx-AI. https://tpc.org/tpcx-ai/default5.asp

4. ImageNet: A large-scale hierarchical image database

5. Christopher Elford, Dippy Aggarwal, and Shreyas Shekhar. 2021. Revisiting Issues in Benchmark Metric Selection. In Performance Evaluation and Benchmarking, Raghunath Nambiar and Meikel Poess (Eds.). Springer International Publishing, Cham, 35-47.

Cited by 5 articles.

1. IMBridge: Impedance Mismatch Mitigation between Database Engine and Prediction Query Execution;Companion of the 2024 International Conference on Management of Data;2024-06-09

2. The Hopsworks Feature Store for Machine Learning;Companion of the 2024 International Conference on Management of Data;2024-06-09

3. Surprise Benchmarking: The Why, What, and How;Proceedings of the Tenth International Workshop on Testing Database Systems;2024-06-09

4. Xorbits: Automating Operator Tiling for Distributed Data Science;2024 IEEE 40th International Conference on Data Engineering (ICDE);2024-05-13

5. Optimizing Data Pipelines for Machine Learning in Feature Stores;Proceedings of the VLDB Endowment;2023-09