LLM-PBE: Assessing Data Privacy in Large Language Models

Published: 2024-07 Issue: 11 Volume: 17 Pages: 3201-3214
ISSN: 2150-8097
Container-title: Proceedings of the VLDB Endowment
Language: en
Short-container-title: Proc. VLDB Endow.

Author:

Li Qinbin¹, Hong Junyuan², Xie Chulin³, Tan Jeffrey¹, Xin Rachel¹, Hou Junyi⁴, Yin Xavier¹, Wang Zhun¹, Hendrycks Dan⁵, Wang Zhangyang², Li Bo⁶, He Bingsheng⁴, Song Dawn¹

Affiliation:

1. University of California, Berkeley

2. University of Texas at Austin

3. University of Illinois Urbana-Champaign

4. National University of Singapore

5. Center for AI Safety

6. University of Chicago

Abstract

Large Language Models (LLMs) have become integral to numerous domains, significantly advancing applications in data management, mining, and analysis. Their profound capabilities in processing and interpreting complex language data, however, bring to light pressing concerns regarding data privacy, especially the risk of unintentional training data leakage. Despite the critical nature of this issue, there has been no existing literature to offer a comprehensive assessment of data privacy risks in LLMs. Addressing this gap, our paper introduces LLM-PBE, a toolkit crafted specifically for the systematic evaluation of data privacy risks in LLMs. LLM-PBE is designed to analyze privacy across the entire lifecycle of LLMs, incorporating diverse attack and defense strategies, and handling various data types and metrics. Through detailed experimentation with multiple LLMs, LLM-PBE facilitates an in-depth exploration of data privacy concerns, shedding light on influential factors such as model size, data characteristics, and evolving temporal dimensions. This study not only enriches the understanding of privacy issues in LLMs but also serves as a vital resource for future research in the field. Aimed at enhancing the breadth of knowledge in this area, the findings, resources, and our full technical report are made available at https://llm-pbe.github.io/, providing an open platform for academic and practical advancements in LLM privacy assessment.

Publisher

Association for Computing Machinery (ACM)

Link

https://dl.acm.org/doi/pdf/10.14778/3681954.3681994

References: 91 articles.

1. 2023. https://news.ycombinator.com/item?id=34482318. Accessed: 2024-07-16.

2. 2023. Jailbreak Chat. https://www.jailbreakchat.com/. Accessed: 2024-07-16.

3. 2023. Leaked-GPTs. https://github.com/friuns2/Leaked-GPTs. Accessed: 2024-07-16.

4. 2024. Hugging Face - The AI community building the future. https://huggingface.co/. Accessed: 2024-07-16.

5. 2024. Together.ai. https://www.together.ai/. Accessed: 2024-07-16.