Zeno: An Interactive Framework for Behavioral Evaluation of Machine Learning

Author:

Cabrera, Ángel Alexander (1); Fu, Erica (2); Bertucci, Donald (3); Holstein, Kenneth (1); Talwalkar, Ameet (4); Hong, Jason I. (1); Perer, Adam (1)

Affiliation:

1. Human-Computer Interaction Institute, Carnegie Mellon University, United States

2. Carnegie Mellon University, United States

3. School of Electrical Engineering and Computer Science, Oregon State University, United States

4. Machine Learning Department, Carnegie Mellon University, United States

Funder:

National Science Foundation

Publisher:

ACM

Reference: 61 articles.

1. FactSheets: Increasing trust in AI services through supplier's declarations of conformity

2. Josh Attenberg, Panagiotis G. Ipeirotis, and Foster Provost. 2011. Beat the Machine: Challenging Workers to Find the Unknown Unknowns.

3. Big Data’s Disparate Impact;Barocas Solon;SSRN Electronic Journal,2018

4. Donald Bertucci, Md Montaser Hamid, Yashwanthi Anand, Anita Ruangrotsakun, Delyar Tabatabai, Melissa Perez, and Minsuk Kahng. 2022. DendroMap: Visual Exploration of Large-Scale Image Datasets for Machine Learning with Treemaps. IEEE Transactions on Visualization and Computer Graphics (TVCG) (2022). https://div-lab.github.io/dendromap/ Publisher: IEEE.

5. Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch, Dallas Card, Rodrigo Castellon, Niladri Chatterji, Annie Chen, Kathleen Creel, Jared Quincy Davis, Dora Demszky, Chris Donahue, Moussa Doumbouya, Esin Durmus, Stefano Ermon, John Etchemendy, Kawin Ethayarajh, Li Fei-Fei, Chelsea Finn, Trevor Gale, Lauren Gillespie, Karan Goel, Noah Goodman, Shelby Grossman, Neel Guha, Tatsunori Hashimoto, Peter Henderson, John Hewitt, Daniel E. Ho, Jenny Hong, Kyle Hsu, Jing Huang, Thomas Icard, Saahil Jain, Dan Jurafsky, Pratyusha Kalluri, Siddharth Karamcheti, Geoff Keeling, Fereshte Khani, Omar Khattab, Pang Wei Koh, Mark Krass, Ranjay Krishna, Rohith Kuditipudi, Ananya Kumar, Faisal Ladhak, Mina Lee, Tony Lee, Jure Leskovec, Isabelle Levent, Xiang Lisa Li, Xuechen Li, Tengyu Ma, Ali Malik, Christopher D. Manning, Suvir Mirchandani, Eric Mitchell, Zanele Munyikwa, Suraj Nair, Avanika Narayan, Deepak Narayanan, Ben Newman, Allen Nie, Juan Carlos Niebles, Hamed Nilforoshan, Julian Nyarko, Giray Ogut, Laurel Orr, Isabel Papadimitriou, Joon Sung Park, Chris Piech, Eva Portelance, Christopher Potts, Aditi Raghunathan, Rob Reich, Hongyu Ren, Frieda Rong, Yusuf Roohani, Camilo Ruiz, Jack Ryan, Christopher Ré, Dorsa Sadigh, Shiori Sagawa, Keshav Santhanam, Andy Shih, Krishnan Srinivasan, Alex Tamkin, Rohan Taori, Armin W. Thomas, Florian Tramèr, Rose E. Wang, William Wang, Bohan Wu, Jiajun Wu, Yuhuai Wu, Sang Michael Xie, Michihiro Yasunaga, Jiaxuan You, Matei Zaharia, Michael Zhang, Tianyi Zhang, Xikun Zhang, Yuhui Zhang, Lucia Zheng, Kaitlyn Zhou, and Percy Liang. 2022. On the Opportunities and Risks of Foundation Models. http://arxiv.org/abs/2108.07258 arXiv:2108.07258 [cs].

Cited by 13 articles.

1. Transparency in the Wild: Navigating Transparency in a Deployed AI System to Broaden Need-Finding Approaches;The 2024 ACM Conference on Fairness, Accountability, and Transparency;2024-06-03

2. Human-Centered Evaluation and Auditing of Language Models;Extended Abstracts of the CHI Conference on Human Factors in Computing Systems;2024-05-11

3. JupyterLab in Retrograde: Contextual Notifications That Highlight Fairness and Bias Issues for Data Scientists;Proceedings of the CHI Conference on Human Factors in Computing Systems;2024-05-11

4. Talaria: Interactively Optimizing Machine Learning Models for Efficient Inference;Proceedings of the CHI Conference on Human Factors in Computing Systems;2024-05-11

5. EvalLM: Interactive Evaluation of Large Language Model Prompts on User-Defined Criteria;Proceedings of the CHI Conference on Human Factors in Computing Systems;2024-05-11
