Number of people required for usability evaluation

Author:

Hwang Wonil 1, Salvendy Gavriel 2

Affiliation:

1. Soongsil University in Seoul, Korea

2. Purdue University in West Lafayette, Indiana and Tsinghua University in Beijing, P.R. China

Abstract

Introduction Usability evaluation is essential to make sure that software products newly released are easy to use, efficient, and effective to reach goals, and satisfactory to users. For example, when a software company wants to develop and sell a new product, the company needs to evaluate usability of the new product before launching it at a market to avoid the possibility that the new product may contain usability problems, which span from cosmetic problems to severe functional problems. Three widely used methods for usability evaluation are Think Aloud (TA), Heuristic Evaluation (HE) and Cognitive Walkthrough (CW). TA method is commonly employed with a lab-based user testing, while there are variants of TA methods, including thinking out aloud at user's workplace instead of at labs. What we discuss here is the TA method that is combined with a lab-based user testing, in which test users use products while simultaneously and continuously thinking out aloud, and experimenters record users' behaviors and verbal protocols in the laboratory. HE is a usability inspection method, in which a small number of evaluators find usability problems in a user interface design by examining an interface and judging its compliance with well-known usability principles, called heuristics. CW is a theory-based method, in which evaluators evaluate every step necessary to perform a scenario-based task, and look for usability problems that would interfere with learning by exploration. These three methods have their own advantages and disadvantages. For instance, TA method provides good qualitative data from a small number of test users, but laboratory environment may influence test user's behaviors. HE is a cheap, fast and easy-to-use method, while it often finds too specific and low-priority usability problems, including even not real problems. 
CW helps find mismatches between users' and designers' conceptualization of a task, but it needs extensive knowledge of cognitive psychology and technical details to apply. However, even though these advantages and disadvantages show overall characteristics of three major usability evaluation methods, we cannot compare them quantitatively and see their efficiency clearly. Because one of reasons why so-called discounted methods, such as HE and CW, were developed is to save costs of usability evaluation, cost-related criteria for comparing usability evaluation are meaningful to usability practitioners as well as usability researchers. One of the most disputable issues related to cost of usability evaluation is sample size. That is, how many users or evaluators are needed to achieve a targeted usability evaluation performance, for example, 80% of overall discovery rate? The sample size of usability evaluation is known to depend on an estimate of problem discovery rate across participants. The overall discovery rate is a common quantitative measure that is used to show the effectiveness of a specific usability evaluation method in most of usability evaluation studies. It is also called overall detection rate or thoroughness measure, which is the ratio of 'the sum of unique usability problems detected by all experiment participants' against 'the number of usability problems that exist in the evaluated systems', ranging between 0 and 1. The overall discovery rates were reported more than any other criterion measure in the usability evaluation experiments and also a key component for projecting required sample size for usability evaluation study. Thus, how many test users or evaluators participate in the usability evaluation is a critical issue, considering its cost-effectiveness.
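The relationship between the per-participant discovery rate and the required sample size described above can be sketched in code. This is a minimal illustration assuming the standard binomial problem-discovery model (each participant detects each problem independently with probability p, so n participants find a proportion 1 - (1 - p)^n of the problems); that model is common background in this literature, not a method stated in the abstract itself, and the function names and the example value p = 0.31 (a commonly cited average for think-aloud testing) are illustrative assumptions:

```python
import math


def required_sample_size(p, target=0.80):
    """Smallest n of participants with 1 - (1 - p)**n >= target.

    Assumes the binomial problem-discovery model: p is the average
    probability that a single participant detects a given problem.
    """
    if not (0 < p < 1) or not (0 < target < 1):
        raise ValueError("p and target must lie strictly between 0 and 1")
    # Solve (1 - p)**n <= 1 - target for the smallest integer n.
    return math.ceil(math.log(1 - target) / math.log(1 - p))


def overall_discovery_rate(problems_found, problems_in_system):
    """Thoroughness: unique problems detected / problems that exist.

    Ranges between 0 and 1, matching the definition in the abstract.
    """
    return len(set(problems_found)) / problems_in_system


# With p = 0.31, five test users already exceed an 80% overall
# discovery rate: 1 - 0.69**5 is about 0.84.
print(required_sample_size(0.31, 0.80))  # -> 5

# Three participants reporting problems {a, b}, {b}, {b, c} out of
# 10 known problems give an overall discovery rate of 0.3.
found = ["a", "b", "b", "b", "c"]
print(overall_discovery_rate(found, 10))  # -> 0.3
```

The sample-size function simply inverts the discovery curve, which is why a small increase in the estimated p can sharply reduce the projected number of participants.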

Publisher

Association for Computing Machinery (ACM)

Subject

General Computer Science


Cited by 252 articles.

