Affiliation:
1. Department of Epidemiology & Biostatistics, College of Public Health, University of South Florida, Tampa, FL, USA
2. Biostatistics Core at Clinical and Transitional Science Institute, University of South Florida, Tampa, FL, USA
3. Health Canada, Ottawa, ON, Canada
Abstract
Background The method used to determine choice of standard deviation (SD) is inadequately reported in clinical trials. Underestimations of the population SD may result in underpowered clinical trials. Purpose This study demonstrates how using the wrong method to determine population SD can lead to inaccurate sample sizes and underpowered studies, and offers recommendations to maximize the likelihood of achieving adequate statistical power. Methods We review the practice of reporting sample size and its effect on the power of trials published in major journals. Simulated clinical trials were used to compare the effects of different methods of determining SD on power and sample size calculations. Results Prior to 1996, sample size calculations were reported in just 1%–42% of clinical trials. This proportion increased from 38% to 54% after the initial Consolidated Standards of Reporting Trials (CONSORT) was published in 1996, and from 64% to 95% after the revised CONSORT was published in 2001. Nevertheless, underpowered clinical trials are still common. Our simulated data showed that all minimal and 25th-percentile SDs fell below 44 (the population SD), regardless of sample size (from 5 to 50). For sample sizes 5 and 50, the minimum sample SDs underestimated the population SD by 90.7% and 29.3%, respectively. If only one sample was available, there was less than 50% chance that the actual power equaled or exceeded the planned power of 80% for detecting a median effect size (Cohen’s d = 0.5) when using the sample SD to calculate the sample size. The proportions of studies with actual power of at least 80% were about 95%, 90%, 85%, and 80% when we used the larger SD, 80% upper confidence limit (UCL) of SD, 70% UCL of SD, and 60% UCL of SD to calculate the sample size, respectively. When more than one sample was available, the weighted average SD resulted in about 50% of trials being underpowered; the proportion of trials with power of 80% increased from 90% to 100% when the 75th percentile and the maximum SD from 10 samples were used. Greater sample size is needed to achieve a higher proportion of studies having actual power of 80%. Limitations This study only addressed sample size calculation for continuous outcome variables. Conclusions We recommend using the 60% UCL of SD, maximum SD, 80th-percentile SD, and 75th-percentile SD to calculate sample size when 1 or 2 samples, 3 samples, 4–5 samples, and more than 5 samples of data are available, respectively. Using the sample SD or average SD to calculate sample size should be avoided.
Subject
Pharmacology,General Medicine
Cited by
15 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献