Affiliation:
1. Department of Public Health Sciences, University of Chicago, Chicago, IL
Abstract
The log-rank test is perhaps the most commonly used nonparametric method for comparing two survival curves and yields maximum power under proportional hazards alternatives. While the assumption of proportional hazards is often reasonable, it need not hold. Several authors have therefore developed versatile tests using combinations of weighted log-rank statistics that are more sensitive to nonproportional hazards. Fleming and Harrington (1991, Counting Processes and Survival Analysis, Wiley) consider the family of G^ρ statistics and their supremum versions, while Lee (1996, Biometrics 52: 721-725) and Lee (2007, Computational Statistics and Data Analysis 51: 6557-6564) propose tests based on the more extended G^{ρ,γ} family. In this article, I consider Z_m = max(|Z_1|, |Z_2|, |Z_3|), where Z_1, Z_2, and Z_3 are z statistics obtained from the G^{0,0}, G^{1,0}, and G^{0,1} tests, respectively. G^{0,0} corresponds to the log-rank test, while G^{1,0} and G^{0,1} are more sensitive to early-difference and late-difference alternatives, respectively. I conduct a simulation study to compare the performance of Z_m with the log-rank test, the more optimally weighted test, and Lee's (2007) tests under the null hypothesis and under proportional hazards, early-difference, and late-difference alternatives. Results indicate that the method based on Z_m maintains the type I error rate, provides increased power relative to the log-rank test under early-difference and late-difference alternatives, and entails only a small to moderate power loss compared with the more optimally chosen test. I apply the procedure to two datasets reported in the literature, both of which exhibit nonproportional hazards. Versatile tests such as Z_m may be useful in clinical trial settings where there is concern that the treatment effect may not conform to the proportional hazards assumption. I also describe the syntax for a Stata command, verswlr, to implement the method.
Subject
Mathematics (miscellaneous)