BioPharmNet
This page is maintained by the Multiplicity Advisory Board (Alex Dmitrienko).
Multiple tests based on marginal p-values
Multiple tests based on marginal p-values can be thought of as "distribution-free" tests because they rely on elementary probability inequalities and thus do not depend on the joint distribution of the test statistics. These multiple testing procedures are intuitive, easy to apply and widely used in pharmaceutical applications. However, because they rely only on the marginal distributions of the individual test statistics and ignore the underlying correlation structure, they are generally less powerful than multiple tests that make full use of the joint distribution of the test statistics (for example, parametric multiple tests). The power loss is most pronounced when the test statistics are highly correlated or the number of multiple analyses is large.
Papers and books
Dmitrienko, A., Molenberghs, G., Chuang-Stein, C., Offen, W. (2005). Analysis of Clinical Trials Using SAS: A Practical Guide. SAS Press: Cary, NC.
Chapter 2 of this book gave an overview of multiple tests based on marginal p-values (Sections 2.2 and 2.3), as well as other multiple testing procedures, with case studies and numerical examples based on SAS.
Holm, S. (1979). A simple sequentially rejective multiple test procedure. Scandinavian Journal of Statistics. 6, 65-70.
A step-down testing procedure (testing begins with the most significant p-value) derived from the Bonferroni test. The Holm test is uniformly more powerful than the Bonferroni test and controls the familywise error rate in the strong sense for any distribution of the individual test statistics. The paper describes both regular and weighted versions of this test.
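The step-down logic of the unweighted Holm test can be illustrated with a minimal Python sketch. The function name `holm_reject` and the example p-values are hypothetical and chosen only for illustration; the weighted version described in the paper is not covered.

```python
def holm_reject(pvalues, alpha=0.05):
    """Unweighted Holm step-down test (illustrative sketch).

    Returns a list of booleans, one per hypothesis, in the input order.
    """
    m = len(pvalues)
    # Indices ordered from the most significant to the least significant p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    reject = [False] * m
    for step, idx in enumerate(order):
        # At step 0 the critical value is alpha/m, then alpha/(m-1), and so on.
        if pvalues[idx] <= alpha / (m - step):
            reject[idx] = True
        else:
            # Stop at the first non-significant outcome; all remaining
            # hypotheses are retained.
            break
    return reject


# Hypothetical p-values: Bonferroni (alpha/3 each) would reject only the first
# hypothesis, whereas Holm rejects all three.
print(holm_reject([0.01, 0.02, 0.04]))  # [True, True, True]
```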
Hochberg, Y. (1988). A sharper Bonferroni procedure for multiple significance testing. Biometrika. 75, 800-802.
A step-up testing procedure (testing begins with the least significant p-value) derived from the Simes test (Simes, 1986). This test is uniformly more powerful than the Bonferroni and Holm tests. It controls the familywise error rate in the strong sense when the Simes test protects the Type I error rate, for example, in the case of independent or positively dependent test statistics (see Sarkar and Chang, 1997; Sarkar, 1998).
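A minimal Python sketch of the step-up logic follows. The function name `hochberg_reject` and the example p-values are hypothetical; the sketch covers only the unweighted test and does not check the dependence assumptions under which the familywise error rate is controlled.

```python
def hochberg_reject(pvalues, alpha=0.05):
    """Unweighted Hochberg step-up test (illustrative sketch).

    Returns a list of booleans, one per hypothesis, in the input order.
    """
    m = len(pvalues)
    # Indices ordered from the least significant to the most significant p-value.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    reject = [False] * m
    for step, idx in enumerate(order):
        # The largest p-value is compared with alpha, the next with alpha/2, etc.
        if pvalues[idx] <= alpha / (step + 1):
            # The first significant outcome triggers rejection of this
            # hypothesis and all hypotheses with smaller p-values.
            for j in order[step:]:
                reject[j] = True
            break
    return reject


# Hypothetical p-values: the largest p-value is below alpha, so all three
# hypotheses are rejected (Holm would reject none of them here).
print(hochberg_reject([0.03, 0.04, 0.045]))  # [True, True, True]
```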
Hochberg, Y., Tamhane, A.C. (1987). Multiple Comparison Procedures. Wiley, New York.
A great introduction to multiple testing procedures.
Hommel, G. (1988). A stagewise rejective multiple test procedure based on a modified Bonferroni test. Biometrika. 75, 383-386.
Hommel, G. (1989). A comparison of two modified Bonferroni procedures. Biometrika. 76, 624-625.
A closed testing procedure derived from the Simes test (Simes, 1986). Unlike the Hochberg test (Hochberg, 1988), the Hommel test does not have a stepwise form. The Hommel test is uniformly more powerful than the Bonferroni, Holm and Hochberg tests and controls the familywise error rate in the strong sense when the Simes test protects the Type I error rate, that is, when the test statistics are independent or positively dependent (see Sarkar and Chang, 1997; Sarkar, 1998).
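The Hommel test is usually computed with a shortcut rather than by enumerating all intersection hypotheses. The Python sketch below follows the commonly cited form of that shortcut: find the largest j such that p(m-j+k) > k*alpha/j for all k = 1,...,j; if no such j exists, reject all hypotheses, otherwise reject those with p-values no greater than alpha/j. The function name `hommel_reject` and the example p-values are hypothetical.

```python
def hommel_reject(pvalues, alpha=0.05):
    """Hommel test via the usual shortcut (illustrative sketch).

    Returns a list of booleans, one per hypothesis, in the input order.
    """
    m = len(pvalues)
    ordered = sorted(pvalues)  # ordered[0] is the smallest p-value
    j = 0
    for i in range(1, m + 1):
        # Check whether the Simes test fails to reject for the i largest p-values.
        if all(ordered[m - i + k - 1] > k * alpha / i for k in range(1, i + 1)):
            j = i  # keep the largest i that satisfies the condition
    if j == 0:
        return [True] * m
    return [p <= alpha / j for p in pvalues]


# Hypothetical p-values: the Hochberg test rejects nothing here, while the
# Hommel test rejects the first hypothesis.
print(hommel_reject([0.02, 0.03, 0.06]))  # [True, False, False]
```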
Maurer, W., Hothorn, L., Lehmacher, W. (1995). Multiple comparisons in drug clinical trials and preclinical assays: a-priori ordered hypotheses. Biometrie in der chemisch-pharmazeutischen Industrie. Vollmar, J. (editor). Fischer Verlag: Stuttgart, Vol. 6, 3-18.
This paper described applications of the fixed-sequence test to multiple comparison problems in clinical trials. The fixed-sequence test assumes that null hypotheses of interest are a priori ordered. Testing is carried out sequentially at an unadjusted level as long as all preceding null hypotheses were rejected. Westfall and Krishen (2001) give a detailed theoretical description of the fixed-sequence testing approach.
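The sequential rule can be sketched in a few lines of Python. The function name `fixed_sequence_reject` and the example p-values are hypothetical; the p-values are assumed to be listed in the prespecified testing order.

```python
def fixed_sequence_reject(pvalues, alpha=0.05):
    """Fixed-sequence test (illustrative sketch).

    pvalues must be listed in the a priori testing order.
    Returns a list of booleans, one per hypothesis, in the same order.
    """
    reject = [False] * len(pvalues)
    for i, p in enumerate(pvalues):
        if p <= alpha:
            # Each hypothesis is tested at the full, unadjusted level alpha.
            reject[i] = True
        else:
            # Testing stops at the first non-significant outcome, so later
            # hypotheses are retained regardless of their p-values.
            break
    return reject


# Hypothetical p-values: the fourth hypothesis is not rejected despite its
# small p-value because the third one blocks the sequence.
print(fixed_sequence_reject([0.01, 0.04, 0.20, 0.001]))  # [True, True, False, False]
```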
Simes, R.J. (1986). An improved Bonferroni procedure for multiple tests of significance. Biometrika. 73, 751-754.
The Simes test was designed for testing the global null hypothesis (intersection of null hypotheses) and cannot be used to perform inferences for individual null hypotheses. The test controls the Type I error rate when the test statistics are independent or positively dependent (see Sarkar and Chang, 1997; Sarkar, 1998). Properties of the Simes test have been studied by many authors; see, for example, Somerville, Wilson, Koch and Westfall (2005).
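A minimal Python sketch of the unweighted Simes global test is given below: the global null hypothesis is rejected if the i-th smallest p-value is no greater than i*alpha/m for at least one i. The function name `simes_global_reject` and the example p-values are hypothetical.

```python
def simes_global_reject(pvalues, alpha=0.05):
    """Unweighted Simes test of the global null hypothesis (illustrative sketch).

    Returns True if the intersection of all null hypotheses is rejected.
    """
    m = len(pvalues)
    ordered = sorted(pvalues)
    # Compare the i-th smallest p-value with i * alpha / m (i = 1, ..., m).
    return any(p <= (i + 1) * alpha / m for i, p in enumerate(ordered))


# Hypothetical p-values: the Bonferroni global test (min p <= alpha/3) does not
# reject, but the Simes test does because the largest p-value is below alpha.
print(simes_global_reject([0.03, 0.04, 0.045]))  # True
```

A rejection here says only that at least one null hypothesis is false; it does not identify which one.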
Sarkar, S., Chang, C.K. (1997). Simes' method for multiple hypothesis testing with positively dependent test statistics. Journal of the American Statistical Association. 92, 1601-1608.
Sarkar, S.K. (1998). Some probability inequalities for censored MTP2 random variables: A proof of the Simes conjecture. The Annals of Statistics. 26, 494-504.
These papers studied the Type I error rate of the Simes test and showed that this global test preserves the Type I error rate under the assumption of positive dependence.
Somerville, M., Wilson, T., Koch, G., Westfall, P. (2005). Evaluation of a weighted multiple comparison procedure. Pharmaceutical Statistics. 4, 7-13.
This paper examined the Type I error rate of the Simes test for unequally weighted hypotheses.
Westfall, P.H., Krishen, A. (2001). Optimally weighted, fixed sequence, and gatekeeping multiple testing procedures. Journal of Statistical Planning and Inference. 99, 25-40.
The paper described various properties of the fixed-sequence testing approach.
Westfall, P.H., Tobias, R.D., Rom, D., Wolfinger, R.D., Hochberg, Y. (1999). Multiple Comparisons and Multiple Tests Using the SAS System. SAS Press: Cary, NC.
This book covered a broad range of topics in multiple testing, including multiple tests based on marginal p-values (Chapter 2).
Wiens, B. (2003). A fixed-sequence Bonferroni procedure for testing multiple endpoints. Pharmaceutical Statistics. 2, 211-215.
Wiens, B., Dmitrienko, A. (2005). The fallback procedure for evaluating a single family of hypotheses. Journal of Biopharmaceutical Statistics. 15, 929-942.
These papers introduced the fallback test, a multiple test that extends the fixed-sequence testing approach (Maurer, Hothorn and Lehmacher, 1995; Westfall and Krishen, 2001). Unlike the fixed-sequence test, the fallback test can continue testing even if it encounters a non-significant outcome. This test controls the familywise error rate in the strong sense for any distribution of the individual test statistics.
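The fallback rule can be sketched in Python as follows, assuming the usual formulation in which hypothesis i receives a prespecified weight w_i (the weights sum to 1) and is tested at level w_i*alpha plus the level carried over from hypothesis i-1 if that hypothesis was rejected. The function name `fallback_reject`, the weights and the p-values are hypothetical.

```python
def fallback_reject(pvalues, weights, alpha=0.05):
    """Fallback test (illustrative sketch).

    pvalues and weights must be listed in the a priori testing order;
    weights are assumed to be nonnegative and to sum to 1.
    Returns a list of booleans, one per hypothesis, in the same order.
    """
    reject = []
    carried = 0.0  # significance level passed on from the preceding rejection
    for p, w in zip(pvalues, weights):
        level = w * alpha + carried
        if p <= level:
            reject.append(True)
            carried = level  # the whole level is passed to the next hypothesis
        else:
            reject.append(False)
            carried = 0.0  # nothing is carried over, but testing continues
    return reject


# Hypothetical example: the first hypothesis is not rejected, yet testing
# continues and the second and third hypotheses are rejected.
print(fallback_reject([0.04, 0.01, 0.02], [0.5, 0.25, 0.25]))  # [False, True, True]
```

With all of the weight placed on the first hypothesis, this sketch reduces to the fixed-sequence test.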