This page is maintained by the Multiplicity
Advisory Board (Alex Dmitrienko).
Multiple tests based on marginal p-values
Multiple tests based on marginal p-values
can be thought of as "distribution-free" tests because they rely
on elementary probability inequalities and thus do not depend
on the joint distribution of test statistics. These multiple testing
procedures are intuitive, easy to apply and widely used in pharmaceutical
applications. Because these tests rely only on the marginal
distributions of the individual test statistics and ignore the underlying
correlation structure, multiple tests that make full use of the
joint distribution of the test statistics (for example, parametric
multiple tests) are generally more powerful. The power loss of tests based
on marginal p-values is most pronounced when the test statistics
are highly correlated or the number of analyses is large.
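As a small illustration of a test built on an elementary probability inequality, the sketch below applies the Bonferroni test, which is named in the annotations on this page. The function name and the example p-values are hypothetical, chosen only for illustration.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Reject H_i when p_i <= alpha / m.

    By the Bonferroni inequality, the chance of at least one false
    rejection is at most alpha whatever the joint distribution is.
    """
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


# Hypothetical p-values for three analyses.
print(bonferroni_reject([0.010, 0.020, 0.040]))  # → [True, False, False]
```

Only the marginal p-values enter the decision rule, so any correlation among the test statistics is ignored.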
Papers and books
Dmitrienko, A., Molenberghs, G., Chuang-Stein,
C., Offen, W. (2005).
Analysis of Clinical Trials Using SAS: A Practical Guide.
SAS Press: Cary, NC.
Chapter 2 of this book gave an overview
of multiple tests based on marginal p-values (Sections 2.2 and
2.3), as well as other multiple testing procedures, with case
studies and numerical examples based on SAS.
Holm, S. (1979). A simple sequentially rejective
multiple test procedure. Scandinavian Journal of Statistics. 6,
65-70.
A step-down testing procedure (testing begins
with the most significant p-value) derived from the Bonferroni
test. The Holm test is uniformly more powerful than the Bonferroni
test and controls the familywise error rate in the strong sense
for any distribution of the individual test statistics. The paper
describes both regular and weighted versions of this test.
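The step-down logic of the regular (unweighted) Holm test can be sketched as follows. This is a minimal illustration, not code from the paper; the function name and the example p-values are hypothetical.

```python
def holm_reject(p_values, alpha=0.05):
    """Unweighted Holm step-down test.

    Hypotheses are examined from the smallest p-value upward. The k-th
    smallest p-value (k = 1, ..., m) is compared with alpha / (m - k + 1),
    and testing stops at the first non-significant outcome.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for step, i in enumerate(order):  # step = k - 1
        if p_values[i] <= alpha / (m - step):
            reject[i] = True
        else:
            break  # all remaining hypotheses are retained
    return reject


# Hypothetical p-values; Bonferroni (threshold 0.05 / 3) would reject only the first.
print(holm_reject([0.010, 0.020, 0.040]))  # → [True, True, True]
```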
Hochberg, Y. (1988). A sharper Bonferroni
procedure for multiple significance testing. Biometrika. 75, 800-802.
A step-up testing procedure (testing begins
with the least significant p-value) derived from the Simes test
(Simes, 1986). This test is uniformly more powerful than the Bonferroni
and Holm tests. It controls the familywise error rate in the strong
sense when the Simes test protects the Type I error rate, for
example, in the case of independent or positively dependent test
statistics (see Sarkar and Chang, 1997; Sarkar, 1998).
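A minimal sketch of the step-up logic, using the same critical values as the Holm test but starting from the largest p-value. The function name and example p-values are hypothetical.

```python
def hochberg_reject(p_values, alpha=0.05):
    """Hochberg step-up test.

    Starting from the largest p-value, find the largest rank k with
    p_(k) <= alpha / (m - k + 1); reject that hypothesis and every
    hypothesis with a smaller p-value.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # ascending
    reject = [False] * m
    for rank in range(m, 0, -1):  # rank m is the least significant
        i = order[rank - 1]
        if p_values[i] <= alpha / (m - rank + 1):
            for j in order[:rank]:
                reject[j] = True
            break
    return reject


# Hypothetical p-values; the Holm test would reject none of these,
# because the smallest p-value exceeds 0.05 / 3.
print(hochberg_reject([0.030, 0.040, 0.045]))  # → [True, True, True]
```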
Hochberg, Y., Tamhane, A.C. (1987). Multiple
Comparison Procedures. Wiley, New York.
A great introduction to multiple testing
procedures.
Hommel, G. (1988). A stagewise rejective
multiple test procedure based on a modified Bonferroni test. Biometrika.
75, 383-386.
Hommel, G. (1989). A comparison of two modified Bonferroni procedures.
Biometrika. 76, 624-625.
A closed testing procedure derived from
the Simes test (Simes, 1986). Unlike the Hochberg test (Hochberg,
1988), the Hommel test does not have a stepwise form. The Hommel
test is uniformly more powerful than the Bonferroni, Holm and
Hochberg tests and controls the familywise error rate in the strong
sense when the Simes test protects the Type I error rate, that
is, when the test statistics are independent or positively dependent
(see Sarkar and Chang, 1997; Sarkar, 1998).
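The closed testing construction can be illustrated by brute force: a hypothesis is rejected only if the Simes test rejects every intersection hypothesis that contains it. The sketch below enumerates all subsets, so it is practical only for a small number of hypotheses. It is not the algorithm given in the papers, and the function names and example p-values are hypothetical.

```python
from itertools import combinations


def simes_rejects(p_subset, alpha):
    """Simes test of the intersection of the hypotheses in p_subset."""
    k = len(p_subset)
    ordered = sorted(p_subset)
    return any(p <= j * alpha / k for j, p in enumerate(ordered, start=1))


def hommel_reject(p_values, alpha=0.05):
    """Closed testing with Simes local tests (exponential in m)."""
    m = len(p_values)
    reject = [True] * m
    for size in range(1, m + 1):
        for subset in combinations(range(m), size):
            if not simes_rejects([p_values[i] for i in subset], alpha):
                # A retained intersection blocks every hypothesis in it.
                for i in subset:
                    reject[i] = False
    return reject


# Hypothetical p-values; the Hochberg test would reject none of these.
print(hommel_reject([0.02, 0.03, 0.06]))  # → [True, False, False]
```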
Maurer, W., Hothorn, L., Lehmacher, W. (1995).
Multiple comparisons in drug clinical trials and preclinical assays:
a-priori ordered hypotheses. Biometrie in der chemisch-pharmazeutischen
Industrie. Vollmar, J. (editor). Fischer Verlag: Stuttgart, Vol.
6, 3-18.
This paper described applications of the
fixed-sequence test to multiple comparison problems in clinical
trials. The fixed-sequence test assumes that null hypotheses of
interest are a priori ordered. Testing is carried out sequentially
at an unadjusted level as long as all preceding null hypotheses
were rejected. Westfall and Krishen (2001) give a detailed theoretical
description of the fixed-sequence testing approach.
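The sequential rule described above can be sketched in a few lines. The function name and the example p-values are hypothetical; the p-values are assumed to be supplied in the prespecified testing order.

```python
def fixed_sequence_reject(p_values, alpha=0.05):
    """Fixed-sequence test for a priori ordered hypotheses.

    Each hypothesis is tested at the full level alpha, but only while
    every preceding hypothesis has been rejected.
    """
    reject = []
    stopped = False
    for p in p_values:
        if not stopped and p <= alpha:
            reject.append(True)
        else:
            stopped = True  # no later hypothesis can be rejected
            reject.append(False)
    return reject


# Hypothetical p-values; the fourth hypothesis is retained despite its
# small p-value because testing stopped at the third.
print(fixed_sequence_reject([0.01, 0.04, 0.20, 0.001]))  # → [True, True, False, False]
```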
Simes, R.J. (1986). An improved Bonferroni
procedure for multiple tests of significance. Biometrika. 73,
751-754.
The Simes test was designed for testing the global
null hypothesis (intersection of null hypotheses) and cannot be
used to perform inferences for individual null hypotheses. The
test controls the Type I error rate when the test statistics are
independent or positively dependent (see Sarkar and Chang, 1997;
Sarkar, 1998). Properties of the Simes test have been studied
by many authors; see, for example, Somerville, Wilson, Koch and
Westfall (2005).
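A minimal sketch of the Simes global test with equally weighted hypotheses. The function name and the example p-values are hypothetical.

```python
def simes_global_test(p_values, alpha=0.05):
    """Simes test of the global null hypothesis.

    With ordered p-values p_(1) <= ... <= p_(m), the global null is
    rejected if p_(j) <= j * alpha / m for at least one j.
    """
    m = len(p_values)
    ordered = sorted(p_values)
    return any(p <= j * alpha / m for j, p in enumerate(ordered, start=1))


# Hypothetical p-values; a Bonferroni global test would not reject here,
# because the smallest p-value exceeds 0.05 / 3.
print(simes_global_test([0.02, 0.03, 0.06]))  # → True
```

The result is a single decision about the intersection hypothesis and says nothing about which individual null hypotheses are false.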
Sarkar, S., Chang, C.K. (1997). Simes' method
for multiple hypothesis testing with positively dependent test
statistics. Journal of the American Statistical Association. 92,
1601-1608.
Sarkar, S.K. (1998). Some probability inequalities for censored
MTP2 random variables: A proof of the Simes conjecture. The Annals
of Statistics. 26, 494-504.
These papers studied the Type I error rate
of the Simes test and showed that this global test preserves the
Type I error rate under the assumption of positive dependence.
Somerville, M., Wilson, T., Koch, G., Westfall,
P. (2005). Evaluation of a weighted multiple comparison procedure.
Pharmaceutical Statistics. 4, 7-13.
This paper examined the Type I error rate
of the Simes test for unequally weighted hypotheses.
Westfall, P.H., Krishen, A. (2001). Optimally
weighted, fixed sequence, and gatekeeping multiple testing procedures.
Journal of Statistical Planning and Inference. 99, 25-40.
The paper described various properties of
the fixed-sequence testing approach.
Westfall, P.H., Tobias, R.D., Rom, D., Wolfinger,
R.D., Hochberg, Y. (1999). Multiple Comparisons and Multiple Tests
Using the SAS System. SAS Press: Cary, NC.
This book covered a broad range of topics
in multiple testing, including multiple tests based on marginal
p-values (Chapter 2).
Wiens, B. (2003). A fixed-sequence Bonferroni
procedure for testing multiple endpoints. Pharmaceutical Statistics.
2, 211-215.
Wiens, B., Dmitrienko, A. (2005). The fallback procedure for evaluating
a single family of hypotheses. Journal of Biopharmaceutical Statistics.
15, 929-942.
These papers introduced the fallback test,
a multiple test that extends the fixed-sequence testing approach
(Maurer, Hothorn and Lehmacher, 1995; Westfall and Krishen, 2001).
Unlike the fixed-sequence test, the fallback test can continue
testing even if it encounters a non-significant outcome. This test
controls the familywise error rate in the strong sense for any
distribution of the individual test statistics.
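The fallback rule can be sketched as follows, assuming the usual formulation in which each hypothesis in the prespecified sequence receives a weight and the weights sum to one. The weights are not described on this page, so they are an assumption here, and the function name and example values are hypothetical.

```python
def fallback_reject(p_values, weights, alpha=0.05):
    """Fallback test for a priori ordered hypotheses.

    Hypothesis i is tested at alpha * w_i plus the level carried over
    from hypothesis i - 1 if that hypothesis was rejected. A
    non-significant outcome resets the carried level to zero, but
    testing continues with the next hypothesis.
    Assumes the weights are non-negative and sum to one.
    """
    reject = []
    carried = 0.0
    for p, w in zip(p_values, weights):
        level = alpha * w + carried
        if p <= level:
            reject.append(True)
            carried = level  # pass the full level on to the next test
        else:
            reject.append(False)
            carried = 0.0
    return reject


# Hypothetical p-values and weights; the fixed-sequence test would stop
# at the first hypothesis and reject nothing.
print(fallback_reject([0.030, 0.010, 0.020], [0.5, 0.25, 0.25]))  # → [False, True, True]
```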