Example 8: Power of Nonstandard Significance Tests in the Analysis of Variance

Traditionally, major hypothesis tests in the analysis of variance have been performed to assess whether main effects, interactions, or simple main effects exist at all. The traditional null hypothesis F-test is equivalent to a test that the RMSSE is equal to zero.

Many writers (see, for example, several of the chapters in Harlow, Mulaik, & Steiger, 1997) have expressed dissatisfaction with tests of the "nil hypothesis," i.e., tests that the effects are absolutely zero. One solution to this problem is to test hypotheses of "small effect" rather than hypotheses of zero effect. As examples of this strategy, consider the test of close fit and test of not-close fit proposed in structural modeling by MacCallum, Browne, and Sugawara (1996), or the tests of minimal effect discussed by Murphy and Myors (1998) in their recent, very accessible monograph on power analysis.

Testing hypotheses about close fit or minimal fit complements the interval estimation approach advocated by Steiger and Fouladi (1997), who suggested computing, and examining, a confidence interval on standardized effect size. The confidence interval approach allows one to test any hypothesis about effect size: simply examine whether the confidence interval excludes a given value. However, the width of the confidence interval also conveys information about the precision with which the data determine the size of the effects. Hence, noncentrality-based confidence interval estimates of effect size offer all the benefits of nonstandard hypothesis tests, and more. For an extended discussion of this point, with numerous examples, see Steiger & Fouladi (1997).

Power and sample size analysis in conjunction with hypotheses of minimal effect offers some important advantages when used in combination with the noncentrality interval estimation approach, because it assures, in advance, that precision of estimation will be sufficient to make the confidence interval usefully narrow.

In this exercise, we sketch our approach to tests of minimal effect in the analysis of variance, compare it to the approach advocated by Murphy and Myors (1998), and demonstrate how the calculations can be duplicated easily with the Noncentral F Probability Calculator.

Relation between Measures of Effect in ANOVA

There are several closely related measures of effect size that are employed in the context of fixed-effect, factorial ANOVA designs. For notational convenience, define ∑_effect as the sum of squared effects in an ANOVA. For example, in a 1-Way ANOVA,

$$\Sigma_{\alpha} = \sum_{j=1}^{J} \alpha_j^2 \tag{7}$$

Define σ_effect², the effect variance, as

$$\sigma_{\text{effect}}^2 = \frac{\Sigma_{\text{effect}}}{\text{cells}_{\text{effect}}} \tag{8}$$

where cells_effect is the number of cells involved in the effect. In a main effect, it is the number of levels in that factor. In an interaction, it is the product of the number of levels in the factors involved in the interaction.

The "signal to noise ratio," f², is defined as

$$f_{\text{effect}}^2 = \frac{\sigma_{\text{effect}}^2}{\sigma_e^2} \tag{9}$$

where σ_e² is the error variance. The "proportion of variance accounted for by the effect, with other main effects and interactions partialled out," ω², is given by

$$\omega_{\text{effect(partialled)}}^2 = \frac{\sigma_{\text{effect}}^2}{\sigma_{\text{effect}}^2 + \sigma_e^2} \tag{10}$$

(For simplicity of notation, we will refer to this as ω² in what follows.) Consequently, f² and ω² share the very simple relationships

$$f^2 = \frac{\omega^2}{1 - \omega^2} \tag{11}$$

and

$$\omega^2 = \frac{f^2}{1 + f^2} \tag{12}$$

The RMSSE is defined as

$$\text{RMSSE}_{\text{effect}} = \sqrt{\frac{\Sigma_{\text{effect}}}{df_{\text{effect}}\,\sigma_e^2}} \tag{13}$$

Hence

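$$\text{RMSSE}^2 = \left(\frac{\text{cells}_{\text{effect}}}{df_{\text{effect}}}\right) f^2 \tag{14}$$

$$df_{\text{effect}}\,\text{RMSSE}^2 = \text{cells}_{\text{effect}}\, f^2 \tag{15}$$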

However, it is also the case that

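$$\begin{aligned}
\delta_{\text{effect}} &= n_{\text{effect}}\, df_{\text{effect}}\, \text{RMSSE}_{\text{effect}}^2 \\
&= n_{\text{effect}}\, \text{cells}_{\text{effect}}\, f_{\text{effect}}^2 \\
&= n_{\text{effect}}\, \text{cells}_{\text{effect}} \left(\frac{\omega^2}{1 - \omega^2}\right)
\end{aligned} \tag{16}$$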

Since RMSSE has a monotonic, functional relationship with the noncentrality parameter of the distribution of the F-statistic, so must f² and ω², because each of these quantities can be transformed monotonically into any of the others.

These results imply that hypothesis tests on quantities like ω² can be re-expressed as hypotheses about the noncentrality parameter δ, and vice versa.
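The conversions in Equations 11 through 16 can be sketched in a few lines of Python. This is only an illustration and not part of the program; the function names are hypothetical, and the arguments n, cells, and df stand for n_effect, cells_effect, and df_effect.

```python
import math

def f2_from_omega2(omega2):
    """Equation 11: f^2 = omega^2 / (1 - omega^2)."""
    return omega2 / (1.0 - omega2)

def omega2_from_f2(f2):
    """Equation 12: omega^2 = f^2 / (1 + f^2)."""
    return f2 / (1.0 + f2)

def rmsse_from_f2(f2, cells, df):
    """Equation 14: RMSSE^2 = (cells / df) * f^2; returns RMSSE."""
    return math.sqrt(cells / df * f2)

def delta_from_omega2(omega2, n, cells):
    """Equation 16: delta = n * cells * omega^2 / (1 - omega^2)."""
    return n * cells * f2_from_omega2(omega2)

def omega2_from_delta(delta, n, cells):
    """Invert Equation 16 to get f^2, then apply Equation 12."""
    return omega2_from_f2(delta / (n * cells))

# Hypothetical check: four groups of 25 and omega^2 = .01
delta = delta_from_omega2(0.01, n=25, cells=4)
print(round(delta, 4))                                   # 1.0101
print(round(omega2_from_delta(delta, n=25, cells=4), 4)) # 0.01
```

Because each function is monotonic in its first argument, an inequality stated for ω² carries over directly to f², RMSSE, and δ.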

Suppose, for example, we want to test the hypothesis that ω², the proportion of variance accounted for by the treatment effect, is less than or equal to .01, in a 1-way ANOVA with four groups, and a sample size of N = 25 in each group. We now address three questions concerning such a situation.

How would one perform such a test as a hypothesis test of the noncentrality parameter δ?
What would be the power of such a hypothesis test, if the actual value of ω² is 0.10?
Suppose we observe a value of 5.65 for the F-statistic in this analysis. What is the 90% confidence interval for ω²?

To answer the first question, recall that, in Example 6, we learned how to test a hypothesis about the noncentrality parameter δ. (The reader may wish to review this example briefly.) Equation 16 expresses the relation between δ and ω².

In a 1-Way ANOVA with four groups and N = 25 in each group, the degrees of freedom are 3 and 96. Here n_effect = 25 and cells_effect = 4, and so

$$\delta_{\text{null}} = 4(25)\left(\frac{.01}{1 - .01}\right) = 1.0101 \tag{17}$$

Hence, the hypothesis that ω² ≤ 0.01 is equivalent to the hypothesis that δ ≤ 1.0101. Example 6 gives a detailed discussion of how to test this hypothesis.

To answer the second question, we convert an ω² value of 0.10 into an equivalent value of δ. We have

$$\delta_{\text{alternative}} = 4(25)\left(\frac{.10}{1 - .10}\right) = \frac{10}{.9} = 11.1111 \tag{18}$$

To compute the power, we use the Noncentral F Probability Calculator. Select Power Analysis from the Statistics menu to display the Power Analysis and Interval Estimation Startup Panel. From the Startup Panel, select Probability Distributions and Noncentral F Distribution.

Now, click the OK button to display the Noncentral F Probability Calculator.

Next, compute the critical value of F for testing the hypothesis that ω² ≤ 0.01. Enter 3 in the Numerator df box, 96 in the Denom. df box, and 1.0101 in the Delta box. Next, select the (1 - Cumulative p) check box and make sure the 1 - Cum. p value is .05. Choose F as the quantity to compute by clicking on the F option button under Compute. Finally, click the Compute button. The Observed F is the critical value of F needed to test the hypothesis that ω² ≤ 0.01.

The critical value of F is 3.5352. To compute the power of the test against the alternative that ω² = 0.10, we compute the power of the F-test when δ = 11.1111. Simply leave the Observed F value in place, and change Delta to 11.1111. Select 1 - p as the quantity to Compute, and then click the Compute button. This will compute the probability of obtaining an F-statistic greater than the Observed F when δ = 11.1111, which is the power of the test when ω² = 0.10.

We see that the power is only .649. Hence, it seems that at this sample size, the design lacks sufficient precision to discriminate between minimal and medium size effects.
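For readers without access to the calculator, the same two steps can be approximated outside the program. The sketch below assumes Python with SciPy installed and uses `scipy.stats.ncf`, SciPy's noncentral F distribution; the variable names are illustrative. It should land close to the critical value and power reported above, though small numerical differences between implementations are possible.

```python
from scipy.stats import ncf

dfn, dfd = 3, 96      # numerator and denominator degrees of freedom
n, cells = 25, 4      # n_effect and cells_effect
alpha = 0.05

# Equation 16: convert omega^2 values into noncentrality parameters
delta_null = n * cells * 0.01 / (1 - 0.01)   # omega^2 = .01
delta_alt = n * cells * 0.10 / (1 - 0.10)    # omega^2 = .10

# Critical value: the F with upper-tail area alpha when delta = delta_null
f_crit = ncf.ppf(1 - alpha, dfn, dfd, delta_null)

# Power: probability of exceeding f_crit when delta = delta_alt
power = ncf.sf(f_crit, dfn, dfd, delta_alt)

print(f"critical F = {f_crit:.4f}, power = {power:.3f}")
```

Raising n and rerunning the sketch (with dfd updated to cells × (n − 1)) shows how quickly power improves with sample size.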

To answer the final question, we first utilize the method of Example 7 to construct a 90% confidence interval for δ, then use the results of Equation 16 to convert this confidence interval into a confidence interval for ω².

Enter 5.65 in the Observed F box, 0 in the Delta box, and then clear the (1 - Cumulative p) check box. Next, click the Compute button to compute the cumulative probability of the observed F.

The cumulative probability is above .95, so we know that the lower limit of the confidence interval will be above zero. To compute the lower limit, we solve for a value of the noncentrality parameter that will give the observed F a cumulative probability of .95. Enter .95 as Cum. p, and select Delta under Compute. Clicking the Compute button yields 4.157486. To calculate the upper confidence limit, set Cum. p to .05, and repeat the process, obtaining an upper limit of 31.54681.
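The search for the two limits can also be sketched numerically. The code below assumes Python with SciPy; `solve_delta` is a hypothetical helper that uses `scipy.optimize.brentq` to find the δ at which the observed F has a given cumulative probability. The search bracket of 0.01 to 100 is an assumption that suits this example and may need widening for other data.

```python
from scipy.optimize import brentq
from scipy.stats import f as central_f, ncf

dfn, dfd = 3, 96
f_obs = 5.65

# Cumulative probability of the observed F when Delta = 0 (central F)
cum_p0 = central_f.cdf(f_obs, dfn, dfd)

def solve_delta(target_cum_p, lo=0.01, hi=100.0):
    """Find delta such that P(F <= f_obs | delta) equals target_cum_p."""
    return brentq(lambda d: ncf.cdf(f_obs, dfn, dfd, d) - target_cum_p, lo, hi)

# A limit is set to zero when no positive delta can reach the target probability
delta_lower = solve_delta(0.95) if cum_p0 > 0.95 else 0.0
delta_upper = solve_delta(0.05) if cum_p0 > 0.05 else 0.0

print(f"90% CI for delta: {delta_lower:.4f} to {delta_upper:.4f}")
```

The cumulative probability decreases as δ grows, so each target probability corresponds to exactly one δ within the bracket.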

These confidence limits may be converted readily into confidence limits for ω², by combining the results in Equations 16 and 12. Specifically, Equation 16 expresses f ² as a function of δ. Equation 12 expresses ω² as a function of f ². So first, we obtain a confidence interval for f ² from the endpoints of the confidence interval for δ. Specifically, using Equation 16, we have

$$f_{\text{effect}}^2 = \frac{\delta_{\text{effect}}}{n_{\text{effect}}\,\text{cells}_{\text{effect}}} \tag{19}$$

In this case, n_effect = 25 and cells_effect = 4, so to convert the confidence interval for δ to one for f ², we simply divide the endpoints by 100, yielding a 90% confidence interval from .04157486 to .3154681.

Next, we convert these endpoints using Equation 12. For the lower endpoint, we have

$$\frac{.04157486}{1 + .04157486} = 0.0399$$

For the upper endpoint, we have

$$\frac{.3154681}{1 + .3154681} = 0.2398$$

Note how, although the observed F has a probability level of .0013, and would be termed "highly significant" by some, the percentage of variance accounted for has not been determined with a high degree of precision. The 90% confidence interval for ω² ranges from about 4% to about 24%.
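The endpoint conversion is simple arithmetic, and a short Python sketch makes it easy to repeat for other designs. The function name is hypothetical; the δ limits are the values reported above.

```python
n, cells = 25, 4                      # n_effect and cells_effect
delta_limits = (4.157486, 31.54681)   # 90% limits for delta reported in the text

def omega2_from_delta(delta, n, cells):
    """Convert a noncentrality value into omega^2."""
    f2 = delta / (n * cells)   # Equation 19
    return f2 / (1.0 + f2)     # Equation 12

omega2_limits = [omega2_from_delta(d, n, cells) for d in delta_limits]
print([round(w, 4) for w in omega2_limits])   # [0.0399, 0.2398]
```

Because the conversion is monotonic, transforming the endpoints preserves the 90% coverage of the interval.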
