Nonparametric Statistics Notes - Mann-Whitney U Test
The Mann-Whitney U test is a nonparametric alternative to the t-test for independent samples. STATISTICA expects the data to be arranged in the same way as for the t-test for independent samples. Specifically, the data file should contain a coding variable (independent variable) with at least two distinct codes that uniquely identify the group membership of each case in the data file. To run the test, select Comparing two independent samples (groups) from the Nonparametric Statistics Startup Panel - Quick tab to display the Comparing Two Groups dialog box. In that dialog box, select the coding variable, a dependent variable list (the variables on which the two groups are to be compared), and the codes that identify the two groups in the coding variable (option Codes).
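The data layout described above can be illustrated outside STATISTICA with a minimal Python sketch. The case values, the group codes ("MALE", "FEMALE", "OTHER"), and the helper name split_by_codes are all hypothetical and serve only to show how a coding variable plus two selected codes determine the two samples.

```python
# Hypothetical data file: one row per case, (coding variable, dependent variable).
cases = [
    ("MALE", 12.0),
    ("FEMALE", 15.5),
    ("MALE", 9.0),
    ("FEMALE", 14.0),
    ("OTHER", 11.0),  # a third code; skipped because it is not one of the two selected codes
    ("MALE", 13.5),
]


def split_by_codes(rows, code_a, code_b):
    """Return the dependent values of the cases carrying each selected code."""
    group_a = [value for code, value in rows if code == code_a]
    group_b = [value for code, value in rows if code == code_b]
    return group_a, group_b


group_1, group_2 = split_by_codes(cases, "MALE", "FEMALE")
print(group_1)
print(group_2)
```

The coding variable may hold more than two codes; only cases that carry one of the two selected codes enter the comparison.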
- Assumptions and interpretation
- The Mann-Whitney U test assumes that the variable under consideration was measured on at least an ordinal (rank order) scale. The interpretation of the test is essentially identical to the interpretation of the result of a t-test for independent samples, except that the U test is computed based on rank sums rather than means. The U test is the most powerful (or sensitive) nonparametric alternative to the t-test for independent samples; in fact, in some instances it may offer even greater power to reject the null hypothesis than the t-test. With samples larger than 20, the sampling distribution of the U statistic rapidly approaches the normal distribution (see Siegel, 1956). Hence, the U statistic (adjusted for ties) will be accompanied by a z value (a standard normal variate) and the respective p-value.
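The rank-sum computation and the normal approximation can be sketched as follows. This is a textbook formulation, not STATISTICA's code. The function names are hypothetical, the smaller of the two U values is reported by convention, and no continuity correction is applied. Whether STATISTICA makes the same choices is not stated here.

```python
import math


def average_ranks(values):
    """Rank values from 1 to N; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(sample_1, sample_2):
    """Return (U, z, two-sided p) using the tie-adjusted normal approximation."""
    n1, n2 = len(sample_1), len(sample_2)
    pooled = list(sample_1) + list(sample_2)
    ranks = average_ranks(pooled)

    rank_sum_1 = sum(ranks[:n1])
    u1 = rank_sum_1 - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    u = min(u1, u2)  # assumption: report the smaller U

    # Tie adjustment: each group of t tied values reduces the variance of U.
    n = n1 + n2
    tie_counts = {}
    for value in pooled:
        tie_counts[value] = tie_counts.get(value, 0) + 1
    tie_term = sum(t**3 - t for t in tie_counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))

    # Note: variance is zero (and z undefined) if every pooled value is identical.
    z = (u - n1 * n2 / 2) / math.sqrt(variance)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return u, z, p_two_sided


u, z, p = mann_whitney_u([1, 2, 3, 4], [5, 6, 7, 8])
print(u, z, p)
```

With samples this small the normal approximation is only illustrative; the z value becomes a reasonable guide as the samples grow.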
- Exact probabilities for small samples
- For small to moderate-sized samples, STATISTICA computes an exact probability associated with the respective U statistic. This probability is based on the enumeration of all possible values of U (unadjusted for ties), given the number of observations in the two samples (see Dinneen & Blakesley, 1973). Specifically, for small to moderate-sized samples, the program will report (in the last column of the spreadsheet) the value 2 * p, where p is 1 minus the cumulative (one-sided) probability of the respective U statistic. To reiterate, the computations for this probability value are based on the assumption of no ties in the data (ranks). Note that this limitation usually leads to only a small underestimation of the statistical significance of the respective effects (see Siegel, 1956).
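The enumeration idea can be sketched with the classical counting recurrence for U under the no-ties assumption. This is not the Dinneen & Blakesley algorithm itself, and the function names are hypothetical. The sketch doubles the lower-tail probability of the smaller U. By the symmetry of the null distribution, that equals 1 minus the cumulative probability just below the larger U. The exact tail convention STATISTICA uses is an assumption here.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_arrangements(n1, n2, u):
    """Number of orderings of n1 + n2 untied values that give U == u."""
    if u < 0:
        return 0
    if n1 == 0 or n2 == 0:
        return 1 if u == 0 else 0
    # The largest value either belongs to sample 1 (adding n2 to U)
    # or to sample 2 (adding nothing).
    return (count_arrangements(n1 - 1, n2, u - n2)
            + count_arrangements(n1, n2 - 1, u))


def exact_two_sided_p(u, n1, n2):
    """Return 2 * P(U <= smaller U) under the null hypothesis, capped at 1."""
    u_small = int(min(u, n1 * n2 - u))
    total = comb(n1 + n2, n1)  # all equally likely orderings
    lower_tail = sum(count_arrangements(n1, n2, k) for k in range(u_small + 1))
    return min(1.0, 2 * lower_tail / total)


# Two groups of 4 with complete separation (U = 0): 2 of 70 orderings.
print(exact_two_sided_p(0, 4, 4))
```

Because the recurrence counts orderings of distinct values only, it ignores ties, in line with the limitation noted above.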
See Comparing Two Groups - Quick tab for further details.