Kruskal-Wallis Test
The Kruskal-Wallis test generalizes the Mann-Whitney U-test to more than two groups. It tests the null hypothesis H0 that the data from k populations do not differ.
Requirements:
The data must be at least ordinally (rank-order) scaled. The test is distribution-free.
Idea:
The test works like the Mann-Whitney U-test. The data from all groups are pooled and brought into a single rank order. For each group i, the rank sum T_{i} and the mean rank are then computed. The rank sums of all groups add up to:
$$\sum_{i=1}^{k} T_{i} = \frac{N(N+1)}{2}$$
with
k = number of groups
N = total number of measurements
The test value H is computed as follows:
$$H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{T_{i}^{2}}{n_{i}} - 3(N+1)$$
where
n_{i} = sample size of group i
H is approximately chi-square distributed with k-1 degrees of freedom.
If there are tied ranks, H is corrected as follows:
$$C = 1 - \frac{\sum_{i=1}^{p} (t_{i}^{3} - t_{i})}{N^{3} - N}$$
where
t_{i} = number of subjects sharing tied rank i
p = number of tied ranks
and
$$H' = \frac{H}{C}$$
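The pooling, ranking, rank sums, H and tie correction described above can be sketched in plain Python. This is only an illustration under stated assumptions, not BrightStat's implementation: the function names `average_ranks` and `kruskal_wallis_h` are made up for this sketch, and the tie correction is assumed to have the usual form C = 1 - sum(t^3 - t) / (N^3 - N).

```python
from collections import Counter


def average_ranks(values):
    """Return the rank of each value; tied values share the mean of their ranks."""
    counts = Counter(values)
    first_position = {}
    for position, value in enumerate(sorted(values), start=1):
        first_position.setdefault(value, position)
    # A tie group of size t occupying positions p .. p+t-1 has mean rank p + (t-1)/2.
    return [first_position[v] + (counts[v] - 1) / 2 for v in values]


def kruskal_wallis_h(groups):
    """Return (H, H') for a list of groups of measurements (hypothetical helper)."""
    pooled = [x for group in groups for x in group]
    n_total = len(pooled)
    ranks = average_ranks(pooled)

    # Rank sum T_i of each group, read off the pooled ranking.
    rank_sums, start = [], 0
    for group in groups:
        rank_sums.append(sum(ranks[start:start + len(group)]))
        start += len(group)

    h = (12 / (n_total * (n_total + 1))
         * sum(t ** 2 / len(g) for t, g in zip(rank_sums, groups))
         - 3 * (n_total + 1))

    # Assumed tie correction; untied values have t = 1 and contribute 0.
    # If every measurement is identical, c is 0 and the division fails.
    tie_sizes = Counter(pooled).values()
    c = 1 - sum(t ** 3 - t for t in tie_sizes) / (n_total ** 3 - n_total)
    return h, h / c
```

Calling `kruskal_wallis_h([[1, 2], [3, 4]])` gives the same value for H and H', because data without ties leave C at 1.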
Post-hoc analysis
If the Kruskal-Wallis test is significant, one usually wants to know which of the groups differ. BrightStat offers two different methods for post-hoc analysis:
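The critical difference of the mean ranks after Conover (1971, 1980, 1999):
$$\Delta\bar{R}_{crit} = t_{N-k} \cdot \sqrt{\frac{N(N+1)}{12} \cdot \frac{N-1-H}{N-k} \cdot \left(\frac{1}{n_{i}} + \frac{1}{n_{j}}\right)}$$
where
\Delta\bar{R}_{crit} = critical difference of the mean ranks of groups i and j
t_{N-k} = critical t-value with N-k degrees of freedom
H = Kruskal-Wallis test value (H' if ties were corrected)
n_{i} = sample size of group i
n_{j} = sample size of group j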
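The critical difference of the mean ranks after Schaich and Hamerle (1984):
$$\Delta\bar{R}_{crit} = \sqrt{\chi^{2}_{k-1} \cdot \frac{N(N+1)}{12} \cdot \left(\frac{1}{n_{i}} + \frac{1}{n_{j}}\right)}$$
where
\Delta\bar{R}_{crit} = critical difference of the mean ranks of groups i and j
\chi^{2}_{k-1} = critical chi-square value with k-1 degrees of freedom
n_{i} = sample size of group i
n_{j} = sample size of group j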
The method after Schaich and Hamerle is exact but somewhat less powerful, whereas Conover's method is approximate and more liberal.
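The two critical differences can be sketched as small Python functions. The formulas coded here are assumptions: they are the commonly cited textbook forms of the two methods, using N(N+1)/12 as the variance of untied ranks, and may differ in detail from what BrightStat computes. The function names are hypothetical, and the critical t- and chi-square values are passed in by the caller (for example from a table) rather than computed.

```python
import math


def conover_critical_difference(t_crit, h, n_total, k, n_i, n_j):
    """Critical mean-rank difference after Conover (assumed form).

    t_crit: critical t-value with n_total - k degrees of freedom
    h: Kruskal-Wallis test value
    """
    rank_variance = n_total * (n_total + 1) / 12  # variance of untied ranks
    shrink = (n_total - 1 - h) / (n_total - k)    # gets smaller as H grows
    return t_crit * math.sqrt(rank_variance * shrink * (1 / n_i + 1 / n_j))


def schaich_hamerle_critical_difference(chi2_crit, n_total, n_i, n_j):
    """Critical mean-rank difference after Schaich and Hamerle (assumed form).

    chi2_crit: critical chi-square value with k - 1 degrees of freedom
    """
    rank_variance = n_total * (n_total + 1) / 12
    return math.sqrt(chi2_crit * rank_variance * (1 / n_i + 1 / n_j))
```

In this form Conover's critical difference shrinks as H grows, while the Schaich and Hamerle value does not depend on H at all, which fits the description of the former as the more liberal method.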
Example of a Kruskal-Wallis Test
A meteorologist has measured the amount of rain in four cities for six months. She wants to know if there are different amounts of rain in the four cities. The following table shows the raw data:
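| CITY_1 | RANK | CITY_2 | RANK | CITY_3 | RANK | CITY_4 | RANK |
|---|---|---|---|---|---|---|---|
| 68 | 8 | 119 | 22 | 70 | 10.5 | 61 | 5 |
| 93 | 16 | 116 | 21 | 68 | 8 | 54 | 1.5 |
| 123 | 24 | 101 | 17 | 54 | 1.5 | 59 | 3.5 |
| 83 | 14 | 103 | 18 | 73 | 12 | 67 | 6 |
| 108 | 19 | 113 | 20 | 81 | 13 | 59 | 3.5 |
| 122 | 23 | 84 | 15 | 68 | 8 | 70 | 10.5 |
| SUM | 104 | | 113 | | 53 | | 30 |
| MEAN | 17.33 | | 18.83 | | 8.83 | | 5 |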
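H is then computed as follows:
$$H = \frac{12}{24 \cdot 25} \cdot \frac{104^{2} + 113^{2} + 53^{2} + 30^{2}}{6} - 3 \cdot 25 = 15.98$$
Because there are tied ranks (the values 54, 59 and 70 each occur twice and 68 occurs three times), H is corrected:
$$C = 1 - \frac{3 \cdot (2^{3} - 2) + (3^{3} - 3)}{24^{3} - 24} = 1 - \frac{42}{13800} = 0.997$$
The corrected H' is then
$$H' = \frac{15.98}{0.997} = 16.03$$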
The critical 5% chi-square value with 3 degrees of freedom is 7.81.
The observed test value is greater than the critical chi-square value, so H0 is rejected: the amount of rain differs between at least some of the four cities.
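The arithmetic of this example can be rechecked from the rank sums in the table. The short Python sketch below is only a verification aid: the variable names are arbitrary, the tie sizes are read off the raw data (54, 59 and 70 occur twice, 68 three times), and the tie correction is assumed to be C = 1 - sum(t^3 - t) / (N^3 - N).

```python
rank_sums = [104, 113, 53, 30]   # SUM row of the table
sizes = [6, 6, 6, 6]             # six months per city
tie_sizes = [2, 2, 3, 2]         # values 54, 59, 68, 70
chi2_crit = 7.81                 # critical 5% chi-square value, 3 degrees of freedom

n_total = sum(sizes)
H = (12 / (n_total * (n_total + 1))
     * sum(t ** 2 / n for t, n in zip(rank_sums, sizes))
     - 3 * (n_total + 1))

# Assumed tie correction factor and corrected test value H'.
C = 1 - sum(t ** 3 - t for t in tie_sizes) / (n_total ** 3 - n_total)
H_corr = H / C

reject_h0 = H_corr > chi2_crit
print(round(H, 2), round(H_corr, 2), reject_h0)  # 15.98 16.03 True
```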
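We might be interested in the critical difference of the mean ranks so that we can check which cities differ from each other. Because each group has 6 measurements, we get one critical difference for all comparisons.
After Conover, with the critical two-sided 5% t-value for 20 degrees of freedom of 2.086 and the corrected H', we get
$$\Delta\bar{R}_{crit} = 2.086 \cdot \sqrt{\frac{24 \cdot 25}{12} \cdot \frac{24 - 1 - 16.03}{24 - 4} \cdot \left(\frac{1}{6} + \frac{1}{6}\right)} \approx 5.03$$
and after Schaich and Hamerle we get
$$\Delta\bar{R}_{crit} = \sqrt{7.81 \cdot \frac{24 \cdot 25}{12} \cdot \left(\frac{1}{6} + \frac{1}{6}\right)} \approx 11.41$$
We can now compare the differences between the group mean ranks with the two critical differences: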
| | CITY_1 | CITY_2 | CITY_3 |
|---|---|---|---|
| CITY_2 | 1.5 | | |
| CITY_3 | 8.5 * | 10 * | |
| CITY_4 | 12.33 * ° | 13.83 * ° | 3.83 |
* significant difference after Conover
° significant difference after Schaich and Hamerle
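The pairwise comparison in the table can be reproduced with a few lines of Python. In this sketch the function name `significant_pairs` is hypothetical, and the two critical differences (about 5.03 after Conover and about 11.41 after Schaich and Hamerle) are values recomputed for this illustration from the commonly cited forms of the two methods, so they should be treated as assumptions rather than as BrightStat output.

```python
from itertools import combinations

# Mean ranks from the MEAN row of the data table (rank sum / 6).
mean_ranks = {
    "CITY_1": 104 / 6,
    "CITY_2": 113 / 6,
    "CITY_3": 53 / 6,
    "CITY_4": 30 / 6,
}


def significant_pairs(mean_ranks, critical_difference):
    """Return the pairs whose absolute mean-rank difference exceeds the critical difference."""
    return [
        (a, b)
        for a, b in combinations(mean_ranks, 2)
        if abs(mean_ranks[a] - mean_ranks[b]) > critical_difference
    ]


# Assumed critical differences for equal group sizes of 6 (see lead-in).
conover_pairs = significant_pairs(mean_ranks, 5.03)
schaich_hamerle_pairs = significant_pairs(mean_ranks, 11.41)

for a, b in combinations(mean_ranks, 2):
    marks = ("*" if (a, b) in conover_pairs else "") + ("°" if (a, b) in schaich_hamerle_pairs else "")
    print(a, b, round(abs(mean_ranks[a] - mean_ranks[b]), 2), marks)
```

With these assumed thresholds the flags agree with the table: four comparisons are marked after Conover, and only the two comparisons with CITY_4 that exceed 11.41 are marked after Schaich and Hamerle.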
This is a fictitious example.
References
Bortz, J. (2005). Statistik für Human- und Sozialwissenschaftler (6th edition). Heidelberg: Springer Medizin Verlag.
Conover, W.J. (1999). Practical Nonparametric Statistics (3rd edition). Wiley.
Kruskal, W.H. & Wallis, W.A. (1952). Use of ranks in one-criterion variance analysis. Journal of the American Statistical Association, 47 (260), 583-621.
Schaich, H.E. & Hamerle, A. (1984). Verteilungsfreie statistische Prüfverfahren. Berlin.