Kruskal-Wallis test

The Kruskal-Wallis test generalizes the Mann-Whitney U-test to more than two independent groups. It tests the null hypothesis H0 that the k populations from which the samples were drawn do not differ.

Requirements:

Data must be at least ordinally (rank-order) scaled. The test is distribution-free.

Idea:

The test works like the Mann-Whitney U-test. The data from all groups are pooled and ranked jointly; tied values receive the mean of the ranks they occupy. For each group i, the rank sum T_i and the mean rank are then computed. The rank sums of all groups add up to the total sum of ranks:

$$\sum_{i=1}^{k} T_i = \frac{N(N+1)}{2}$$

with

k = number of groups

N = total number of measurements
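The joint ranking can be sketched in a few lines of Python. This is a minimal illustration, not BrightStat's implementation; the function name `average_ranks` is hypothetical, and the data are the rainfall values from the example further below.

```python
from collections import defaultdict

def average_ranks(groups):
    """Rank all values jointly; tied values share the mean of their ranks."""
    pooled = sorted(v for g in groups for v in g)
    positions = defaultdict(list)
    for pos, value in enumerate(pooled, start=1):
        positions[value].append(pos)
    # Each distinct value maps to the mean of the positions it occupies.
    rank_of = {v: sum(p) / len(p) for v, p in positions.items()}
    return [[rank_of[v] for v in g] for g in groups]

# Rainfall data of the four cities from the example in this document.
groups = [
    [68, 93, 123, 83, 108, 122],
    [119, 116, 101, 103, 113, 84],
    [70, 68, 54, 73, 81, 68],
    [61, 54, 59, 67, 59, 70],
]
ranks = average_ranks(groups)
rank_sums = [sum(r) for r in ranks]
n_total = sum(len(g) for g in groups)

print(rank_sums)                       # [104.0, 113.0, 53.0, 30.0]
print(n_total * (n_total + 1) / 2)     # 300.0, the total sum of ranks
```

The rank sums add up to N(N+1)/2 = 300, which is a quick plausibility check on any ranking.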

 

The test value H is computed as follows:

$$H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{T_i^2}{n_i} - 3(N+1)$$

where

n_i = sample size of group i

Under H0, H is approximately chi-square distributed with k-1 degrees of freedom.
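As a rough sketch, the test value can be computed directly from the rank sums and group sizes. The function name `kruskal_wallis_h` is hypothetical, and the formula assumed here is the standard Kruskal-Wallis H without tie correction; the input values are the rank sums of the rainfall example in this document.

```python
def kruskal_wallis_h(rank_sums, sizes):
    """Uncorrected Kruskal-Wallis H from rank sums T_i and group sizes n_i."""
    n_total = sum(sizes)
    weighted = sum(t * t / n for t, n in zip(rank_sums, sizes))
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)

# Rank sums and sizes of the four cities in the example.
h = kruskal_wallis_h([104, 113, 53, 30], [6, 6, 6, 6])
print(round(h, 2))  # 15.98
```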

 

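If there are tied ranks, H is corrected as follows:

$$H' = \frac{H}{C}$$

where C is the correction factor

$$C = 1 - \frac{\sum_{i=1}^{p} (t_i^3 - t_i)}{N^3 - N}$$

and

t_i = number of observations sharing the i-th tied rank

p = number of tied ranks (groups of tied observations)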
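The tie correction can be illustrated as follows. This sketch assumes the usual correction factor C = 1 - sum(t^3 - t) / (N^3 - N) with H' = H / C; the function name `tie_correction` is hypothetical, and the data are the pooled rainfall values of the example in this document.

```python
from collections import Counter

def tie_correction(values):
    """Correction factor C for tied observations in the pooled sample."""
    n = len(values)
    # Only values that occur more than once form a group of ties.
    tie_sizes = [t for t in Counter(values).values() if t > 1]
    return 1 - sum(t**3 - t for t in tie_sizes) / (n**3 - n)

pooled = [
    68, 93, 123, 83, 108, 122,
    119, 116, 101, 103, 113, 84,
    70, 68, 54, 73, 81, 68,
    61, 54, 59, 67, 59, 70,
]
# Uncorrected H from the rank sums 104, 113, 53, 30 (N = 24, n_i = 6).
h = 12 / (24 * 25) * sum(t * t / 6 for t in (104, 113, 53, 30)) - 3 * 25
c = tie_correction(pooled)
h_corrected = h / c

print(round(c, 4))            # 0.997
print(round(h_corrected, 2))  # 16.03
```

Because C is at most 1, the corrected value H' is never smaller than H.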

 

 


Post-hoc analysis

If the Kruskal-Wallis test is significant, one usually wants to know which of the groups differ. BrightStat offers two methods for post-hoc analysis:

The critical difference of the mean ranks after Conover (1971, 1980, 1999):

$$\Delta\bar{R}_{crit(i,j)} = t_{N-k;\,\alpha/2} \sqrt{\frac{N(N+1)}{12} \cdot \frac{N-1-H}{N-k} \cdot \left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$$

where

$\Delta\bar{R}_{crit(i,j)}$ = critical difference of the mean ranks of groups i and j

$t_{N-k;\,\alpha/2}$ = critical t-value with N-k degrees of freedom

n_i = sample size of group i

n_j = sample size of group j

 

The critical difference of the mean ranks after Schaich and Hamerle (1984):

$$\Delta\bar{R}_{crit(i,j)} = \sqrt{\chi^2_{k-1;\,\alpha}} \cdot \sqrt{\frac{N(N+1)}{12} \left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$$

where

$\Delta\bar{R}_{crit(i,j)}$ = critical difference of the mean ranks of groups i and j

$\chi^2_{k-1;\,\alpha}$ = critical chi-square value with k-1 degrees of freedom

n_i = sample size of group i

n_j = sample size of group j

 

The method of Schaich and Hamerle is exact but somewhat less powerful, whereas the method of Conover is approximate and more liberal.
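Both critical differences can be sketched in Python. The formulas below are assumptions: they follow the commonly cited textbook forms of the two procedures and may differ in detail from what BrightStat computes. The function names are hypothetical, and the critical values are passed in as arguments (taken from statistical tables) so that no statistics library is needed.

```python
import math

def critical_difference_conover(t_crit, h, n_total, k, n_i, n_j):
    """Assumed form: t * sqrt(N(N+1)/12 * (N-1-H)/(N-k) * (1/n_i + 1/n_j))."""
    rank_variance = n_total * (n_total + 1) / 12.0
    shrink = (n_total - 1 - h) / (n_total - k)
    return t_crit * math.sqrt(rank_variance * shrink * (1.0 / n_i + 1.0 / n_j))

def critical_difference_schaich_hamerle(chi2_crit, n_total, n_i, n_j):
    """Assumed form: sqrt(chi2 * N(N+1)/12 * (1/n_i + 1/n_j))."""
    rank_variance = n_total * (n_total + 1) / 12.0
    return math.sqrt(chi2_crit * rank_variance * (1.0 / n_i + 1.0 / n_j))

# Illustrative values for four groups of six measurements (N = 24, k = 4):
# t_crit = 2.086 is the two-sided 5% t-value with 20 degrees of freedom,
# chi2_crit = 7.81 is the 5% chi-square value with 3 degrees of freedom,
# h = 16.03 is the tie-corrected test value of the rainfall example.
conover = critical_difference_conover(2.086, 16.03, 24, 4, 6, 6)
schaich_hamerle = critical_difference_schaich_hamerle(7.81, 24, 6, 6)

print(round(conover, 2))          # 5.03
print(round(schaich_hamerle, 2))  # 11.41
```

The smaller critical difference of the Conover method reflects its more liberal character: more pairs of groups exceed it.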

 


Example of a Kruskal-Wallis test

A meteorologist has measured the amount of rain in four cities for six months. She wants to know if there are different amounts of rain in the four cities. The following table shows the raw data:

 

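| City 1 | Rank | City 2 | Rank | City 3 | Rank | City 4 | Rank |
|---|---|---|---|---|---|---|---|
| 68 | 8 | 119 | 22 | 70 | 10.5 | 61 | 5 |
| 93 | 16 | 116 | 21 | 68 | 8 | 54 | 1.5 |
| 123 | 24 | 101 | 17 | 54 | 1.5 | 59 | 3.5 |
| 83 | 14 | 103 | 18 | 73 | 12 | 67 | 6 |
| 108 | 19 | 113 | 20 | 81 | 13 | 59 | 3.5 |
| 122 | 23 | 84 | 15 | 68 | 8 | 70 | 10.5 |
| Sum | 104 | | 113 | | 53 | | 30 |
| Mean | 17.33 | | 18.83 | | 8.83 | | 5 |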

 

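H is then computed as follows:

$$H = \frac{12}{24 \cdot 25} \left( \frac{104^2}{6} + \frac{113^2}{6} + \frac{53^2}{6} + \frac{30^2}{6} \right) - 3 \cdot 25 = 0.02 \cdot 4549 - 75 = 15.98$$

Because there are tied ranks (the values 54, 59 and 70 each occur twice, and 68 occurs three times), H is corrected:

$$C = 1 - \frac{3 \cdot (2^3 - 2) + (3^3 - 3)}{24^3 - 24} = 1 - \frac{42}{13800} \approx 0.997$$

The corrected H' is then

$$H' = \frac{15.98}{0.997} \approx 16.03$$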

 

 

The critical chi-square value at the 5% level with 3 degrees of freedom is 7.81.

The observed test value is greater than the critical chi-square value, so H0 is rejected: the amount of rain differs between at least some of the four cities.

We might be interested in the critical difference of the mean ranks so we can check which cities are different from each other. Because each group has 6 measurements we get one critical difference for all comparisons:

 

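After Conover, with the critical t-value 2.086 (5% level, two-sided, 20 degrees of freedom) and the corrected H', we get

$$\Delta\bar{R}_{crit} = 2.086 \sqrt{\frac{24 \cdot 25}{12} \cdot \frac{24 - 1 - 16.03}{24 - 4} \cdot \left(\frac{1}{6} + \frac{1}{6}\right)} \approx 5.03$$

and after Schaich and Hamerle we get

$$\Delta\bar{R}_{crit} = \sqrt{7.81} \cdot \sqrt{\frac{24 \cdot 25}{12} \left(\frac{1}{6} + \frac{1}{6}\right)} \approx 11.41$$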

 

 

 

We can now compare the differences of the group mean ranks with the two critical differences:

| | CITY_1 | CITY_2 | CITY_3 |
|---|---|---|---|
| CITY_2 | -1.5 | - | - |
| CITY_3 | 8.5 * | 10 * | - |
| CITY_4 | 12.33 * ° | 13.83 * ° | 3.83 |

* significant difference after Conover

° significant difference after Schaich and Hamerle
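The pairwise comparison can be sketched as a small helper that flags every pair of groups whose mean ranks differ by more than a given critical difference. The function name `significant_pairs` is hypothetical, and the two thresholds used here (about 5.03 for Conover and about 11.41 for Schaich and Hamerle) are assumed values derived from the commonly cited forms of the two formulas, not BrightStat output.

```python
from itertools import combinations

def significant_pairs(mean_ranks, critical_difference):
    """Return all pairs whose absolute mean-rank difference exceeds the threshold."""
    return [
        (a, b)
        for a, b in combinations(sorted(mean_ranks), 2)
        if abs(mean_ranks[a] - mean_ranks[b]) > critical_difference
    ]

# Mean ranks from the example: rank sum divided by the group size of 6.
mean_ranks = {
    "CITY_1": 104 / 6,
    "CITY_2": 113 / 6,
    "CITY_3": 53 / 6,
    "CITY_4": 30 / 6,
}

conover_pairs = significant_pairs(mean_ranks, 5.03)            # assumed threshold
schaich_hamerle_pairs = significant_pairs(mean_ranks, 11.41)   # assumed threshold

print(conover_pairs)
print(schaich_hamerle_pairs)
```

With these thresholds the helper flags the same pairs that carry the * and ° marks in the table above.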

 

BrightStat Output of this example

 



This is a fictitious example.



References

Bortz, J. (2005). Statistik für Human- und Sozialwissenschaftler (6th Edition). Heidelberg: Springer Medizin Verlag.

Conover, W.J. (1999). Practical Nonparametric Statistics (3rd edition). Wiley.

Kruskal, W.H. & Wallis, W.A. (1952). Use of ranks in one-criterion variance analysis. Journal of the American Statistical Association, 47 (260), 583-621.

Schaich, H.E. & Hamerle, A. (1984). Verteilungsfreie statistische Prüfverfahren. Berlin.




 
