# Ch13: Non-parametric tests

Non-parametric tests do not require assumptions about the distribution of the population or its parameters. This makes them more general, and they tend to be easier to explain and apply; partly for this reason they have become popular.

# The one-sample sign test

This test concerns the median $\tilde{\mu}$ of a continuous population. The idea is that the probability of getting a value below the median or a value above the median is 1/2.
We test the null hypothesis

H0: $\tilde{\mu}\ = \ \tilde{\mu}_0$

against an appropriate alternative hypothesis

H1: $\tilde{\mu}\ \neq, >, < \ \tilde{\mu}_0$

We count the number of sample values above $\tilde{\mu}_0$, marking each with the sign "+", and mark the values falling below $\tilde{\mu}_0$ with the sign "-". Values exactly equal to $\tilde{\mu}_0$ are discarded.

For example, suppose that in a sample of students from a class the ages of the students are 23.5, 24.2, 19.2, 21, 34.5, 23.5, 27.7, 22, 38, 21.8, 25, 23. Test the claim that the median is less than 24 years of age with a significance level of $\alpha=$0.05.

The hypothesis is then written as:

H0: $\tilde{\mu}\ = \ 24$
H1: $\tilde{\mu}\ < \ 24$

The test statistic $x$ is then the number of plus signs. In this case we get:

- + - - + - + - + - + -

and therefore $x = 5$.

The variable $X$ follows a binomial distribution with n=12 (number of values) and $p$=1/2.
Therefore

(1)
\begin{align} P\{X\le 5\} = 0.0002+0.0029+0.0161+0.0537+0.1208+0.1934 = 0.3872. \end{align}

Since the p-value 0.3872 is larger than the significance level $\alpha=$0.05, the null-hypothesis cannot be rejected.
Therefore we conclude that the median age of the population is not less than 24 years of age. Actually in this particular class, the median age was 24, so we arrive at the correct conclusion.
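The calculation above can be reproduced with a short pure-Python sketch (the ages are from the example; no statistics library is needed, since the binomial tail is a finite sum):

```python
from math import comb

ages = [23.5, 24.2, 19.2, 21, 34.5, 23.5, 27.7, 22, 38, 21.8, 25, 23]
median0 = 24

# Test statistic: number of "+" signs (values above the hypothesized median).
x = sum(1 for a in ages if a > median0)

# Left-tail p-value P{X <= x} for X ~ Binomial(n, 1/2).
n = len(ages)
p_value = sum(comb(n, k) for k in range(x + 1)) / 2 ** n

print(x, round(p_value, 4))   # 5 0.3872
```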

# The sign test for large samples

When

(2)
\begin{equation} np>5 \end{equation}

and

(3)
\begin{equation} n(1-p)>5 \end{equation}

then the normal approximation of the binomial distribution is appropriate. Therefore we follow the same procedure as above but we substitute the test statistic with

(4)
\begin{align} z=\frac{x-np_0}{\sqrt{np_0(1-p_0)}} \end{align}

where

(5)
\begin{align} p_0=\frac{1}{2}. \end{align}
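As a small illustration (the counts here are hypothetical, not taken from the text), the large-sample statistic is computed as:

```python
from math import sqrt

# Hypothetical example: n = 100 observations, of which x = 40 are plus signs.
n, x, p0 = 100, 40, 0.5

# Check that the normal approximation is appropriate: np > 5 and n(1-p) > 5.
assert n * p0 > 5 and n * (1 - p0) > 5

z = (x - n * p0) / sqrt(n * p0 * (1 - p0))
print(z)   # -2.0
```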

# Rank sums: the Wilcoxon U-test

The Wilcoxon rank sum test, also called the Mann-Whitney test, is a nonparametric alternative to the small-sample Student's t-test for comparing two populations. Its critical values are read from a Wilcoxon U table.

Based on an example in the book, we look at the number of defective chairs produced by two different machines:

Machine 1: 24, 18, 23, 23, 27, 19

and

Machine 2: 22, 15, 18, 20, 19, 17

The means of the samples are 22.33 and 18.5 respectively; it remains to be seen if that difference is significant.

We want to test whether the medians are different, so we state our hypotheses with a significance level $\alpha = 0.05$:

H0: $\tilde{\mu_1}\ = \ \tilde{\mu_2}$
H1: $\tilde{\mu_1}\ \neq \ \tilde{\mu_2}$

## Test statistic U

To calculate the test statistic U, we first arrange the data jointly in one sorted list, identifying the machine to which each number corresponds, and assign tied values the average of the ranks they would otherwise occupy (for example, two values tied for ranks 5 and 6 each receive rank 5.5):

| Sorted value | Machine | Rank |
|---|---|---|
| 15 | 2 | 1 |
| 17 | 2 | 2 |
| 18 | 1 | 3.5 |
| 18 | 2 | 3.5 |
| 19 | 1 | 5.5 |
| 19 | 2 | 5.5 |
| 20 | 2 | 7 |
| 22 | 2 | 8 |
| 23 | 1 | 9.5 |
| 23 | 1 | 9.5 |
| 24 | 1 | 11 |
| 27 | 1 | 12 |

Therefore the ranks corresponding to machine 1 are 3.5, 5.5, 9.5, 9.5, 11 and 12, while the ranks corresponding to machine 2 are 1, 2, 3.5, 5.5, 7, and 8.

This gives rise to the following sum of the ranks variables

(6)
\begin{align} W_1 = \sum ranks_1 = 3.5+ 5.5+ 9.5+9.5+ 11 + 12 = 51 \end{align}

and

(7)
\begin{align} W_2 = \sum ranks_2 = 1+ 2+3.5+5.5+ 7 + 8 = 27 \end{align}

Notice that if the locations were very different, for example if machine 1 consistently produced fewer defective chairs, then the ranks for machine 1 would be the numbers 1, 2, …, $n_1$, whose sum is $\frac{n_1(n_1+1)}{2}$. Similarly, if machine 2 consistently produced fewer defective chairs, the sum of its ranks would be $\frac{n_2(n_2+1)}{2}$. Since these sums are the smallest values that $W_1$ and $W_2$ can take, subtracting them defines the variables $U_1$ and $U_2$, which are at least zero:

(8)
\begin{align} U_1 = W_1-\frac{n_1(n_1+1)}{2} = 51-\frac{6(6+1)}{2} = 30 \end{align}

and

(9)
\begin{align} U_2 = W_2-\frac{n_2(n_2+1)}{2} = 27-\frac{6(6+1)}{2} = 6 \end{align}

The test-statistic U is the minimum of the two U-variables:

(10)
\begin{align} U = \min\{U_1,U_2\}=\min\{30,6\}=6 \end{align}
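The rank sums and U-statistics above can be reproduced with a minimal pure-Python sketch (the tie handling mirrors the averaged ranks in the table):

```python
def ranks(values):
    # Average rank for ties: midpoint of the tied positions (rank 1 = smallest).
    s = sorted(values)
    return [(2 * s.index(v) + 1 + s.count(v)) / 2 for v in values]

machine1 = [24, 18, 23, 23, 27, 19]
machine2 = [22, 15, 18, 20, 19, 17]

# Rank the pooled data, then split the ranks back by machine.
r = ranks(machine1 + machine2)
W1, W2 = sum(r[:len(machine1)]), sum(r[len(machine1):])

n1, n2 = len(machine1), len(machine2)
U1 = W1 - n1 * (n1 + 1) / 2
U2 = W2 - n2 * (n2 + 1) / 2
U = min(U1, U2)
print(W1, W2, U1, U2, U)   # 51.0 27.0 30.0 6.0 6.0
```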

## Critical value $U_{\alpha}$

In this case, since the alternative hypothesis uses the unequal sign (instead of a less-than or greater-than sign), the significance probability is divided equally between the two tails. Reading the critical value from the Wilcoxon U table for $n_1 = 6$ and $n_2 = 6$, we obtain

(11)
\begin{align} U_{\alpha} = 5. \end{align}

## Decision

We reject the null hypothesis when the test statistic U is less than or equal to the critical value $U_{\alpha}$ read from the table. Here the test statistic is 6, which is larger than the critical value 5, so we cannot reject the null hypothesis $\tilde{\mu_1}=\tilde{\mu_2}$ that the medians are equal.

We conclude that there is not sufficient evidence to conclude that the medians are different.

# Kruskal-Wallis H-test for completely randomized designs

The Kruskal-Wallis H-test is a nonparametric alternative to the analysis of variance F-test for comparing population means.

Suppose you are comparing k populations based on independent random samples of sizes $n_1, n_2, \dots, n_k$ from populations 1, 2, …, k, respectively, where

(12)
\begin{align} n_1+n_2+\dots+n_k = \sum n_i = n \end{align}

First, rank all n observations from smallest (rank 1) to largest (rank n). Tied observations are assigned the average of the ranks they would have received if they were not tied.
Then you calculate the rank sums T1, T2, …, Tk corresponding to the ranks for each of the k samples, and calculate the test statistic

(13)
\begin{align} H=\frac{12}{n(n+1)} \sum \frac{T_i^2}{n_i}-3(n+1) \end{align}

which is proportional to $\sum n_i (\bar{T_i} - \bar{T})^2$, the sum of the squared deviations of the rank means about the grand mean

(14)
\begin{align} \bar{T}=\frac{n(n+1)}{2n}=\frac{n+1}{2} \end{align}

The greater the differences in locations of the k populations, the larger the value of the H-statistic. For large n, H follows approximately a $\chi^2$ (chi-square) distribution with (k-1) degrees of freedom.

## Example: breakfast and children's attention span

We illustrate this test with the example given in chapter 10 about children's attention span.

For convenience we repeat the table below summarizing the results:

| No Breakfast | Light Breakfast | Full Breakfast |
|---|---|---|
| 8 | 14 | 10 |
| 7 | 16 | 12 |
| 9 | 12 | 16 |
| 13 | 17 | 15 |
| 10 | 11 | 12 |
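The H-statistic for this table can be computed with a short pure-Python sketch, using the same averaged-rank idea as in the Wilcoxon section (note that this sketch applies no tie correction, whereas software such as `scipy.stats.kruskal` does, so library results will differ slightly):

```python
no_bf    = [8, 7, 9, 13, 10]
light_bf = [14, 16, 12, 17, 11]
full_bf  = [10, 12, 16, 15, 12]
groups = [no_bf, light_bf, full_bf]

# Rank the pooled observations, averaging ranks for ties (rank 1 = smallest).
pooled = [v for g in groups for v in g]
s = sorted(pooled)
rank = {v: (2 * s.index(v) + 1 + s.count(v)) / 2 for v in set(pooled)}

n = len(pooled)
T = [sum(rank[v] for v in g) for g in groups]   # rank sums T_i per group

# H = 12/(n(n+1)) * sum(T_i^2 / n_i) - 3(n+1)
H = 12 / (n * (n + 1)) * sum(t * t / len(g) for t, g in zip(T, groups)) - 3 * (n + 1)
print(T, round(H, 3))   # [20.5, 53.5, 46.0] 5.985
```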

and we recall the ANOVA calculations are given in the following Excel file.

An updated file that calculates the Kruskal-Wallis test is available here.

page revision: 45, last edited: 13 Dec 2010 18:59