A statistical hypothesis is an assertion or conjecture about a parameter (or parameters) of a population.
For example, consider the assertion that the mean body temperature of a healthy adult is 98.4 degrees Fahrenheit.
Procedure
To verify such an assertion statistically, one has to
- Conduct a study in which a simple random sample is selected and the value in question is recorded for every subject in the sample (in this case, the body temperature of healthy adults).
- Set the null hypothesis H_{0} and the alternative hypothesis H_{A}.
- Find the sample statistics (in this case, the sample mean body temperature, standard deviation, and sample size).
- Fix a significance level $\alpha$ (for example 0.10, 0.05, or 0.02, corresponding to confidence levels of 90%, 95%, or 98%).
- Determine the corresponding critical value.
- Calculate the test statistic (see table).
- Decide whether the null hypothesis is rejected or not, and state the conclusion in terms of the original claim.
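As a sketch, the steps above can be carried out in R for the body-temperature claim. The sample statistics below are invented purely to illustrate the mechanics of a z-test; they are not real data.

```r
# Hypothetical sample statistics, invented for illustration:
xbar  <- 98.25   # observed sample mean body temperature
sigma <- 0.73    # population standard deviation, assumed known
n     <- 100     # sample size
mu0   <- 98.4    # mean claimed under the null hypothesis H0

# Test statistic for a mean with sigma known:
z <- (xbar - mu0) / (sigma / sqrt(n))

# Two-sided critical value for a 0.05 significance level:
z.crit <- qnorm(0.975)

# H0 is rejected when |z| exceeds the critical value:
c(z = z, z.crit = z.crit, reject = abs(z) > z.crit)
```

With these made-up numbers, $|z|$ exceeds the critical value, so the claim of a mean of 98.4 would be rejected at the 0.05 level.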
See the attached file for an example of calculating a hypothesis test for the difference of two means.
Table for Hypothesis Tests
Hypothesis for | parameter | conditions | test statistic | critical values |
---|---|---|---|---|
Proportion | p | np>=5, nq>=5 | $z=\frac{\hat{p}-p}{\sqrt{pq/n}}$ | z-score table |
Mean (n>=30) | $\mu$ | $\sigma$ known, $n\ge 30$ | $z=\frac{\bar{x}-\mu}{\sigma/\sqrt{n}}$ | z-score table |
Mean (n<30) | $\mu$ | $\sigma$ unknown, $n< 30$ | $t=\frac{\bar{x}-\mu}{s/\sqrt{n}}$ | t-table, df=n-1 |
Standard deviation | $\sigma$ | normally dist. pop. | $\chi^2=\frac{(n-1)s^2}{\sigma^2}$ | chi-square table, df=n-1 |
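For the small-sample row of the table ($\sigma$ unknown, $n<30$), R's built-in `t.test` reproduces the $t$ statistic. The sample below is invented for illustration only.

```r
# Hypothetical small sample of body temperatures (n = 8 < 30):
x   <- c(98.1, 98.6, 97.9, 98.4, 98.7, 98.2, 98.0, 98.5)
mu0 <- 98.4

# Manual t statistic, matching the table row for the mean with n < 30:
t.manual <- (mean(x) - mu0) / (sd(x) / sqrt(length(x)))

# Built-in version; the degrees of freedom should equal n - 1 = 7:
tt <- t.test(x, mu = mu0)
c(manual = t.manual, builtin = unname(tt$statistic), df = unname(tt$parameter))
```

The manual statistic and the one reported by `t.test` agree, and the degrees of freedom are $n-1$, as in the table.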
Comparing two means
Depending on the sample sizes and the scenario, there are three different tests to compare two means:
Hypothesis for | parameter | conditions | test statistic | critical values |
---|---|---|---|---|
Difference of Means (n>=30) | $\mu_1-\mu_2$ | $\sigma_1,\sigma_2$ known, $n\ge 30$ | $z=\frac{\bar{x}_1-\bar{x}_2-(\mu_1-\mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1}+ \frac{\sigma_2^2}{n_2}}}$ | z-score table |
Difference of Means (n<30) | $\mu_1-\mu_2$ | $\sigma_1,\sigma_2$ unknown, assumed equal; $n_1, n_2 < 30$ | $t=\frac{\bar{x}_1-\bar{x}_2-(\mu_1-\mu_2)}{s_p\sqrt{\frac{1}{n_1}+ \frac{1}{n_2}}}$, where $s_p=\sqrt{\frac{(n_1-1)s_1^2+(n_2-1)s_2^2}{n_1+n_2-2}}$ | t-table, $df=n_1+n_2-2$ |
Difference of Means (paired data) | $\mu_d$ | paired samples, $n< 30$ | $t=\frac{\bar{d}-\mu_0}{s_d/\sqrt{n}}$, where $\bar{d}$ and $s_d$ are the mean and standard deviation of the differences | t-table, $df=n-1$ |
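As a sketch of the pooled small-sample case, R's `t.test` with `var.equal = TRUE` uses exactly the $s_p$ formula from the table above. The two samples below are invented for illustration.

```r
# Two hypothetical small samples:
x1 <- c(12, 15, 11, 14, 13)
x2 <- c(10, 9, 12, 11, 8)

n1 <- length(x1); n2 <- length(x2)

# Pooled standard deviation, as in the formula in the table:
sp <- sqrt(((n1 - 1) * var(x1) + (n2 - 1) * var(x2)) / (n1 + n2 - 2))

# Manual t statistic under H0: mu1 - mu2 = 0:
t.manual <- (mean(x1) - mean(x2)) / (sp * sqrt(1/n1 + 1/n2))

# Built-in pooled test; df should be n1 + n2 - 2 = 8:
tt <- t.test(x1, x2, var.equal = TRUE)
c(manual = t.manual, builtin = unname(tt$statistic), df = unname(tt$parameter))
```

Without `var.equal = TRUE`, `t.test` instead performs Welch's test, which does not pool the variances and uses different degrees of freedom.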
A worked example from the book for a difference of means is given here.
F-distribution
Some statistical tests use the F distribution, which requires two parameters (two degrees of freedom). A table is available here. For example, when comparing two population variances, an F-distribution is required. In that case, the test statistic is

(1) $F=\frac{s_1^2}{s_2^2}$

and the F-critical value accepts two parameters, which are

(2) $df_1=n_1-1$

and

(3) $df_2=n_2-1$

Below is the image of an F-distribution, obtained with R with the following code
df1 <- 3
df2 <- 15
x <- seq(0, 8, len = 401)
u <- qf(0.05, df1 = df1, df2 = df2, lower.tail = FALSE)
# This is the graph of the density:
plot(x, df(x, df1 = df1, df2 = df2), type = 'l', ann = FALSE)
# Now the code for filling the right-hand tail area
xx <- seq(u, 8, len = 401)
polygon(x = c(xx[1], xx), y = c(0, df(xx, df1, df2)), col = 'red')
# The figure looks better if we fill in the empty part of the x-axis:
lines(c(0, u), c(0, 0))
which was found here.
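To illustrate the variance comparison itself, R's `var.test` computes the F statistic $s_1^2/s_2^2$ directly. The two samples below are simulated purely for illustration.

```r
# Two hypothetical samples, simulated for illustration:
set.seed(1)
x1 <- rnorm(10, mean = 0, sd = 2)
x2 <- rnorm(16, mean = 0, sd = 1)

# Manual F statistic for H0: sigma1^2 = sigma2^2:
F.manual <- var(x1) / var(x2)

# Built-in version; df1 = n1 - 1 = 9 and df2 = n2 - 1 = 15:
vt <- var.test(x1, x2)
c(manual = F.manual, builtin = unname(vt$statistic))
```

The manual ratio of sample variances matches the statistic reported by `var.test`, and the two degrees of freedom are $n_1-1$ and $n_2-1$, as above.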
Analysis of Variance (ANOVA) for comparing multiple means
In order to compare the means of more than two samples coming from different treatment groups that are normally distributed with a common variance, an analysis of variance is often used.
The following table summarizes the calculations that need to be done, which are explained below:
Source | df | SS (sum of squares) | MS (mean squares) | F |
---|---|---|---|---|
Treatments | k-1 | SST (sum of squares of treatments) | MST = SST / (k-1) | MST/MSE |
Error | n-k | SSE (sum of squares of error) | MSE = SSE/(n-k) | |
Total | n-1 | Total SS |
Letting $x_{ij}$ be the jth measurement in the ith sample (where $j=1, 2, \dots, n_i$ and $i=1, 2, \dots, k$),
then

(4) $\text{Total SS}=\sum_{ij} x_{ij}^2 - CM$

and the sum of the squares of the treatments is

(5) $SST=\sum_{i=1}^{k}\frac{T_i^2}{n_i} - CM$

where $T_i$ is the total of the observations in treatment i, $n_i$ is the number of observations in sample i and CM is the correction for the mean

(6) $CM=\frac{\left(\sum_{ij} x_{ij}\right)^2}{n}$

The sum of squares of the error SSE is given by

(8) $SSE=\text{Total SS}-SST$

and

(9) $SSE=\sum_{i=1}^{k}(n_i-1)s_i^2$

Example: Breakfast and children's attention span
An example for the effect of breakfast on attention span (in minutes) for small children is summarized in the table below:
No Breakfast | Light Breakfast | Full Breakfast |
---|---|---|
8 | 14 | 10 |
7 | 16 | 12 |
9 | 12 | 16 |
13 | 17 | 15 |
10 | 11 | 12 |
The calculation for this example is given in Excel here.
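The quantities in the ANOVA table can also be computed by hand in R from the data above, using the CM, SST and SSE formulas; this reproduces the sums of squares and the F statistic.

```r
# Attention-span data from the table above:
no.breakfast    <- c(8, 7, 9, 13, 10)
light.breakfast <- c(14, 16, 12, 17, 11)
full.breakfast  <- c(10, 12, 16, 15, 12)

x <- c(no.breakfast, light.breakfast, full.breakfast)
n <- length(x)   # 15 observations in total
k <- 3           # number of treatments

# Correction for the mean, CM = (sum of all observations)^2 / n:
CM <- sum(x)^2 / n

# Total SS and the sum of squares of the treatments:
Total.SS <- sum(x^2) - CM
Ti  <- c(sum(no.breakfast), sum(light.breakfast), sum(full.breakfast))
SST <- sum(Ti^2 / 5) - CM

# Error sum of squares, mean squares and the F statistic:
SSE <- Total.SS - SST
MST <- SST / (k - 1)
MSE <- SSE / (n - k)
F.stat <- MST / MSE
round(c(SST = SST, SSE = SSE, F = F.stat), 4)
```

This gives SST = 58.5333, SSE = 71.2 and F = 4.9326, matching the ANOVA output discussed below.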
The hypothesis test would be
(10) $H_0: \mu_1=\mu_2=\mu_3$

versus

(11) $H_A:$ at least one of the means is different from the others.

An alternative version of the calculation is given here in R:
Table = read.table("TableAnova2.txt", header = TRUE); Table
r = c(t(as.matrix(Table)))       # response data, read row by row
f = c("No Breakfast", "Light Breakfast", "Full Breakfast")  # treatment levels
k = 3                            # number of treatment levels
n = 5                            # observations per treatment
mt = gl(k, 1, n*k, labels = f)   # matching treatment factor
av = aov(r ~ mt)                 # fit the one-way ANOVA
summary(av)
which reads the table TableAnova2.txt from the same directory and gives the following result:
Df Sum Sq Mean Sq F value Pr(>F)
mt           2 58.533 29.2667  4.9326 0.02733 *
Residuals 12 71.200 5.9333
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
which indicates that the test statistic F is equal to 4.9326, as we obtained in Excel. The corresponding right-tail probability is 0.027, so if the significance level is 0.05, the test statistic falls in the rejection region and the null hypothesis is rejected.
Hence the sample values give sufficient evidence that not all the means are equal. In terms of the example, this means that breakfast (and its size) does have an effect on children's attention span.
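As a quick check of the rejection decision, the F critical value and the right-tail probability for this example can be obtained directly in R; the degrees of freedom are $k-1=2$ and $n-k=12$.

```r
# Right-tail critical value at the 0.05 significance level:
F.crit <- qf(0.05, df1 = 2, df2 = 12, lower.tail = FALSE)

# Right-tail probability of the observed statistic F = 4.9326:
p.value <- pf(4.9326, df1 = 2, df2 = 12, lower.tail = FALSE)

c(F.crit = F.crit, p.value = p.value)
```

Since 4.9326 exceeds the critical value of about 3.885, and equivalently the p-value of about 0.027 is below 0.05, the null hypothesis is rejected, in agreement with the ANOVA output above.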