The chi-square (χ²) test measures the alignment between two sets of frequency measures. These must be categorical counts, not percentages or ratios (for those, use a different test, such as a correlation test).
Note that the frequencies in each category should be reasonably large, at least 5 (an occasional lower figure may be acceptable, as long as it is not part of a pattern of low figures).
Goodness of fit
A common use is to assess whether an observed set of measures follows an expected pattern.
The expected frequency may be determined from prior knowledge (such as a previous year's exam results) or by calculation of an average from the given data.
The null hypothesis, H0, is that the two sets of measures are not significantly different.
The chi-square test can be used in the reverse manner to goodness of fit. If the two sets of measures are compared, then just as you can show they align, you can also determine if they do not align.
The null hypothesis here is that the two sets of measures are similar.
Chi-square, χ² = SUM( (observed - expected)^2 / expected )
χ² = SUM( (fo - fe)^2 / fe )
...where fo is the observed frequency and fe is the expected frequency.
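As a minimal sketch, the formula can be written directly in Python. The function name chi_square_statistic and the counts are illustrative assumptions, not data from this page.

```python
def chi_square_statistic(observed, expected):
    """Return SUM((fo - fe)^2 / fe) over paired observed/expected counts."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))


# Hypothetical counts for three categories (both sets total 60).
observed = [18, 22, 20]
expected = [20, 20, 20]
print(chi_square_statistic(observed, expected))
```

Each category contributes its own squared difference divided by the expected count, so a category that matches expectation exactly adds nothing to the total.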
Note that the expected values may need to be scaled to be comparable to the observed values. A simple test is that the total frequency/count should be the same for observed and expected values.
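One possible way to do the scaling is sketched below. The helper scale_expected and both sets of counts are hypothetical, chosen only so the totals differ (100 versus 120).

```python
def scale_expected(reference_counts, observed_counts):
    """Rescale reference counts so their total equals the observed total."""
    factor = sum(observed_counts) / sum(reference_counts)
    return [count * factor for count in reference_counts]


# Hypothetical counts: the reference set totals 100, the observed set 120.
last_year = [10, 20, 30, 25, 15]
this_year = [15, 20, 40, 30, 15]
expected = scale_expected(last_year, this_year)

# After scaling, both totals should agree.
print(sum(expected), sum(this_year))
```

The scaling keeps the proportions of the reference set and changes only its total, which is what makes the two sets comparable.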
In a table, the expected frequency, if not known, may be estimated as:
fe = (row total) x (column total) / n
...where n is the grand total of all cells (the sum of the row totals, which equals the sum of the column totals).
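A short sketch of this estimate for a whole table follows. The function name expected_frequencies and the 2 x 2 counts are assumptions made for illustration.

```python
def expected_frequencies(table):
    """Return fe = (row total) x (column total) / n for each cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)  # grand total of all cells
    return [[r * c / n for c in col_totals] for r in row_totals]


# Hypothetical 2 x 2 table of counts.
table = [[20, 30],
         [30, 20]]
print(expected_frequencies(table))
```

Because every row and column total here is 50 out of 100, each cell would be expected to hold 25 if rows and columns were unrelated.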
The result is used with a Chi Square table to determine whether the comparison shows significance.
In a table, the degrees of freedom are:
df = (R - 1) * (C - 1)
...where R is the number of rows and C is the number of columns.
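The table lookup can be sketched as follows. The dictionary holds the standard critical values for 1 to 4 degrees of freedom at the 5% and 1% levels, rounded to two decimals; the names CRITICAL_VALUES, degrees_of_freedom and is_significant are hypothetical, and a full Chi Square table would cover many more rows.

```python
# Standard chi-square critical values for small df (two decimal places).
CRITICAL_VALUES = {
    0.05: {1: 3.84, 2: 5.99, 3: 7.81, 4: 9.49},
    0.01: {1: 6.63, 2: 9.21, 3: 11.34, 4: 13.28},
}


def degrees_of_freedom(rows, columns):
    """df = (R - 1) * (C - 1) for a table of counts."""
    return (rows - 1) * (columns - 1)


def is_significant(chi_square, df, alpha=0.05):
    """True if the statistic exceeds the tabulated critical value."""
    return chi_square > CRITICAL_VALUES[alpha][df]


df = degrees_of_freedom(3, 2)  # hypothetical 3 x 2 table
print(df, is_significant(7.5, df, alpha=0.05))
```

A statistic above the critical value means the null hypothesis can be rejected at that level; a statistic below it means it cannot.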
Goodness of fit
English test grade distributions have changed from last year, with grade B's somewhat lower. Is this significant?
The table below shows the calculation. First, the expected values are created by scaling last year's results to be equivalent to this year. Then the test statistic is calculated as SUM((O - E)^2/E).
Chi-square is found to be 12.1 and the degrees of freedom are (5-1) = 4 (there are five possible grades). Looking this up in the Chi Square table shows that 12.1 lies between the critical values for 5% (9.49) and 1% (13.28), so H0 can be rejected at the 5% level (though not at the 1% level) and a significant change can be claimed.
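As a cross-check on the table lookup, the probability can be computed directly. The sketch below uses a closed form that holds only when the degrees of freedom are even; the function name is an assumption, and a statistics library would normally be used instead.

```python
import math


def chi_square_p_value_even_df(chi_square, df):
    """Upper-tail probability for a chi-square statistic when df is even.

    Uses the closed form exp(-x/2) * SUM over k < df/2 of (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this shortcut only handles positive even df")
    half = chi_square / 2
    terms = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * terms


p = chi_square_p_value_even_df(12.1, 4)
print(round(p, 3))
```

For 12.1 with 4 degrees of freedom this gives a probability of a little under 2%, which agrees with the table placing it between 5% and 1%.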
A year group in school chooses between drama and history as below. Is there any difference between boys' and girls' choices?
Chi-square is 0.55. There are (2-1)*(2-1) = 1 degree of freedom. Checking the Chi Square table shows 0.55 is well below the 5% critical value of 3.84, so H0 cannot be rejected: the data show no significant difference between boys' and girls' choices.
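With one degree of freedom the probability has a simple closed form, sketched here with Python's standard math.erfc. The function name is hypothetical.

```python
import math


def chi_square_p_value_df1(chi_square):
    """Upper-tail probability for a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_square / 2))


p = chi_square_p_value_df1(0.55)
print(round(p, 2))
```

The result is a probability of a little under one half, far above the usual 5% threshold, so a value of 0.55 could easily arise by chance.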
Chi-square is reported in the following form:
χ²(3, N = 125) = 10.2, p = .017
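The parts of the report are, in order, the degrees of freedom, the number of cases, the test statistic and the probability. A small formatting helper might look like the sketch below; the function name and the example figures are hypothetical, and the handling of very small probabilities is an assumed convention.

```python
def format_chi_square(df, n, chi_square, p):
    """Build a report string of the form shown above."""
    if p < 0.001:
        p_text = "p < .001"
    else:
        # Drop the leading zero, since a probability cannot exceed 1.
        p_text = "p = " + f"{p:.3f}".lstrip("0")
    return f"χ²({df}, N = {n}) = {chi_square:.1f}, {p_text}"


# Hypothetical result: 1 degree of freedom, 200 cases.
print(format_chi_square(1, 200, 4.0, 0.046))
```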
This test compares observed data with what we would expect to get (if the null hypothesis of no difference were true). It is based on the principle that if the two variables are not related (for example gender is not related to deafness) then measures applied to each variable will give similar results (for example about the same proportion of men and women being found to use a hearing aid), with any variation between the results being caused purely by chance. If the experimental measures are significantly different, then some relationship can be claimed.
A reason that percentages do not work is that the statistic depends on the size of the counts, and a percentage hides how many cases it is based on. In practice, if the sample size is known, you can convert the percentages back into counts and test those.
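The effect of sample size can be seen in a small sketch. Both comparisons below have the same 60%/40% split tested against 50%/50%, but with different numbers of cases; the counts are hypothetical.

```python
def chi_square_statistic(observed, expected):
    """Return SUM((fo - fe)^2 / fe) over paired observed/expected counts."""
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))


# Same percentages, different sample sizes.
small = chi_square_statistic([12, 8], [10, 10])      # 20 cases
large = chi_square_statistic([120, 80], [100, 100])  # 200 cases
print(small, large)
```

The statistic is ten times larger for the larger sample, so the same percentages can be insignificant in one case and clearly significant in another.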
The measure is unusual in that it has a square in the numerator and a non-squared term in the denominator. Squaring removes negatives and exaggerates outliers, so large departures from the expected counts dominate the statistic. Dividing by the expected frequency scales each departure relative to the size of the count it comes from.
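To illustrate how squaring exaggerates the larger differences, the sketch below lists the contribution of each category separately. The helper name and the counts are hypothetical.

```python
def cell_contributions(observed, expected):
    """Return each category's (fo - fe)^2 / fe term."""
    return [(fo - fe) ** 2 / fe for fo, fe in zip(observed, expected)]


# Hypothetical counts: differences of +2, +4 and -6 from an expected 20.
observed = [22, 24, 14]
expected = [20, 20, 20]
contributions = cell_contributions(observed, expected)
print(contributions, sum(contributions))
```

Doubling the difference from 2 to 4 quadruples that category's contribution, and the negative difference of 6 contributes as a positive amount, the largest of the three.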
Note that the test only reports whether two sets of figures are significantly different. It says nothing about the nature or size of any difference.
A chi-gram is a bar-chart plot of a set of chi-square calculations and can visually show how chi-square varies across a set of related measurements.
Where variables are dichotomous (i.e. can have only one of two values) and the measures are paired, McNemar's test is a similar test that is customized for this circumstance.
Note that this test is called the 'Chi-square' test, not 'Chi-squared'.
The Chi-square test is non-parametric.