How we change what others think, feel, believe and do
The Test Statistic
Whenever you run a test, there will be variation in the results. This variation may be intended or unintended, referred to respectively as systematic and unsystematic variance.
Systematic variance is that due to deliberate experimental actions. For example, the 'after' score may be different from the 'before' score, the 'control' score, or from some segmentation of subjects (e.g. male/female).
Systematic variance is generally measured as the difference between groups, for example by comparing the means of a set of samples.
Systematic variance is often denoted as SSM, where 'M' stands for 'Model'. (An easier way of remembering it is that it is what was Meant to be).
Unsystematic variance is that which is unintended and is a particularly tricky problem in social research. People are not like physical objects. The same person might answer the same question differently on different days. They might understand a lesson differently depending on what other stressors there are in their lives.
It is because of this that we do social experiments and pay a lot of attention to systematic vs unsystematic variance. It would otherwise be too easy to draw conclusions based more on coincidence than reliable fact.
Unsystematic variance is generally measured as the variation within groups, across a set of subjects where you might hope to find a similar set of scores. This is typically calculated as a sum of squares (SS), a standard error, or another measure of spread.
Unsystematic variance is often denoted as SSR, where 'R' stands for 'Residual' (i.e. that which is left over when the systematic variance is removed from the total variance). An easy way of remembering this is that it is also due to Random effects.
Systematic variance is desirable, whilst unsystematic variance is not desirable and can obscure the systematic variance you are seeking. This is like radio waves, where the signal carrying the radio signal can be obscured by random noise that the signal picks up as it travels through the air and through imperfect electronic equipment.
A way of measuring the quality of a radio signal is the 'signal-to-noise' ratio. If you have a big signal and small noise, it will sound great, whilst if the noise is bigger than the signal (a signal-to-noise ratio of less than one) the radio will be very hissy.
This principle is used in test statistics to determine the quality of the experimental results, dividing the desirable systematic variance (the 'signal') by the unsystematic variance (the 'noise').
Test statistics are often calculated as the ratio between systematic and unsystematic variance:
Test statistic = systematic variance / unsystematic variance
When variance is measured as a sum of squares, SS, this is:
Test statistic = SSM / SSR
And where total variance is
SST = SSM + SSR
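The partition above can be checked numerically. This is a minimal sketch using hypothetical 'before' and 'after' scores (the data are invented for illustration): SSM measures how far the group means sit from the grand mean, SSR measures how far individual scores sit from their own group mean, and the two add up to SST.

```python
# Sum-of-squares partition SST = SSM + SSR, using two hypothetical groups.
from statistics import mean

groups = {
    "before": [4, 5, 6, 5],
    "after":  [7, 8, 9, 8],
}

all_scores = [x for g in groups.values() for x in g]
grand_mean = mean(all_scores)

# SST: total variation of every score around the grand mean.
sst = sum((x - grand_mean) ** 2 for x in all_scores)

# SSM (systematic, 'Model'): variation of group means around the grand mean.
ssm = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())

# SSR (unsystematic, 'Residual'): variation of scores around their own group mean.
ssr = sum((x - mean(g)) ** 2 for g in groups.values() for x in g)

print(sst, ssm, ssr)  # sst equals ssm + ssr
```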
For example, the t-test statistic is based on the difference between sample means:
t = (difference between sample means) / (standard error of the difference between sample means)
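As a sketch of that formula, here is an independent-samples t statistic computed by hand on hypothetical data, with the standard error of the difference taken as the square root of the summed per-group variance terms (the Welch form, one common choice):

```python
# Independent-samples t statistic: difference between means divided by
# the standard error of that difference.
from math import sqrt
from statistics import mean, variance  # variance() uses the n-1 sample formula

a = [4, 5, 6, 5]  # hypothetical group scores
b = [7, 8, 9, 8]

se_diff = sqrt(variance(a) / len(a) + variance(b) / len(b))
t = (mean(a) - mean(b)) / se_diff
print(t)
```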
The F-ratio for ANOVA is measured as a ratio of mean sums of squares:
F-ratio = MSM / MSR
...where MS = SS / degrees of freedom
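Putting the last two formulas together, a one-way ANOVA F-ratio can be computed directly from hypothetical three-group data: sum of squares divided by degrees of freedom gives each mean square, and their ratio gives F.

```python
# One-way ANOVA F-ratio: F = MSM / MSR, where MS = SS / degrees of freedom.
from statistics import mean

groups = [[3, 4, 5], [6, 7, 8], [9, 10, 11]]  # hypothetical scores
scores = [x for g in groups for x in g]
grand = mean(scores)

# Systematic (model) and unsystematic (residual) sums of squares.
ssm = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
ssr = sum((x - mean(g)) ** 2 for g in groups for x in g)

df_model = len(groups) - 1               # k - 1 groups
df_residual = len(scores) - len(groups)  # N - k

f_ratio = (ssm / df_model) / (ssr / df_residual)
print(f_ratio)
```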
Test statistics can also be calculated as a proportion of a whole, which gives a result that lies between 0 and 1 (or -1 and +1).
Correlation coefficients typically use proportions. The Pearson r-value, for example, is calculated (in magnitude; its sign shows the direction of the relationship) as:
r = SQRT( explained variation / total variation )
Proportions are often easier to interpret than signal-to-noise numbers, which can be very high or low. A result close to 1 for r, for example, means the explained variation makes up most of the variation and hence correlation is good, with unexplained variance being low.
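The proportion reading of r can be checked numerically. This sketch, on hypothetical paired scores, computes Pearson's r from the usual covariance formula, then fits a least-squares line to measure the explained variation directly: r squared comes out equal to explained variation / total variation.

```python
# Pearson's r and its link to the explained/total proportion.
from math import sqrt
from statistics import mean

x = [1, 2, 3, 4, 5]  # hypothetical paired scores
y = [2, 4, 5, 4, 5]

mx, my = mean(x), mean(y)

# Pearson r from covariance over the product of spreads.
r = (sum((a - mx) * (b - my) for a, b in zip(x, y))
     / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)))

# Least-squares fit, to measure explained variation directly.
slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
intercept = my - slope * mx
predicted = [intercept + slope * a for a in x]

explained = sum((p - my) ** 2 for p in predicted)
total = sum((b - my) ** 2 for b in y)
print(r, explained / total)  # r squared equals the explained proportion
```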
Because both numerator and denominator can vary between 0 and a large positive number, the test statistic can vary between 0 and infinity.
When the statistic is <1, then unsystematic variance (the denominator) is greater than the systematic variance, which is usually bad news, as it means you 'can't see the wood for the trees' as random effects swamp intended ones.
In contrast, a large positive statistic means that most of the variance is due to intended effects. A test statistic of 10 means that 10/11 of the variance is due to designed, systematic effects.
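That arithmetic can be checked directly: if the ratio-form statistic is s = SSM/SSR, then the systematic share of total variance is SSM/SST = s/(s + 1).

```python
# Converting a ratio-form statistic into the systematic share of total
# variance: SSM/SSR = s implies SSM/(SSM + SSR) = s/(s + 1).
statistic = 10
proportion = statistic / (statistic + 1)
print(proportion)  # 10/11, roughly 0.909
```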
A note: the statistic could instead be measured as (systematic variance / total variance), giving a proportion between 0 and 1. However, the standard tables of critical values (such as the t and F tables) are built around the ratio form, so the proportion form is generally reserved for effect-size measures such as R-squared.
The probability that the result occurred by chance is generally found by looking the test statistic up in a table that combines the statistic's value with the degrees of freedom, such as the t-test table.
A 'significant' result can usually be claimed if the probability of the result occurring by chance is 5% or less (i.e. you can be 95% confident the effect is real).
Note that significance does not indicate the size of the change (and hence whether it is particularly important or meaningful). For this, an effect size may be calculated.