
# Principles in analysis


There are several principles that appear across the many different analysis calculations.

## Averaging

Given a set of measures of things which are the same in some way, the actual figures are likely to vary, due to other, differing factors. A simple way to summarize the set of measures is to average them (take the mean).

The benefit of an average is that unusually low numbers are likely to cancel out unusually high numbers, resulting in the average being fairly central and hence an approximation of what you might expect the 'true measure' to be (if all other variation could be eliminated).
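As a minimal sketch of this cancelling effect, the snippet below averages ten measures (the X values from the table in the Difference section) and sums their deviations from the mean. The variable names are illustrative only.

```python
from statistics import mean

# Ten measures of the same kind of thing (the X column used later in this article).
measures = [1, 10, 4, 6, 9, 2, 6, 19, 7, 2]

average = mean(measures)

# Deviations below the mean cancel deviations above it,
# so their sum is zero apart from floating-point rounding.
deviations = [m - average for m in measures]

print(average)                     # 6.6
print(abs(sum(deviations)) < 1e-9) # True
```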

## Comparing

Much analysis involves comparing two sets of figures to understand any difference between them. It is often important to know whether the two sets of figures differ in some way, for example whether:

• Two types of people are different in some way.
• Treating a set of people changes them in some way (before-and-after tests).

When there is a set of samples, it is common to compare the means (averages). For example, the t-test compares the means of two sets of samples to determine whether there is a significant difference between them.
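The comparison of means can be sketched as below. This is one possible form, Welch's t statistic, chosen here as an assumption because it does not require the two groups to have equal variances. The function name `welch_t` is hypothetical, and the data are the X and Y columns from the table in the Difference section, treated here as independent samples.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Return Welch's t statistic for two independent samples."""
    # variance() is the sample variance (divides by n - 1).
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error

group_a = [1, 10, 4, 6, 9, 2, 6, 19, 7, 2]
group_b = [4, 14, 10, 2, 7, 1, 13, 4, 0, 3]

t = welch_t(group_a, group_b)
print(round(t, 2))
```

A value of t this close to zero would not normally be treated as a significant difference. Turning t into a p-value needs the t distribution, which is left out of this sketch.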

Comparison is easiest when you do it in pairs, comparing A with B. Comparing can also be done for more than two groups at once, for example using ANOVA.

Comparison can be used to analyze:

• A single set of measures (e.g. comparing each measure against the mean).
• Two sets of measures.
• A single set of measures against a standard, such as the Normal distribution.

Comparing against a standard is useful because it allows additional meaning to be inferred and additional methods to be used. For example, if a set of measures is found to be comparable with a Normal distribution, then parametric analysis may be used.

## Difference

A problem with comparing numbers is that they can vary widely, making meaning difficult to infer. A basic method of comparison is to subtract one number from another, thus highlighting a gap. This can be done in several ways.

First, an anchor point can be identified and subtracted from all other measures. For example, with a single sample, the mean may be subtracted from each measure.

A difference may also be used in a single formula, for example to highlight a gap between the averages of two sets of measures (and maybe hence show that the two sets of figures are different).

Where there are two matched sets of figures, then each number in one set can be subtracted from its corresponding number in the other set.

The table below shows the differences between each measure in a set and the mean of that set, and the differences between matched pairs of numbers.

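| | X | X - x-bar | Y | X - Y |
|---|---|---|---|---|
| | 1 | -5.6 | 4 | -3 |
| | 10 | 3.4 | 14 | -4 |
| | 4 | -2.6 | 10 | -6 |
| | 6 | -0.6 | 2 | 4 |
| | 9 | 2.4 | 7 | 2 |
| | 2 | -4.6 | 1 | 1 |
| | 6 | -0.6 | 13 | -7 |
| | 19 | 12.4 | 4 | 15 |
| | 7 | 0.4 | 0 | 7 |
| | 2 | -4.6 | 3 | -1 |
| Mean (x-bar) | 6.6 | | | |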
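The two difference columns in the table can be reproduced with a short sketch like the one below. It follows the table's own convention of X minus x-bar and X minus Y. The variable names are illustrative, and the rounding to one decimal place is only there to match the table.

```python
from statistics import mean

x = [1, 10, 4, 6, 9, 2, 6, 19, 7, 2]
y = [4, 14, 10, 2, 7, 1, 13, 4, 0, 3]

x_bar = mean(x)

# Anchor-point differences: subtract the mean from each measure.
from_mean = [round(xi - x_bar, 1) for xi in x]

# Matched-pair differences: subtract each Y from its corresponding X.
paired = [xi - yi for xi, yi in zip(x, y)]

print(from_mean)
print(paired)
```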

## Squaring

The problem with differences is that you get negative and positive numbers which, when summed, will cancel one another out and may tend towards zero or otherwise be unrepresentative of the set of numbers. Summing is a part of averaging, and hence you cannot get a useful mean of a simple set of differences.

Squaring the numbers before summing them is a useful way of making all the numbers positive. It also has the effect of exaggerating the larger numbers, which is no bad thing when bigger differences are more significant.

The problem with squared numbers is that they are not the same as the original numbers, and can give strange units (what is years squared?). Numbers which are squared often subsequently have a square root applied to bring them back to the original unit and facilitate sense-making.

The table below shows the effect of squaring and taking square roots of means and sums.

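| | X | X - x-bar | (X - x-bar)^2 | Y | X - Y | (X - Y)^2 |
|---|---|---|---|---|---|---|
| | 1 | -5.6 | 31.4 | 4 | -3 | 9.0 |
| | 10 | 3.4 | 11.6 | 14 | -4 | 16.0 |
| | 4 | -2.6 | 6.8 | 10 | -6 | 36.0 |
| | 6 | -0.6 | 0.4 | 2 | 4 | 16.0 |
| | 9 | 2.4 | 5.8 | 7 | 2 | 4.0 |
| | 2 | -4.6 | 21.2 | 1 | 1 | 1.0 |
| | 6 | -0.6 | 0.4 | 13 | -7 | 49.0 |
| | 19 | 12.4 | 153.8 | 4 | 15 | 225.0 |
| | 7 | 0.4 | 0.2 | 0 | 7 | 49.0 |
| | 2 | -4.6 | 21.2 | 3 | -1 | 1.0 |
| Mean (x-bar) | 6.6 | 0.0 | 25.24 | 5.8 | 0.8 | 40.6 |
| Square root (of mean) | | | 5.0 | | | 6.4 |
| Sum | | 0.0 | 252.4 | | 8.0 | 406.0 |
| Square root (of sum) | | | 15.9 | | | 20.1 |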
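The squared column for X and its summary rows can be checked with the sketch below. The names are illustrative. The final comparison relies on the fact that the square root of the mean squared difference from the mean is the population standard deviation, which Python's standard library provides as `statistics.pstdev`.

```python
from math import sqrt
from statistics import mean, pstdev

x = [1, 10, 4, 6, 9, 2, 6, 19, 7, 2]
x_bar = mean(x)

# Squaring makes every difference positive and exaggerates the larger ones.
squared = [(xi - x_bar) ** 2 for xi in x]

sum_sq = sum(squared)
mean_sq = mean(squared)

# Taking the square root returns the result to the original units.
root_mean_sq = sqrt(mean_sq)
root_sum_sq = sqrt(sum_sq)

print(round(sum_sq, 1), round(mean_sq, 2))
print(round(root_mean_sq, 1), round(root_sum_sq, 1))
print(abs(root_mean_sq - pstdev(x)) < 1e-9)  # True
```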

## Proportion

There are two basic ways of comparing: subtracting (covered above) and dividing. We can thus say that my IQ is 75% of the population average (and hence is a good forecast). Thus you can show the proportion of a population or other 'total' with the simple division:

Proportion = actual / total

Another way of showing proportion is by dividing the difference between the first and second measure by the second measure:

p = (x - y) / y

For example, if the average IQ is 120 and my IQ is 90, then:

p = (90 - 120) / 120 = -30 / 120 = -0.25

In other words, my IQ is 25% below average.
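Both forms of proportion can be written as small functions, as in the sketch below. The function names `proportion` and `relative_difference` are hypothetical labels for the two formulas above, applied to the IQ example.

```python
def proportion(actual, total):
    """Share of a total: actual / total."""
    return actual / total

def relative_difference(x, y):
    """Gap between x and y, expressed as a fraction of y: (x - y) / y."""
    return (x - y) / y

print(proportion(90, 120))           # 0.75
print(relative_difference(90, 120))  # -0.25
```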

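A similar combination of subtraction and division is used in the z-score, which expresses a measure as a number of standard deviations from the mean:

z = (x - x-bar) / s

where x-bar is the mean and s is the standard deviation.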
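As a minimal sketch of a z-score calculation, the snippet below uses the usual definition: subtract the mean from each measure, then divide by the standard deviation. Using the sample standard deviation (`statistics.stdev`) is an assumption made here; the population form (`statistics.pstdev`) is also common. The data are the X values from the tables above.

```python
from statistics import mean, stdev

x = [1, 10, 4, 6, 9, 2, 6, 19, 7, 2]

x_bar = mean(x)
s = stdev(x)  # assumption: sample standard deviation (divides by n - 1)

# Each z-score is the distance from the mean, measured in standard deviations.
z_scores = [(xi - x_bar) / s for xi in x]

for xi, z in zip(x, z_scores):
    print(xi, round(z, 2))
```

Once converted, the z-scores have a mean of zero and a standard deviation of one, which makes measures on different scales easier to compare.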

Proportions are often expressed as percentages. Most people understand percentages intuitively, and 90% or 25% has clear meaning.

A percentage can also be expressed 'in reverse', with 90% also being expressed as 10%. This says the same thing, but can be used to give a different emphasis. This form is calculated as:

Percentage = (X - Y) / X

Or, percentage = 1 - Y/X

Notice how percentages neatly use both subtraction and division in a combined form of comparison.

## Ratio

A further way of comparing by division is to divide two different items (such as 'fruit per tree') to give a new unit. This can then be used to compare different things ('apples and oranges').

This eliminates the problem of negatives that subtracting creates, although it can bring other problems, as the range of results can be huge (dividing by a number close to zero gives a very large result, and dividing by zero itself is undefined). Ratios thus need to be treated with care, which is one reason why proportions are used, as these only range from 0 to 1 (or -1 to 1 if one number can be negative).
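One way to treat ratios with care is to handle a zero denominator explicitly, as in the sketch below. The `ratio` helper and the fruit and tree counts are hypothetical, made up purely to illustrate the 'fruit per tree' idea. Returning `None` for an undefined ratio is just one possible design choice.

```python
def ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    if denominator == 0:
        return None  # undefined: left for the caller to handle
    return numerator / denominator

# Hypothetical counts for three orchards; the last has no trees at all.
fruit = [120, 45, 0]
trees = [10, 3, 0]

per_tree = [ratio(f, t) for f, t in zip(fruit, trees)]
print(per_tree)  # [12.0, 15.0, None]
```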

The general Test Statistic typically uses a ratio based on the principle of comparing that which is expected and desirable with that which is unexpected and undesirable.

Test statistic = systematic variance / unsystematic variance
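One common instance of this ratio is the F statistic from one-way ANOVA, sketched below as an assumption about what 'systematic' and 'unsystematic' variance might mean in practice. Variation between the group means is treated as systematic, and variation within the groups as unsystematic. The function name `f_ratio` is hypothetical, and the two groups are the X and Y values from the tables above.

```python
from statistics import mean

def f_ratio(groups):
    """One-way ANOVA F: between-group mean square over within-group mean square."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]
    k = len(groups)       # number of groups
    n = len(all_values)   # total number of measures

    # Systematic variation: how far each group mean sits from the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Unsystematic variation: scatter of values around their own group mean.
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)

    return (ss_between / (k - 1)) / (ss_within / (n - k))

x = [1, 10, 4, 6, 9, 2, 6, 19, 7, 2]
y = [4, 14, 10, 2, 7, 1, 13, 4, 0, 3]

f = f_ratio([x, y])
print(round(f, 3))
```

A ratio well below 1, as here, means the variation between the groups is small compared with the variation inside them.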
