Sampling has its own terminology. Here is a brief description of the main terms.
A population is the total group of people whom you are researching and about which you want to draw conclusions.
It is common for variables in the population to be denoted by Greek letters and for those in the sample to be denoted by Latin letters. For example, the standard deviation of the population is often shown as σ (sigma), whilst that of a sample is shown as 's'. Sometimes, as an alternative, capital letters are used for the population.
The list of people from whom you draw your sample, such as a phone book or 'people shopping in town today', may well be less than the entire population and is called a sample frame. The sample frame must be representative of the population; otherwise bias will be introduced.
Sample frames are usually much larger than the sample. They are used because of convenience and the difficulty of accessing people outside this frame (for example those without a telephone).
When the population is large or generally inaccessible (such as the population of Birmingham) then the approach used is to measure a subset or sample.
A unit is the thing being studied. Usually in social research this is people. There may also be additional selection criteria used to choose the units to study, such as 'people who have been police officers for at least five years.'
In order to be representative of the population, the sample must be large enough. There are calculations to help you determine this. The required sample size depends on the homogeneity of the population, as well as its total size.
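As a rough illustration of such a calculation, the sketch below uses one widely used approach, Cochran's formula with a finite population correction. This is an assumption chosen for illustration, not necessarily the calculation intended here. The function name and its defaults (95% confidence via z = 1.96, a 5% margin of error, and a proportion of 0.5 as the least homogeneous case) are likewise illustrative.

```python
import math


def required_sample_size(population_size, margin_of_error=0.05,
                         z=1.96, proportion=0.5):
    """Estimate a sample size using Cochran's formula (illustrative sketch).

    proportion stands in for homogeneity: 0.5 is the most mixed
    population, and values nearer 0 or 1 are more homogeneous.
    """
    # Sample size for an effectively infinite population.
    n0 = (z ** 2) * proportion * (1 - proportion) / margin_of_error ** 2
    # Finite population correction shrinks this for small populations.
    n = n0 / (1 + (n0 - 1) / population_size)
    return math.ceil(n)


if __name__ == "__main__":
    print(required_sample_size(1000))
    print(required_sample_size(1000, proportion=0.1))
```

Under these assumptions, a more homogeneous population (a proportion further from 0.5) or a smaller population both reduce the required sample size.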
After sampling, you generalize in order to draw conclusions about the rest of the population.
A valid sample is both big enough and selected without bias, so that it is representative of the population.
Bias, a distortion of results, is the bugbear of all research and it can be introduced by taking a sample that does not truly represent the population and hence is not valid.
Having drawn the sample, you may assign its members to different groups.
A common grouping is an experimental group, which receives the treatment under study, and a control group, which gives a standard against which experimental results can be compared. To sustain internal validity, assignment is usually random. Non-random assignment is sometimes acceptable, for example where two school classes are selected as coherent groups and one is chosen as the control.
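Random assignment can be sketched as shuffling the sample and splitting it in two. The function name, the group labels and the optional seed below are illustrative assumptions, and an even split is only one possible design.

```python
import random


def assign_groups(units, seed=None):
    """Randomly split sampled units into experimental and control groups."""
    rng = random.Random(seed)   # a seed makes the assignment repeatable
    shuffled = list(units)      # copy so the caller's data is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"experimental": shuffled[:half], "control": shuffled[half:]}


if __name__ == "__main__":
    groups = assign_groups(range(10), seed=1)
    print(groups["experimental"])
    print(groups["control"])
```

Because every unit has the same chance of landing in either group, differences between the groups before treatment should be due to chance alone.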
When a sample of n people is selected from a population of N, the sampling fraction is calculated as n/N. This may be expressed as a number (e.g. 0.10) or a percentage (e.g. 10%).
If the sample is described as a histogram (a bar chart showing numbers in different measurement ranges) it will have a particular shape. Multiple samples should have similar shapes, although random variation means each may be slightly different. The larger the sample size, the more similar sample distributions will be.
Sampling error is the standard error of the sample distribution and measures the variation across different samples. It is based on the standard deviation of the sample and the gap between this and the standard deviation of the population. Larger sample sizes lead to a smaller sampling error.
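An estimate calculated from a single sample is:
sm = sx / sqrt(n)
where sm is the estimated standard error of the mean, sx is the standard deviation of the sample and n is the number of people in the sample.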
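The single-sample estimate can be computed directly, as in this minimal sketch. It assumes the divisor is the number of observations in the sample and uses the sample standard deviation from Python's standard library; the function name and the data are made up for illustration.

```python
import math
import statistics


def standard_error(sample):
    """Estimate the standard error of the mean from one sample."""
    # statistics.stdev is the sample standard deviation (n - 1 divisor).
    return statistics.stdev(sample) / math.sqrt(len(sample))


if __name__ == "__main__":
    measurements = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative data only
    print(standard_error(measurements))
```

Since the sample size sits under a square root, quadrupling the sample roughly halves the estimated error.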
A systematic error is one caused by human error during the design or implementation of the experiment.
Strata (singular: stratum) are sub-groups within a population or sample frame. These can be random groups, but often are natural groupings, such as men and women or age-range groups. Stratification helps reduce error. See stratified random sampling for usage.
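A simple form of stratified random sampling might look like the sketch below, which draws the same fraction from each stratum. The function and parameter names are illustrative assumptions, as is the choice to take at least one unit from every stratum.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, fraction, seed=None):
    """Draw the same fraction at random from every stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)  # group units by stratum
    sample = []
    for members in strata.values():
        # Assumption: always take at least one unit from each stratum.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


if __name__ == "__main__":
    people = [(i, "men") for i in range(60)] + [(i, "women") for i in range(40)]
    chosen = stratified_sample(people, lambda person: person[1], 0.1, seed=0)
    print(len(chosen))
```

Sampling within each stratum keeps the sample's mix of sub-groups in line with the sample frame, rather than leaving it to chance.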
Oversampling occurs when you study the same person twice. For example if you selected people by their telephone number and someone had two phone numbers, then you could end up calling them twice. This can cause bias.
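One way to guard against this in the telephone example is to remove duplicate people from the sample frame before drawing the sample. The sketch below assumes each entry carries some identifier for the person as well as the phone number; the names and the data are hypothetical.

```python
def unique_people(contacts):
    """Keep only the first entry for each person in a contact list.

    contacts is an iterable of (person_id, phone_number) pairs.
    """
    seen = set()
    unique = []
    for person_id, phone_number in contacts:
        if person_id not in seen:  # skip anyone already listed
            seen.add(person_id)
            unique.append((person_id, phone_number))
    return unique


if __name__ == "__main__":
    frame = [("ann", "555-0101"), ("bob", "555-0102"), ("ann", "555-0199")]
    print(unique_people(frame))
```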