changingminds.org

How we change what others think, feel, believe and do

 


Reliability

 


 

Definition

If a test is unreliable, then although the results of one administration may happen to be valid, those of another may not be. Reliability is thus a measure of how much you can trust the results of a test.

Tests often have high reliability – but at the expense of validity. In other words, you can get the same result, time after time, but it does not tell you what you really want to know.

Stability

Stability is a measure of the repeatability of a test over time: it gives the same results whenever it is used (within defined constraints, of course).

Test-retest reliability is the repeatability of a test over time: the same person, re-taking the test, gets the same results. Testing this is needed to assure the stability of a test. Stability, in this case, is measured through the variation in scores across administrations. Problems with this include:

  • Carry-over effect: people remembering answers from the last time.
  • Practice effect: repeated taking of the test improves the score (typical with classic IQ tests).
  • Attrition: people not being present for re-tests.

There is an assumption with stability that what is being measured does not change. Variation should be due to the test, not to any other factor. Sadly, this is not always true.
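As a concrete illustration, test-retest stability can be estimated as the correlation between scores from two administrations of the same test. A minimal sketch in Python, with made-up scores:

```python
# Test-retest reliability: correlate scores from two sittings of the
# same test by the same people. All scores here are invented.
def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

time1 = [12, 15, 9, 20, 17, 11]   # first sitting
time2 = [13, 14, 10, 19, 18, 11]  # same people, re-tested later
r = pearson(time1, time2)         # close to 1 means stable scores
```

A value near 1 suggests stable scores, though carry-over and practice effects can inflate it.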

Consistency

Consistency is a measure of reliability through similarity within the test, with individual questions giving predictable answers every time.

Consistency can be measured with split-half testing and the Kuder-Richardson test.

Split-half testing

Split-half testing measures consistency by:

  • Dividing the test into two halves (at the mid-point, by odd/even question numbers, at random or by some other method).
  • Administering the halves as separate tests.
  • Comparing the results from each half.

A problem with this is that the resultant tests are shorter and can hence lose reliability. Split-half is thus better with tests that are rather long in the first place.

Use the Spearman-Brown formula to correct for this shortening, estimating the correlation as if each half were a full-length test:

r = 2r_hh / (1 + r_hh)

(where r_hh is the correlation between the two halves)
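Putting split-half testing and the Spearman-Brown correction together, a sketch in Python (the 6-person, 6-item right/wrong score matrix is invented for illustration):

```python
# Split-half reliability with the Spearman-Brown correction.
def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

scores = [            # rows = people, columns = items (1 = right, 0 = wrong)
    [1, 1, 1, 1, 0, 1],
    [1, 0, 1, 1, 1, 1],
    [0, 1, 0, 1, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 1],
    [1, 1, 0, 1, 1, 0],
]

# Odd/even split: score each half per person, then correlate the halves.
odd = [sum(row[0::2]) for row in scores]
even = [sum(row[1::2]) for row in scores]
r_hh = pearson(odd, even)

# Spearman-Brown: the correlation as if each half were full length.
r_full = (2 * r_hh) / (1 + r_hh)
```

Note that the corrected value is higher than the raw half-test correlation, reflecting the greater reliability of a longer test.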
 

Kuder-Richardson reliability or coefficient alpha

The Kuder-Richardson reliability coefficient, or coefficient alpha, is relatively simple to compute, being based on a single administration of the test. It assesses the inter-item consistency of a test by looking at two error measures:

  • Adequacy of content sampling
  • Heterogeneity of domain being sampled

It assumes that reliable tests contain more variance and are thus more discriminating; higher heterogeneity of the domain leads to lower inter-item consistency. For items scored right/wrong (dichotomous items), the Kuder-Richardson formula (KR-20) applies; coefficient alpha generalizes it to non-dichotomous items:

R_kk = (k / (k − 1)) × (1 − Σσ²_i / σ²_t)

where R_kk is the alpha coefficient of the test, k is the number of items, σ²_i is the variance of item i and σ²_t is the variance of total test scores.
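The coefficient can be computed directly from one administration. A sketch in Python, again using an invented right/wrong score matrix (rows are people, columns are items):

```python
# Coefficient alpha (equivalent to KR-20 for these right/wrong items),
# computed from a single administration. The score matrix is made up.
def coefficient_alpha(items):
    k = len(items[0])                      # number of items

    def variance(xs):                      # population variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    # Sum of per-item variances (one column per item).
    item_vars = sum(variance([row[i] for row in items]) for i in range(k))
    # Variance of total test scores (one row per person).
    total_var = variance([sum(row) for row in items])
    return (k / (k - 1)) * (1 - item_vars / total_var)

scores = [            # rows = people, columns = items (1 = right, 0 = wrong)
    [1, 1, 1, 1, 0, 1],
    [1, 0, 1, 1, 1, 1],
    [0, 1, 0, 1, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 1],
    [1, 1, 0, 1, 1, 0],
]
alpha = coefficient_alpha(scores)
```

Higher values indicate greater inter-item consistency; values around 0.7 and above are conventionally considered acceptable for many uses.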
 

Equivalence of results (parallel form)

Parallel-form reliability seeks reliability through equivalence, comparing the results from two versions of the same test (much as in split-half testing). It is better than test-retest in that both versions can be taken on the same day, reducing variation.

There is a danger that tests with high internal consistency have limited coverage (and hence lower overall validity).

Bloated specifics occur where many near-identical questions lead to apparent consistency that is really just repetition. This can be bad when unintended, but the effect can be used deliberately to create variations of the same question.

Parallel versions are useful in situations such as graduate recruitment, where candidates may take the same test several times.

An adverse effect occurs where different groups score differently (indicating potential racial or other bias). This may require different versions of the same test, e.g. the MBTI for different countries.
 

Discussion

There are a number of procedural aspects that affect test reliability, including:

  • Test conditions
  • Inconsistent administrative practices
  • Variation in test marking
  • Application of an inappropriate norm group
  • Internal state of test-taker (tired, etc.)
  • Experience level of the test-taker (e.g. if they have taken the test before).

See also

Validity, Types of reliability

Kaplan, R.M. and Saccuzzo, D.P. (2001). Psychological Testing: Principles, Applications and Issues (5th edition), Belmont, CA: Wadsworth

  Changing Minds 2002-2013
