Reliability and Validity of Measures
How we evaluate measurement quality: reliability coefficients, validity evidence, and item-level statistics.
Slide 1 of 4
Reliability
- Reliability is the consistency of measurement across items, time, or raters.
- Internal consistency (Cronbach's alpha), test–retest, and inter-rater are common forms.
- Reliability sets a ceiling on validity: an unreliable measure cannot be valid, but a reliable measure is not necessarily valid.
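As a rough illustration of the points above, the sketch below computes Cronbach's alpha from the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), and the classical test theory bound that an observed validity coefficient cannot exceed the square root of the product of the two reliabilities. The function names and the small `ratings` table are hypothetical, made up for this example only.

```python
from math import sqrt
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a table of respondents (rows) by items (columns)."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item (column) and of the summed total score.
    item_vars = [variance(col) for col in zip(*rows)]
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def validity_ceiling(r_xx, r_yy=1.0):
    """Upper bound on the observed test-criterion correlation.

    r_xx is the reliability of the test, r_yy that of the criterion
    (1.0 assumes a perfectly reliable criterion).
    """
    return sqrt(r_xx * r_yy)


# Hypothetical data: 5 respondents answering 3 items on a 1-5 scale.
ratings = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 5, 4],
    [3, 3, 4],
]

print("alpha:", round(cronbach_alpha(ratings), 3))
print("validity ceiling at reliability 0.64:", round(validity_ceiling(0.64), 2))
```

With a reliability of 0.64, for example, the observed validity coefficient could be at most 0.80 even against a perfectly reliable criterion.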
Slide 2 of 4
Validity evidence
- Content validity: items adequately sample the construct's domain.
- Criterion validity: scores relate to an external criterion (concurrent or predictive).
- Construct validity: convergent and discriminant evidence support the intended meaning.
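Convergent and discriminant evidence is often summarized with correlations: scores should correlate strongly with a measure of the same construct and weakly with a measure of an unrelated one. The sketch below assumes Pearson correlations are an acceptable summary; the three score lists are invented for illustration, and the helper `pearson_r` is written out by hand rather than taken from any particular library.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length, at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cross / sqrt(ss_x * ss_y)


# Hypothetical scores for the same five people.
new_scale = [10, 14, 8, 16, 12]       # the measure being validated
same_construct = [11, 15, 9, 15, 12]  # established measure of the same construct
unrelated = [5, 4, 4, 5, 3]           # measure of a different construct

r_convergent = pearson_r(new_scale, same_construct)
r_discriminant = pearson_r(new_scale, unrelated)

print("convergent r:", round(r_convergent, 2))
print("discriminant r:", round(r_discriminant, 2))
```

The same `pearson_r` helper would also serve for a criterion validity coefficient, with the criterion measured at the same time (concurrent) or later (predictive).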
Slide 3 of 4
Item statistics
- Item difficulty (p) is the proportion of respondents who endorse the item or answer it correctly; a higher p means an easier item.
- Item discrimination indexes how well an item separates high and low scorers.
- Item-total correlations flag items that do not cohere with the scale.
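The item statistics above can be sketched for a small set of right/wrong items. This example uses the corrected item-total correlation (each item against the sum of the other items) as the discrimination index, which is one common choice among several; the response table and function names are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cross / sqrt(ss_x * ss_y)


def item_difficulty(rows):
    """Proportion of respondents scoring 1 on each item (columns of rows)."""
    return [sum(col) / len(col) for col in zip(*rows)]


def corrected_item_total(rows):
    """Correlation of each item with the total of the remaining items."""
    results = []
    for j, col in enumerate(zip(*rows)):
        rest = [sum(row) - row[j] for row in rows]  # total score without item j
        results.append(pearson_r(list(col), rest))
    return results


# Hypothetical data: 6 respondents, 4 items scored 1 (correct) or 0 (incorrect).
responses = [
    [1, 1, 1, 0],
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]

difficulty = item_difficulty(responses)
item_total = corrected_item_total(responses)

for j, (p, r) in enumerate(zip(difficulty, item_total), start=1):
    print(f"item {j}: p = {p:.2f}, corrected item-total r = {r:.2f}")
```

In this made-up table the fourth item correlates negatively with the rest of the scale, which is the kind of pattern that would flag it for review.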
Slide 4 of 4
Norms and standardization
- Norm-referenced scores compare a person to a reference sample.
- Percentiles, z-scores, and T-scores are common standardized metrics.
- Norms are valid only for populations resembling the standardization sample.
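A minimal sketch of the standardized metrics above, assuming the usual definitions: z = (score - norm mean) / norm SD, T = 50 + 10z, and a percentile rank counted as the percentage of the norm sample scoring below the person plus half of those tied. The `norm_sample` values are invented, and a real standardization sample would be far larger.

```python
from statistics import mean, stdev


def z_score(score, norm_sample):
    """Standard score relative to the norm sample's mean and sample SD."""
    return (score - mean(norm_sample)) / stdev(norm_sample)


def t_score(score, norm_sample):
    """T-score: a z-score rescaled to mean 50 and SD 10."""
    return 50 + 10 * z_score(score, norm_sample)


def percentile_rank(score, norm_sample):
    """Percent of the norm sample below the score, counting half of any ties."""
    below = sum(1 for s in norm_sample if s < score)
    ties = sum(1 for s in norm_sample if s == score)
    return 100 * (below + 0.5 * ties) / len(norm_sample)


# Hypothetical norm sample of raw scores.
norm_sample = [40, 45, 50, 55, 60, 65, 70]
raw = 65

z = z_score(raw, norm_sample)
t = t_score(raw, norm_sample)
pr = percentile_rank(raw, norm_sample)

print(f"z = {z:.2f}, T = {t:.1f}, percentile rank = {pr:.1f}")
```

All three numbers depend entirely on `norm_sample`, so they carry meaning only when the person being scored resembles that sample.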