Last reviewed: February 23, 2011

Validity vs. Reliability

In discussing experimental results, two terms become very important: validity and reliability. Validity refers to "the extent to which a test measures what it claims to measure" (Cherry, 2011, Validity). Reliability refers to "the consistency of a measure. A test is considered reliable if [it yields] the same result repeatedly" (Cherry, 2011, Reliability). Both properties are critically important because a test is useless as a predictive tool if it lacks either one. However, the fact that both are important does not mean that they are of equal importance. On the contrary, even the most reliable test is not a helpful tool if it does not measure what it purports to measure. Furthermore, depending on what is being measured, the construct itself may not be static, in which case it might be unrealistic to expect a test to be reliable. It seems clear, then, that validity is more important than reliability.

The main benefit of a reliable test is that the test is consistent. The main obstacle to establishing reliability is that human beings are inconsistent. So many factors can affect how a person performs on a test that one does not expect identical results each time. Instead, results are considered consistent if they repeatedly fall within a range. There are several types of reliability. Test-retest reliability refers to "the consistency of a test across time" (Cherry, 2011, Reliability). Only constructs that are unlikely to change over time would be expected to show this type of reliability. For example, a test that measures a mental state, such as depression, would not be expected to have test-retest reliability over a long period of time, whereas a test that measures a construct that is considered more stable, such as general intelligence, would be. Another type is inter-rater reliability, which means that the test is scored similarly by two independent judges (Cherry, 2011, Reliability). Parallel-forms reliability refers to the consistency between different versions of a test that were created using the same content (Cherry, 2011, Reliability). Finally, internal consistency reliability looks at items within the same test to see whether they measure the same construct in the same way (Cherry, 2011, Reliability). All of these measures of reliability, however, are useless if a test does not measure what it purports to measure.
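To make these types of reliability concrete, the following is a minimal Python sketch that does not come from Cherry (2011). It assumes that test-retest and parallel-forms reliability are estimated with a Pearson correlation, inter-rater reliability with Cohen's kappa, and internal consistency with Cronbach's alpha. These are common choices, but they are assumptions here, and the function names and scores are hypothetical.

```python
from collections import Counter
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation between two equally long score lists.

    Assumed here as the statistic for test-retest or parallel-forms
    reliability. Raises ZeroDivisionError if either list has no variation.
    """
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def cohen_kappa(rater1, rater2):
    """Cohen's kappa: agreement between two judges, corrected for chance.

    Assumed here as the statistic for inter-rater reliability. Undefined
    (division by zero) when chance agreement is already perfect.
    """
    n = len(rater1)
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    c1, c2 = Counter(rater1), Counter(rater2)
    expected = sum(c1[cat] * c2[cat] for cat in c1) / n ** 2
    return (observed - expected) / (1 - expected)


def cronbach_alpha(items):
    """Cronbach's alpha for internal consistency.

    `items` holds one list of scores per test item, with respondents in
    the same order in every list.
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_variance = sum(variance(scores) for scores in items)
    return (k / (k - 1)) * (1 - item_variance / variance(totals))


if __name__ == "__main__":
    # Hypothetical scores for five people tested twice.
    first = [12, 15, 9, 20, 17]
    second = [13, 14, 10, 19, 18]
    print("test-retest r:", pearson_r(first, second))

    # Hypothetical pass/fail judgments from two independent judges.
    judge_a = ["pass", "pass", "fail", "pass", "fail"]
    judge_b = ["pass", "fail", "fail", "pass", "fail"]
    print("inter-rater kappa:", cohen_kappa(judge_a, judge_b))

    # Hypothetical scores on three items for the same five people.
    items = [[3, 4, 2, 5, 4], [3, 5, 2, 4, 4], [2, 4, 3, 5, 5]]
    print("Cronbach's alpha:", cronbach_alpha(items))
```

A value near 1 on any of these statistics indicates high consistency. None of them, however, says anything about whether the test measures what it is supposed to measure.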

Validity looks at whether a test measures what it claims to measure. Only the results of a valid test can be accurately applied or interpreted (Cherry, 2011, Validity). There are three different types of validity: content validity, criterion-related validity, and construct validity. Content validity means that "the items on the test represent the entire range of possible items the test should cover" (Cherry, 2011, Validity). Criterion-related validity means that the test can predict criteria or indicators of a construct, and it takes two forms. Concurrent validity means that the "test scores accurately estimate an individual's current state with regards to the criterion" (Cherry, 2011, Validity). Predictive validity means that a test is helpful in determining how a person is likely to respond in the future (Cherry, 2011, Validity). Construct validity means that a test "demonstrates an association between the test scores and the prediction of a theoretical trait" (Cherry, 2011, Validity).
