Level 2 – Reliability Analysis

Once the validity of a CRT has been established, including
item analyses to improve individual test items, reliability
analysis examines the test instrument as a whole to determine whether
it is a consistent measure of the cognitive or performance
domain the test is designed to measure.
Test reliability is fundamentally related to the "true
score": the actual value of whatever attribute (knowledge,
skill, or ability) the test is designed to measure. A reliable test
will consistently give similar scores to test-takers who have
the same true score when the same test is given to them at different
times. However, if there is no true score (i.e., the test is not
valid), then it is not measuring the attribute it is designed to
measure and, therefore, the test will not be reliable. In simple
terms, if the test is not measuring what it is designed to measure,
then the test cannot be expected to give consistent, reliable scores
over time. Essentially, test-takers' scores on the test would be
random, and therefore, unreliable.
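
One common way to check consistency over time is a test-retest correlation: the same test-takers take the same test twice, and the two sets of scores are correlated. The following is a minimal sketch of that idea, not a description of how the PTG system computes reliability; the function name `pearson_r` and the score lists are hypothetical and made up for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary in both administrations")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for five test-takers on two administrations
first_administration = [12, 15, 9, 18, 14]
second_administration = [13, 14, 10, 17, 15]

r = pearson_r(first_administration, second_administration)
print(f"test-retest correlation: {r:.3f}")
```

A value near 1 suggests that test-takers keep roughly the same relative standing across administrations; a value near 0 is what the random scores described above would tend to produce.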

Reasons for Evaluating Test Reliability

The first, and most basic, reason for evaluating test reliability
is that a valid test must be reliable. Because test
reliability is required for test validity, one of the first steps
in establishing test validity is demonstrating test reliability.
Therefore, as a practical matter, test reliability must be established
to provide some confidence that the test is valid in indicating
who has, or has not, mastered the knowledge or skills necessary
for successful performance on the job.
The second reason for formally evaluating test
reliability is that an unreliable test is indefensible under
a claim of adverse impact. If a test is not reliable, it cannot
be valid. So, an unreliable test should not be used for making employment
decisions. Moreover, if a claim of adverse impact is made and the
party administering the test cannot show that the test in question
is reliable, the test cannot be used. If an employee claims to have
been unfairly harmed by a test and the test's reliability cannot be
supported, then the claimant is likely to prevail. A
documented demonstration of reliability will support the use of
the test in employment decisions.
The PTG Level 2 system allows pilot test results
to be analyzed using worksheet-based coefficients for determining
the reliability of test results. Coefficients vary depending on
the use of the test. Selecting the test purpose/use automatically
applies the appropriate reliability analysis to the test results.
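
The source does not say which coefficients the PTG worksheets compute. As an illustration only, when a CRT is used for mastery decisions, reliability is often summarized by how consistently test-takers are classified as masters or non-masters on two administrations: the agreement proportion p0 and Cohen's kappa, which corrects p0 for chance agreement. In the sketch below, the function name, the cut score, and the scores are all assumptions made up for the example.

```python
def decision_consistency(scores_1, scores_2, cut_score):
    """Return (p0, kappa) for master/non-master decisions on two administrations.

    p0    = proportion of test-takers classified the same way both times
    kappa = (p0 - pc) / (1 - pc), where pc is the agreement expected by chance
    """
    if len(scores_1) != len(scores_2) or not scores_1:
        raise ValueError("need two equal-length, non-empty score lists")
    n = len(scores_1)
    # A test-taker is a "master" when the score meets or exceeds the cut score
    masters_1 = [s >= cut_score for s in scores_1]
    masters_2 = [s >= cut_score for s in scores_2]

    p0 = sum(a == b for a, b in zip(masters_1, masters_2)) / n

    # Chance agreement from the marginal proportions of masters
    p1 = sum(masters_1) / n
    p2 = sum(masters_2) / n
    pc = p1 * p2 + (1 - p1) * (1 - p2)

    # Kappa is undefined when chance agreement is already perfect
    kappa = (p0 - pc) / (1 - pc) if pc < 1 else float("nan")
    return p0, kappa


# Hypothetical pilot-test scores for eight test-takers, cut score of 14
scores_1 = [12, 15, 9, 18, 14, 16, 11, 13]
scores_2 = [13, 14, 10, 17, 15, 13, 12, 15]

p0, kappa = decision_consistency(scores_1, scores_2, cut_score=14)
print(f"p0 = {p0:.2f}, kappa = {kappa:.2f}")
```

Which coefficient is appropriate depends on the use of the test, as the paragraph above notes; this sketch covers only the mastery-classification case.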