PTG - Learning Measured  

What Is a Test, and What Is Not a Test?
A Review of the Literature

By Dr. Eric F. Grosse, Ed.D.

Important Definitions
Understanding what is—and what is not—a test begins with some important definitions from the broader world of assessment. These definitions, and their authors, are listed below, in alphabetical order.

assess: systematically collecting information (including but not limited to quantitative data) without making judgments of worth (adapted from Shrock & Coscarelli, 1996, pg. 8).

criterion-referenced tests: tests that compare people against a standard (adapted from Shrock & Coscarelli, 1996, pg. 1).

evaluate: the process of making judgments regarding the appropriateness of some person, program, process or product relative to a specific purpose (Shrock & Coscarelli, 1996, pg. 8).

measure: collecting quantitative data about a specific activity, event, process, product or other observable phenomenon (my own definition).

norm-referenced tests: tests that compare people against each other; also known as standardized tests (adapted from Shrock & Coscarelli, 1996, pg. 2).

psychological test: a set of questions, problems, or tasks designed to elicit responses for use in measuring the traits, capacities, or achievements of an individual. Examples include intelligence tests, achievement tests, and aptitude tests (Aiken, 1998, pgs. 1, 6, 10).

survey: a tool designed to gather information about an individual’s attitudes, preferences, behavioral intentions, observations, and opinions. Purposes include measuring employee relations, morale, and involvement; predicting organizational outcomes; segmenting a population based on common characteristics; and comparing employee perspectives across organizations (adapted from Kraut, 1996, pgs. 20-21, 24, 25).

test: a deliberate attempt by people to acquire information about themselves or others (Westgaard, 1999, pg. 1).

testing: the process of collecting quantitative information about the degree to which a competence or ability is present in a test taker, based on responses to questions where there are right and wrong answers (adapted from Shrock & Coscarelli, 1996, pg. 7).
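
The difference between the criterion-referenced and norm-referenced definitions above can be illustrated with a minimal sketch. The cut score, the group scores, and the function names are hypothetical and are not taken from the cited authors; the point is only that the same raw score yields two different kinds of result depending on what it is compared against.

```python
from bisect import bisect_left


def criterion_referenced(score, cut_score):
    """Compare one test taker against a fixed standard (the cut score)."""
    return "master" if score >= cut_score else "non-master"


def norm_referenced(score, group_scores):
    """Compare one test taker against other people.

    Returns the percentage of the group that scored strictly lower.
    """
    ordered = sorted(group_scores)
    below = bisect_left(ordered, score)  # count of scores lower than `score`
    return 100.0 * below / len(ordered)


# Hypothetical data, for illustration only.
group = [52, 61, 67, 70, 74, 78, 81, 85, 90, 96]
cut = 80

print(criterion_referenced(78, cut))  # judged against the standard
print(norm_referenced(78, group))     # judged against the group
```

With these made-up numbers, a score of 78 falls short of the standard even though it sits in the middle of the group, which is why the two test types cannot be treated as interchangeable.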

Testing in a Corporate or Government Setting
Testing is widespread in corporate and government settings. In training specifically, however, several conditions and limitations on testing deserve attention.

  1. A survey is not a test. Surveys merely gather information; they lack the evaluation component defined above. Kirkpatrick Level 1 instruments are properly classified as surveys.
  2. Not all tests are created equal. As the definitions of criterion-referenced and norm-referenced tests indicate, these two major types of tests differ enormously. Using one where the other is called for, or treating them as interchangeable, has serious consequences.
  3. All tests first must be validated. Validation is a process that determines, among other things, that what is asked on a test accurately reflects the content that was taught ("content validity") and makes sense to an "educated" test-taker ("face validity").
  4. Tests can be used to evaluate personal mastery only after they have been validated. Unvalidated tests are routinely, and successfully, challenged by labor unions and in court when their results have been used to deny an employee a promotion, job transfer, or other desirable outcome, or to discharge an employee deemed incompetent on the basis of test results.
  5. Tests used to evaluate personal mastery must be based on competency statements tied to specific job tasks. This is, in essence, another level of test validity known as criterion validity. Tests that merely require employees to demonstrate successful memorization of facts, figures, relationships, etc. generally don’t meet the standard of criterion validity.
  6. Tests that meet the standards of face, content, and criterion validity, and that are deemed reliable across time and across essentially equivalent populations, can be used successfully by any organization. When the proper developmental standards for a test have been met, and employees who don’t pass are given an opportunity for remediation and re-testing, most organizations are willing to use test results as one piece of the employee performance appraisal process.
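
Items 3 and 5 above both depend on tying test questions to what was taught and to job-based competency statements. The sketch below is one possible bookkeeping aid for that mapping, not a validation procedure: it only flags competencies that no question covers and questions that map to no listed competency. The competency IDs, question IDs, and function name are hypothetical, and a check like this would not replace expert review of the items themselves.

```python
def blueprint_gaps(competencies, item_map):
    """Find gaps between competency statements and test items.

    competencies: list of competency IDs the test is meant to cover.
    item_map: dict mapping each test item ID to the competency it targets.
    Returns (competencies with no item, items tied to no listed competency).
    """
    covered = set(item_map.values())
    untested = [c for c in competencies if c not in covered]
    unanchored = sorted(
        item for item, target in item_map.items() if target not in competencies
    )
    return untested, unanchored


# Hypothetical blueprint, for illustration only.
competencies = ["C1", "C2", "C3"]
item_map = {"Q1": "C1", "Q2": "C1", "Q3": "C2", "Q4": "C9"}

untested, unanchored = blueprint_gaps(competencies, item_map)
print("Competencies with no test item:", untested)
print("Items not tied to a competency:", unanchored)
```

In this made-up example, competency C3 is never tested and question Q4 points at a competency that is not on the list, so both would need attention before the test is used to judge mastery.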

Main Advantages and Disadvantages of Different Types of Assessment Instruments

Ability tests
  Advantages:
  • Mental ability tests are among the most useful predictors of performance across a wide variety of jobs
  • Are usually easy and inexpensive to administer
  Disadvantages:
  • Use of ability tests can result in high levels of adverse impact
  • Physical ability tests can be costly to develop and administer

Achievement/proficiency tests
  Advantages:
  • In general, job knowledge and work-sample tests have relatively high validity
  • Job knowledge tests are generally easy and inexpensive to administer
  • Work-sample tests usually result in less adverse impact than ability tests and written knowledge tests
  Disadvantages:
  • Written job knowledge tests can result in adverse impact
  • Work-sample tests can be expensive to develop and administer

Biodata inventories
  Advantages:
  • Easy and inexpensive to administer
  • Some validity evidence exists
  • May help to reduce adverse impact when used in conjunction with other tests and procedures
  Disadvantages:
  • Privacy concerns may be an issue with some questions
  • Faking is a concern (information should be verified when possible)

Employment interviews
  Advantages:
  • Structured interviews, based on job analyses, tend to be valid
  • May reduce adverse impact if used in conjunction with other tests
  Disadvantages:
  • Unstructured interviews typically have poor validity
  • Skill of the interviewer is critical to the quality of interview (interviewer training can help)

Personality inventories
  Advantages:
  • Usually do not result in adverse impact
  • Predictive validity evidence exists for some personality inventories in specific situations
  • May help to reduce adverse impact when used in conjunction with other tests and procedures
  • Easy and inexpensive to administer
  Disadvantages:
  • Need to distinguish between clinical and employment-oriented personality inventories in terms of their purpose and use
  • Possibility of faking or providing socially desirable answers
  • Concern about invasion of privacy (use only as part of a broader assessment battery)

Honesty/integrity measures
  Advantages:
  • Usually do not result in adverse impact
  • Have been shown to be valid in some cases
  • Easy and inexpensive to administer
  Disadvantages:
  • Strong concerns about invasion of privacy (use only as part of a broader assessment battery)
  • Possibility of faking or providing socially desirable answers
  • Test users may require special qualifications for administration and interpretation of test scores
  • Should not be used with current employees
  • Some states restrict use of honesty and integrity tests

Education and experience requirements
  Advantages:
  • Can be useful for certain technical, professional, and higher level jobs to guard against gross mismatch or incompetence
  Disadvantages:
  • In some cases, it is difficult to demonstrate job relatedness and business necessity of education and experience requirements

Recommendations and reference checks
  Advantages:
  • Can be used to verify information previously provided by applicants
  • Can serve as protection against potential negligent hiring lawsuits
  • May encourage applicants to provide more accurate information
  Disadvantages:
  • Reports are almost always positive; they do not typically help differentiate between good workers and poor workers

Assessment centers
  Advantages:
  • Good predictors of job and training performance, managerial potential, and leadership ability
  • Apply the whole-person approach to personnel assessment
  Disadvantages:
  • Can be expensive to develop and administer
  • Specialized training required for assessors; their skill is essential to the quality of assessment centers

Medical examinations
  Advantages:
  • Can help ensure a safe work environment when use is consistent with relevant federal, state, and local laws
  Disadvantages:
  • Cannot be administered prior to making a job offer
  • Restrictions apply to administering to applicants postoffer or to current employees
  • There is a risk of violating applicable regulations (a written policy, consistent with all relevant laws, should be established to govern the entire medical testing program)

Drug and alcohol tests
  Advantages:
  • Can help ensure a safe and favorable work environment when program is consistent with relevant federal, state, and local laws
  Disadvantages:
  • An alcohol test is considered a medical exam and applicable law restricting medical examination in employment must be followed
  • There is a risk of violating applicable regulations (a written policy, consistent with all relevant laws, should be established to govern the entire drug or alcohol testing program)
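
The comparison above refers repeatedly to adverse impact without saying how it might be detected. One widely used screening heuristic, which this article does not itself describe, is the four-fifths rule from the U.S. Uniform Guidelines on Employee Selection Procedures: a group whose selection rate is below 80 percent of the highest group's rate is generally treated as showing evidence of adverse impact. The sketch below applies that heuristic to hypothetical applicant counts; the group names and function name are illustrative assumptions.

```python
def impact_ratios(groups):
    """Compute each group's selection rate relative to the highest rate.

    groups: dict mapping group name to (number selected, number of applicants).
    Returns a dict mapping group name to its impact ratio (1.0 for the top group).
    """
    rates = {name: selected / applicants
             for name, (selected, applicants) in groups.items()}
    top_rate = max(rates.values())
    if top_rate == 0:
        raise ValueError("no group had any selections; ratios are undefined")
    return {name: rate / top_rate for name, rate in rates.items()}


# Hypothetical counts, for illustration only.
groups = {"group_a": (48, 80), "group_b": (12, 40)}

ratios = impact_ratios(groups)
# Four-fifths heuristic: flag groups selected at under 80% of the top rate.
flagged = [name for name, ratio in ratios.items() if ratio < 0.8]
print(ratios)
print("Flagged for review:", flagged)
```

A flag from this heuristic is a prompt for closer review rather than a legal conclusion, and small applicant counts can make the ratio unstable.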

Checklist For Evaluating a Test

  • Characteristic to be measured by test (skill, ability, personality trait)
  • Job/training characteristic to be assessed
  • Candidate population (education or experience level, other background)
  • Test name
  • Version
  • Type (paper-and-pencil, computer)
  • Alternate forms available?
  • Scoring method (hand-scored, machine-scored)
  • Technical considerations
      • Reliability: r=
      • Validity: r=
      • Reference/norm group
      • Test fairness evidence
      • Adverse impact evidence
      • Applicability (indicate any special group)
  • Administration considerations
      • Administration time
      • Materials needed (include start-up costs, operational and scoring cost)
      • Costs
      • Facilities needed
      • Staffing requirements
      • Training requirements
  • Other considerations (consider clarity, comprehensiveness, utility)
      • Quality of test manual
      • Supporting documents available from the publisher
      • Quality of publisher assistance
      • Independent reviews
  • Overall evaluation
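
The "Reliability: r=" and "Validity: r=" entries in the checklist call for correlation coefficients. As a minimal sketch, assuming these are Pearson correlations, the code below computes one r between two administrations of the same test (a test-retest reliability estimate) and one between test scores and a job-performance measure (a criterion-related validity estimate). All scores and ratings are invented for illustration, and a published test's manual may report coefficients obtained by other methods.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of the same length, at least 2 items")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list has no variance")
    return sxy / sqrt(sxx * syy)


# Hypothetical data for five employees, for illustration only.
first_attempt = [70, 82, 65, 90, 77]   # scores on the test
retest = [72, 80, 68, 91, 75]          # same test, same people, later date
supervisor_rating = [3, 4, 2, 5, 4]    # job-performance measure

reliability = pearson_r(first_attempt, retest)
validity = pearson_r(first_attempt, supervisor_rating)
print(f"Reliability: r={reliability:.2f}")
print(f"Validity: r={validity:.2f}")
```

Five people is far too small a sample for a real study; the example only shows where the two r values on the checklist could come from.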

References:

  1. Aiken, Lewis R. (1998). Tests & Examinations: Measuring Abilities and Performance. New York: John Wiley & Sons, Inc.
  2. Kraut, Allen I. (Ed.). (1996). Organizational Surveys: Tools for Assessment and Change. San Francisco: Jossey-Bass.
  3. Shrock, Sharon A. and William C.C. Coscarelli. (1996). Criterion-Referenced Test Development: Technical and Legal Guidelines for Corporate Training. Washington DC: International Society for Performance Improvement.
  4. Westgaard, Odin. (1999). Tests That Work: Designing and Delivering Fair and Practical Measurement Tools in the Workplace. San Francisco: Jossey-Bass Pfeiffer.
  5. U.S. Department of Labor, Employment & Training Administration. (1999). Testing and Assessment: An Employer’s Guide to Good Practices. Washington DC: U.S. Department of Labor.