PTG - Learning Measured  

Automated Multi-Level Training Assessment Programs
D.L. Kirkpatrick Model

I. BACKGROUND

To achieve training excellence through continuous improvement, and thereby shorten the time between hire and the development of operationally acceptable, mission-critical competencies, government agencies recognize the critical need for comprehensive and timely assessment of training. Proper evaluation of training helps ensure that trainers, site training supervisors, managers, and executives are able to make informed decisions about training. These decisions affect the effectiveness of courses and their delivery, as well as the degree to which courses and curricula increase knowledge, enhance skills, and build competencies.

Additionally, the Federal Workforce Flexibility Act of 2004 requires Federal agencies to regularly evaluate and modify training programs or plans in order to promote a more strategic approach to agencies’ integration of training plans into overall mission accomplishment, and to provide specific training to develop managers as part of a comprehensive management succession program. The Act adds the requirement that each agency, on a regular basis, evaluate how each of its training plans or programs accomplishes or effectively promotes the agency’s specific performance plans and strategic goals. Recent reviews by the Government Accountability Office have shown that auditors take a specific interest in agencies’ compliance with the provisions of GAO publication GAO-04-546G, “A Guide for Assessing Strategic Training and Development Efforts in the Federal Government”, a significant portion of which deals with an agency’s ability to demonstrate how training and development efforts contribute to improved performance and results.

II. A TRAINING EVALUATION MODEL

The training evaluation model described in this White Paper is based on the work of D.L. Kirkpatrick. This model, a four-level, interdependent hierarchy of assessment, is widely used in government and private industry. Pioneering work by Jack Phillips has led to acceptance of a fifth level, ROI calculation, which is frequently mentioned as a goal as well.

Thorough analysis of training after it is conducted is the key to identifying the actions required to improve training outcomes. This analysis should cover all aspects of the training, from Training Delivery to Organizational Results or Impact, following the four evaluation levels specified in the Kirkpatrick Training Evaluation Model, plus an additional level that determines the cost-effectiveness of the training administered to an organization.

Level 1 (Reaction)
Evaluation requires that data be collected as soon as possible after the training is conducted. The best results are obtained when the trainee, as an integral part of the course of instruction, is allowed or required (depending on the union or regulatory environment) to complete a carefully designed training evaluation form, either manually or on a workstation. The latter is particularly well suited to automated course administration using computer-resident or web-resident software. The data collected are valuable to all parts of the education community and its customers. Customers and clients, course deliverers, and developers all contribute to effective training and thus need to review and monitor the evaluation data received. Managers and executives also contribute to the data analysis process and to implementing outcome improvements.

Level 2 (Achievement)
Evaluation requires the use of testing to determine the extent to which trainees changed attitudes, improved knowledge or increased skills. This evaluation requires that procedures for test creation, validation, administration and scoring be in place, to assure that tests administered are in conformity with accepted principles of test validation, are administered in a consistent manner, and produce reliable results.

Level 3 (Performance)
Evaluation determines the extent to which the training received has transferred to the workplace. It requires surveying the trainee and manager population that has undergone training, to determine the extent to which changes in behavior and job performance – competency improvements – have occurred as a result of the training.

Level 4 (Organizational Impact)
Evaluation measures mission-related outcomes indicative of training success (e.g., the percentage of customer calls completed within 2 minutes of call pickup) that result from organization members having attended a training program.

Level 5 (Return on Investment)
Evaluation, while not a Kirkpatrick-devised level of evaluation, is being used in private industry and considered for use in government. However, in the absence of objective financial metrics in the government environment, alternative non-financial approaches are also being considered.


III. AN APPROACH TO AUTOMATED TRAINING EVALUATION

The automated evaluation program described in this document consists of a centralized training evaluation process based on collecting Level 1 (Reaction) data from class evaluations; Level 2 (Achievement) test results from paper or online tests administered throughout the country, or On-the-Job Training Daily Observation Reports from Training Officers; and Level 3 (Performance) survey data from training courses. Level 2 and 3 data, together with measured differences in organizational performance indicators, can then be used to measure the value and worth of training (Level 4).

Level 1–Reaction data typically includes “learnability” data collected from trainees (e.g., learning preferences, instructional delivery, job-relatedness, training efficacy, and instructor competence) and “teachability” data collected from instructors (e.g., adequacy of materials and the instructional environment, preparedness of trainees), as well as other reactions to the training event and its delivery. Because many of the training sites are or will be remote from the headquarters training location, much of the training will be delivered via computer media, either offline or over the Internet. This, in turn, requires that reaction data be collectable by a variety of means, including paper scan forms, electronically distributed surveys, telephone data entry, and online training assessment forms.

Level 2–Achievement data is collected from validated tests administered to determine the degree to which trainees changed attitudes, improved knowledge or increased skills. Implementing this level of assessment requires procedures for test creation, validation, administration and scoring, to ensure that tests conform to accepted principles of test validation, are administered in a consistent manner, and are scored reliably and consistently, and that the results are collected for use in curriculum improvement, training program assessment and human resources decision making.

The highly dispersed and repetitive nature of many of the behaviors and procedures involved in conducting field evaluations requires that much of the training assessed be in the form of on-the-job training (OJT). This, in turn, requires the development and delivery of observation and reporting tools that are easy to use, highly customizable, easily replicable, supervisor-friendly, and easily (or electronically) administered. The use of PDAs such as the Palm™ has proven to be effective, easy to learn and trouble-free. Performance evaluation data collected on the PDA can be loaded onto evaluation team laptop computers for review and for verification that all personnel have been tested. Once all data is reviewed, it can be uploaded to a web-based application for merging and/or reporting with other test data to determine certification. In the reverse direction, the list of personnel authorized to be tested can be downloaded to each PDA. Embedding PDA identifiers in the data associates test scores with specific raters, which in turn enables subsequent inter-rater reliability testing using inter-rater kappa coefficients.
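As an illustration of the inter-rater reliability check mentioned above, the following is a minimal sketch of Cohen's kappa for two raters, written in Python. The white paper does not specify which kappa variant or tooling is used; the function name, the PDA identifiers, and the pass/fail ratings below are all hypothetical.

```python
from collections import Counter

def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who scored the same trainees in the same order."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must score the same non-empty set of trainees")
    n = len(ratings_a)
    # Observed agreement: share of trainees given the same rating by both raters.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the raters' marginal rates, summed over categories.
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    p_chance = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical pass/fail observations, keyed by the PDA identifier of each rater.
scores_by_pda = {
    "PDA-01": ["pass", "pass", "fail", "pass", "fail", "pass", "pass", "fail"],
    "PDA-02": ["pass", "fail", "fail", "pass", "fail", "pass", "pass", "pass"],
}
kappa = cohen_kappa(scores_by_pda["PDA-01"], scores_by_pda["PDA-02"])
print(f"kappa = {kappa:.3f}")
```

A kappa near 1 indicates agreement well beyond chance, while a value near 0 indicates agreement no better than chance. With more than two raters, a multi-rater statistic such as Fleiss' kappa would be the usual substitute.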

Level 3–Performance or Training Transfer data requires a focus on, and evaluation of, on-the-job behavioral and performance changes that result from the training event. Level 3 data is usually collected between 3 and 6 months after the training event and involves both students’ and managers’ evaluations of changes in job-related activities. The dispersed and (in many cases) non-automated nature of many government field sites requires that Level 3 performance be observable, assessable and collectable via paper, electronic or telephonic means, as well as via web-based or email-collected data input.

Level 4–Impact on Organizational Performance Indicators could consist of an automated wizard for collecting measurement data at selected organizations for the conduct of Level 4 analyses.

Level 5–Return on Investment in Training (Training Worth). A wizard with access to the measurement data collected in Level 4 to measure impact on Organizational Performance Indicators could then assess the worth of the measured training event under varying assumptions of training impact and cost savings. The measured differences can be converted into dollar values and compared with the cost of training to determine the worth of the training analyzed.

***

A proposed centralized, multi-level training evaluation system would initially collect Level 1 and Level 3 data through a combination of scannable paper evaluation forms, electronic data input, telephone response, and transcribed handwritten comments provided by students and instructors. The data inputs provided to the Contractor would be stored in a relational database, and reports would be generated and distributed via email, a secure Internet web site, and other means as required. Various roll-up and management reports would be created as required.
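To make the relational storage and roll-up reporting concrete, here is a small sketch using Python's built-in sqlite3 module. The schema, table and column names, course title, collection channels, and ratings are all assumptions made for illustration; the white paper does not describe the actual database design.

```python
import sqlite3

# In-memory database standing in for the central evaluation store (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE course (
        course_id INTEGER PRIMARY KEY,
        title     TEXT NOT NULL
    );
    CREATE TABLE response (
        response_id INTEGER PRIMARY KEY,
        course_id   INTEGER NOT NULL REFERENCES course(course_id),
        level       INTEGER NOT NULL CHECK (level IN (1, 3)),  -- Level 1 or Level 3 data
        channel     TEXT NOT NULL,                             -- e.g. 'scan', 'web', 'phone'
        rating      INTEGER NOT NULL CHECK (rating BETWEEN 1 AND 5)
    );
""")

conn.execute("INSERT INTO course VALUES (1, 'Field Inspection Basics')")
conn.executemany(
    "INSERT INTO response (course_id, level, channel, rating) VALUES (?, ?, ?, ?)",
    [(1, 1, "scan", 4), (1, 1, "web", 5), (1, 1, "phone", 3),
     (1, 3, "web", 4), (1, 3, "scan", 2)],
)

# Roll-up report: response count and mean rating per course and evaluation level.
rollup = conn.execute("""
    SELECT c.title, r.level, COUNT(*) AS responses, AVG(r.rating) AS mean_rating
    FROM response AS r
    JOIN course AS c ON c.course_id = r.course_id
    GROUP BY c.title, r.level
    ORDER BY c.title, r.level
""").fetchall()

for title, level, responses, mean_rating in rollup:
    print(f"{title} | Level {level} | n={responses} | mean={mean_rating:.2f}")
```

Recording the collection channel alongside each response keeps paper, web, and telephone inputs in one table, so the same query serves every roll-up regardless of how the data arrived.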

An automated Level 2 component for building, administering, scoring and recording the results of valid and reliable tests could be built and put in place. The system would initially track and assess the process of isolating job-related tasks and objectives, provide a facility for building and administering tests to pilot audiences and to test populations, and receive test results via scannable forms, email, uploads from hand-held devices, and web-based input. The proposed system could score each test automatically using a pre-entered key and provide results to the individual being tested and to the instructor. Roll-up reports would be provided to management. The system would include automated facilities for measuring test validity (including item analysis) and the reliability of test results.
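The scoring and measurement steps described above might look roughly like the following Python sketch. It scores multiple-choice answers against a pre-entered key, computes item difficulty (the proportion answering each item correctly, one common item-analysis statistic), and estimates reliability with the Kuder-Richardson 20 (KR-20) formula. The white paper does not name a reliability statistic, so KR-20 is an assumption here, as are the answer key, trainee identifiers, and responses.

```python
from statistics import pvariance

ANSWER_KEY = ["B", "D", "A", "C"]  # hypothetical pre-entered key, one entry per item

# Hypothetical responses as they might arrive from scan forms or web input.
responses = {
    "trainee_1": ["B", "D", "A", "C"],
    "trainee_2": ["B", "D", "A", "A"],
    "trainee_3": ["B", "A", "A", "C"],
    "trainee_4": ["B", "A", "C", "A"],
    "trainee_5": ["C", "A", "C", "A"],
}

def score(answers, key=ANSWER_KEY):
    """Return a 0/1 vector marking each item as incorrect or correct."""
    return [int(a == k) for a, k in zip(answers, key)]

item_matrix = {trainee: score(answers) for trainee, answers in responses.items()}
totals = {trainee: sum(row) for trainee, row in item_matrix.items()}

# Item difficulty: proportion of trainees who answered each item correctly.
n_trainees = len(item_matrix)
difficulty = [
    sum(row[i] for row in item_matrix.values()) / n_trainees
    for i in range(len(ANSWER_KEY))
]

def kr20(matrix, p_values):
    """KR-20 reliability for dichotomously scored items."""
    k = len(p_values)
    total_variance = pvariance([sum(row) for row in matrix.values()])
    if k < 2 or total_variance == 0:
        raise ValueError("KR-20 needs at least two items and some score variance")
    item_variance = sum(p * (1 - p) for p in p_values)
    return (k / (k - 1)) * (1 - item_variance / total_variance)

reliability = kr20(item_matrix, difficulty)
print("totals:", totals)
print("difficulty:", difficulty)
print(f"KR-20 = {reliability:.3f}")
```

Items with difficulty near 0 or 1 discriminate poorly between trainees and are the usual candidates for review during pilot testing.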

The Level 4 component could collect measurement data from organizations which have achieved a critical mass of individuals completing the Level 2 and Level 3 stage gates.
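One way to express the "critical mass" condition is as a simple threshold on the share of members who have cleared both stage gates. The sketch below assumes an 80% threshold and hypothetical record fields (`level2_passed`, `level3_surveyed`); the white paper does not define what proportion constitutes critical mass.

```python
def reached_critical_mass(members, threshold=0.8):
    """True when the share of members who cleared both stage gates meets the threshold.

    The 0.8 default is an illustrative assumption, not a value from the white paper.
    """
    if not members:
        return False
    cleared = sum(1 for m in members if m["level2_passed"] and m["level3_surveyed"])
    return cleared / len(members) >= threshold

# Hypothetical organization roster.
field_office = [
    {"name": "A", "level2_passed": True, "level3_surveyed": True},
    {"name": "B", "level2_passed": True, "level3_surveyed": True},
    {"name": "C", "level2_passed": True, "level3_surveyed": False},
    {"name": "D", "level2_passed": True, "level3_surveyed": True},
    {"name": "E", "level2_passed": True, "level3_surveyed": True},
]
ready_for_level4 = reached_critical_mass(field_office)
print("collect Level 4 data:", ready_for_level4)
```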

A Level 5 component could record the values of various organizational performance indicators before and after the measured training event, convert the differences to dollar values, and so determine the worth of the training by Organizational Performance Indicator.
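The conversion described above can be sketched in Python as follows. The ROI percentage uses the standard definition, net benefit divided by cost, times 100. Everything else is an assumption for illustration: the indicator names, the dollar value per unit of change, the training cost, and the `attribution` parameter, which stands in for the "varying assumptions of training impact" by scaling how much of each measured change is credited to training. The sketch also assumes every indicator improves as its value rises.

```python
def training_worth(indicators, training_cost, attribution=1.0):
    """Return per-indicator dollar benefits, net benefit, and ROI percentage.

    attribution is the assumed share (0 to 1) of each measured change credited to training.
    """
    if training_cost <= 0:
        raise ValueError("training_cost must be positive")
    benefit_by_indicator = {
        ind["name"]: (ind["after"] - ind["before"]) * ind["dollars_per_unit"] * attribution
        for ind in indicators
    }
    net_benefit = sum(benefit_by_indicator.values()) - training_cost
    return {
        "benefit_by_indicator": benefit_by_indicator,
        "net_benefit": net_benefit,
        "roi_percent": 100.0 * net_benefit / training_cost,
    }

# Hypothetical before/after measurements and dollar conversions.
indicators = [
    {"name": "calls_completed_within_2_min_pct",
     "before": 70.0, "after": 78.0, "dollars_per_unit": 5000.0},
    {"name": "inspections_completed_per_month",
     "before": 120.0, "after": 135.0, "dollars_per_unit": 1000.0},
]

# Compare the worth of training under two assumptions about training impact.
results = {
    share: training_worth(indicators, training_cost=40000.0, attribution=share)
    for share in (1.0, 0.5)
}
for share, result in results.items():
    print(f"attribution={share:.0%}: net=${result['net_benefit']:,.0f}, "
          f"ROI={result['roi_percent']:.2f}%")
```

Running the same calculation under several attribution assumptions shows how sensitive the conclusion is: in this made-up example the training pays for itself only if most of the measured improvement is credited to it.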