Automated Multi-Level Training Assessment Programs
D.L. Kirkpatrick Model
I. BACKGROUND
In order to achieve training excellence through
continuous improvement, thus shortening the time between hire and
the development of operationally acceptable, mission critical competencies,
government agencies recognize the critical need for comprehensive
and timely assessment of training. Proper evaluation of training
helps ensure that trainers, site training supervisors, managers,
and executives are able to make informed decisions about training.
These decisions impact the effectiveness of courses and their delivery,
as well as the degree to which courses / curricula increase knowledge,
enhance skills, and build competencies.
Additionally, the Federal Workforce Flexibility
Act of 2004 requires Federal agencies to regularly evaluate and
modify training programs or plans in order to promote a more strategic
approach to agencies’ integration of training plans into overall
mission accomplishment, and to provide specific training to develop
managers as part of a comprehensive management succession program.
The Act adds the requirement that each agency, on a regular basis,
evaluate each of its training plans or programs as to how
that plan or program accomplishes or effectively promotes the agency’s
specific performance plans and strategic goals. Recent reviews by
the Government Accountability Office have shown that auditors take a
specific interest in agencies’ compliance with the provisions
of GAO publication GAO-04-546-G, “A Guide to Assessing Strategic
Training and Development Efforts in the Federal Government,”
a significant portion of which deals with an agency’s ability
to demonstrate how training and development efforts contribute to
improved performance and results.
II. A TRAINING EVALUATION MODEL
The training evaluation model described in this
White Paper is based on the work of D.L. Kirkpatrick. This model—a
four-level, interdependent hierarchy of levels of assessment—is
widely used in government and private industry. Pioneering work
by Jack Phillips has led to acceptance of a fifth level, ROI
calculation, which is frequently mentioned as a goal as well.
Thorough analysis of training after it is conducted
is the key to identifying actions required to improve training outcomes.
This analysis should focus on all aspects of the training, from
Training Delivery, to Organizational Results or Impact, following
the 4 evaluation levels specified in the Kirkpatrick Training Evaluation
Model, plus an additional level centered around determining the
cost-effectiveness of training administered to an organization.
Level 1 (Reaction)
Evaluation requires that data be collected as soon as possible after
the training is conducted. The best results are obtained when the
trainee – as an integral part of the course of instruction
– is allowed or required (depending on the union or regulatory
environment) to complete a carefully designed training evaluation
form, either manually or on a workstation. The latter is particularly
well suited to automated course administration using computer-resident
or web-resident software. Data collected are valuable to all parts
of the education community and its customers. Customers/clients,
course deliverers, and developers all contribute to effective training
and thus have a need to review and monitor evaluation data received.
Managers and executives also contribute to the data analysis process
and to implementing outcome improvements.
Level 2 (Achievement)
Evaluation requires the use of testing to determine the extent to
which trainees changed attitudes, improved knowledge or increased
skills. This evaluation requires that procedures for test creation,
validation, administration and scoring be in place, to assure that
tests administered are in conformity with accepted principles of
test validation, are administered in a consistent manner, and produce
reliable results.
Level 3 (Performance)
Evaluation determines the extent to which the training received
has transferred to the workplace. It requires surveying the trainee
and manager population that has undergone training, to determine
the extent to which changes in behavior and job performance –
competency improvements – have occurred as a result of the
training.
Level 4 (Organizational Impact)
Evaluation measures mission-related outcomes indicative of training
success (e.g., percentage of customer calls completed within 2 minutes
of call-pickup), as a result of organization members having attended
a training program.
Level 5 (Return on Investment)
Evaluation, while not a Kirkpatrick-devised level of evaluation,
is being used in private industry and considered for use in government.
However, in the absence of objective financial metrics in the government
environment, alternative non-financial approaches are also being
considered.
III. AN APPROACH TO AUTOMATED TRAINING EVALUATION
The automated evaluation program described in
this document consists of a centralized training evaluation process
based on collecting Level 1 Reaction data from class evaluations;
Level 2 Achievement test results from paper or online
tests administered throughout the country, or On-the-Job Training
Daily Observation Reports from Training Officers; and Level 3 Performance
survey data from training courses. Level 2 and 3 data, together
with measured differences in organizational performance indicators,
can then be used to measure the value and worth of training (Level
4).
Level 1–Reaction data typically
includes “learnability” data collected from trainees
(e.g., learning preferences, instructional delivery, job-relatedness,
training efficacy, and instructor competence) and “teachability”
data collected from instructors (e.g., adequacy of materials
and the instructional environment, preparedness of trainees), as
well as other reactions to the training event and its delivery.
Because many of the training sites are or will be remote from the headquarters
training location, much of the training will be delivered via computer
media, either offline or over the Internet. This, in turn, requires that
reaction data be collectable by a variety of means, including paper
scan forms, electronically distributed surveys, telephone data entry,
or completion of online training assessment forms.
Level 2–Achievement data is collected
from validated tests administered to determine the degree to which
trainees changed attitudes, improved knowledge or increased skills.
Implementation of this level of assessment requires that procedures
for test creation, validation, administration and scoring be in
place, to assure that tests administered are in conformity with
accepted principles of test validation, are administered in a consistent
manner, and are scored reliably and consistently, and the results
collected for use in curriculum improvement, training program assessment
and human resources decision making. The highly dispersed and repetitive
nature of many of the behaviors and procedures involved in conducting
field evaluations requires that much of the training assessed be
in the form of on-the-job training (OJT). This, in turn, requires
the development and delivery of observation and reporting tools that
are easy to use, highly customizable, easily replicable,
supervisor-friendly, and easily (or electronically) administered.
The use of PDAs
such as the Palm™ has proven to be effective, easy to learn,
and trouble-free. Performance evaluation data collected on the PDA
can be loaded to evaluation team laptop computers for review and
verification that all personnel have been tested. Once all data is reviewed,
it can be uploaded to a web-based application for merging and/or
reporting with other test data to determine certification. The reverse
path allows the list of authorized personnel to be tested to be
downloaded to each PDA. Embedding PDA identifiers enables association of test
scores with specific raters, which in turn enables subsequent
inter-rater reliability testing using kappa coefficients.
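The inter-rater check mentioned above can be illustrated with a minimal sketch. It assumes two raters have scored the same trainees on a categorical scale and uses Cohen's kappa, one common kappa coefficient for exactly two raters; the function name, the pass/fail scale, and the sample ratings are hypothetical and are not taken from any fielded system.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who scored the same trainees."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equal-length, non-empty rating lists")
    n = len(ratings_a)
    # Proportion of trainees on whom the two raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rater's category frequencies.
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical pass/fail observations of the same ten trainees by two raters,
# matched to raters through the PDA identifier embedded in each record.
rater_1 = ["pass", "pass", "fail", "pass", "fail",
           "pass", "pass", "fail", "pass", "pass"]
rater_2 = ["pass", "pass", "fail", "fail", "fail",
           "pass", "pass", "pass", "pass", "pass"]

kappa = cohen_kappa(rater_1, rater_2)
print(f"kappa = {kappa:.3f}")
```

A kappa near 1 indicates agreement well beyond chance, while a value near 0 indicates agreement no better than chance; the threshold at which raters would need recalibration is a policy choice the source does not specify.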
Level 3–Performance or Training Transfer
data requires focus on, and evaluation of, on-the-job behavioral
and performance changes as a result of the training event. Level
3 data is usually collected between 3 and 6 months after the training
event and involves both students’ and managers’ evaluation
of changes in job related activities. The dispersed and (in many
cases) non-automated nature of many government field sites requires
that Level 3 performance be observable, assessable, and collectable
via paper, electronic, or telephonic data collection means, as well
as web-based or email-collected data input.
Level 4–Impact on Organizational Performance
Indicators could be supported by an automated wizard for the collection
of measurement data at selected organizations for the conduct of Level 4 analyses.
Level 5–Return on Investment in Training
(Training Worth). A wizard with access to the measurement data
collected to measure impact on Organizational Performance indicators
in Level 4 could then assess the worth of the measured training
event under varying assumptions of training impact and cost savings.
The measured differences can be converted into dollar values and compared
with the cost of training to determine the worth of the training
analyzed.
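The comparison of dollar-valued benefits with training cost under varying assumptions can be sketched as follows. This is an illustration only: it uses the commonly cited ROI formula (net benefit divided by cost, times 100), and the function name, the dollar figures, and the three attribution scenarios are all hypothetical assumptions rather than values from the source.

```python
def training_worth(benefit_dollars, attribution, training_cost):
    """Return (net_benefit, roi_percent) for one attribution scenario.

    benefit_dollars: dollar value of the measured indicator improvement
    attribution:     assumed share of that improvement due to training (0..1)
    training_cost:   fully loaded cost of the training event
    """
    if training_cost <= 0:
        raise ValueError("training_cost must be positive")
    attributed = benefit_dollars * attribution
    net = attributed - training_cost
    return net, 100.0 * net / training_cost


# Hypothetical inputs: a $120,000 measured improvement and a $40,000 course.
annual_benefit = 120_000.0
cost = 40_000.0

# Varying assumptions about how much of the improvement training caused.
scenarios = {"conservative": 0.25, "moderate": 0.50, "optimistic": 0.75}

results = {
    name: training_worth(annual_benefit, share, cost)
    for name, share in scenarios.items()
}

for name, (net, roi) in results.items():
    print(f"{name:>12}: net benefit ${net:,.0f}, ROI {roi:.0f}%")
```

Running the scenarios side by side shows how sensitive the worth estimate is to the attribution assumption, which is why the wizard described above would report a range rather than a single figure.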
***
A proposed centralized, multi-level training evaluation
system would initially collect Level 1 and Level 3 data
through a combination of scan-able paper evaluation forms, electronic
data input, telephone response and transcribed handwritten comments
provided by students and instructors. The data inputs provided to
the Contractor would be stored in a relational database and reports
would be generated and distributed via email, secure Internet web
site and other means as required. Various roll-up and management
reports would be created as required.
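As a rough illustration of how responses stored in a relational database could feed a roll-up report, the sketch below uses an in-memory SQLite database. The table name, column names, course identifiers, and ratings are hypothetical; the source does not specify a schema or a database product.

```python
import sqlite3

# In-memory database standing in for the central evaluation store.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE evaluation_response (
        course_id TEXT    NOT NULL,
        level     INTEGER NOT NULL CHECK (level IN (1, 3)),
        source    TEXT    NOT NULL,  -- collection channel: scan, web, phone
        item      TEXT    NOT NULL,  -- survey question being rated
        rating    INTEGER NOT NULL CHECK (rating BETWEEN 1 AND 5)
    )
""")

# Hypothetical Level 1 and Level 3 responses arriving through several channels.
rows = [
    ("FLD-101", 1, "scan",  "instructor competence",     4),
    ("FLD-101", 1, "web",   "instructor competence",     5),
    ("FLD-101", 1, "phone", "job-relatedness",           3),
    ("FLD-101", 3, "web",   "applies skills on the job", 4),
    ("FLD-201", 1, "scan",  "adequacy of materials",     2),
    ("FLD-201", 1, "web",   "adequacy of materials",     4),
]
conn.executemany(
    "INSERT INTO evaluation_response VALUES (?, ?, ?, ?, ?)", rows
)

# Roll-up report: response count and mean rating per course and level.
rollup = conn.execute("""
    SELECT course_id, level, COUNT(*) AS responses,
           ROUND(AVG(rating), 2) AS mean_rating
    FROM evaluation_response
    GROUP BY course_id, level
    ORDER BY course_id, level
""").fetchall()

for course_id, level, responses, mean_rating in rollup:
    print(f"{course_id} Level {level}: n={responses}, mean={mean_rating}")
```

Keeping the collection channel as a column means paper, web, and telephone responses share one table, so management reports do not need to know how each response arrived.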
An automated Level 2 component for building,
administering, scoring and recording the results of valid and reliable
tests could be built and put in place. The system would initially
track/assess the process of isolating job-related tasks and objectives,
provide a facility for building and administering tests to pilot
audiences and to test populations, and receive test results via
scan-able forms, email, uploads from hand-held devices, and web-based
input. The proposed system could score the test automatically using
a pre-entered key and provide results to the individual being tested
and to the instructor. Rollup reports would be provided to management.
The system would include automated facilities for measurement of
test validity (including item analysis) and for measuring the reliability
of test results.
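Automatic scoring against a pre-entered key, together with a basic item analysis and reliability estimate, might look like the sketch below. It assumes multiple-choice items scored right or wrong, uses classical item difficulty (proportion correct), and uses the Kuder-Richardson 20 (KR-20) formula as one possible reliability measure; the source names no specific statistic, and the key, the responses, and all function names are hypothetical.

```python
def score_test(answer_key, responses):
    """Number of items answered correctly against the pre-entered key."""
    return sum(1 for given, correct in zip(responses, answer_key)
               if given == correct)


def item_difficulty(answer_key, all_responses):
    """Proportion of examinees answering each item correctly."""
    n = len(all_responses)
    return [
        sum(1 for resp in all_responses if resp[i] == correct) / n
        for i, correct in enumerate(answer_key)
    ]


def kr20(answer_key, all_responses):
    """Kuder-Richardson 20 reliability estimate for right/wrong items."""
    k = len(answer_key)
    n = len(all_responses)
    totals = [score_test(answer_key, r) for r in all_responses]
    mean = sum(totals) / n
    variance = sum((t - mean) ** 2 for t in totals) / n  # population variance
    if variance == 0:
        raise ValueError("total scores have zero variance")
    p = item_difficulty(answer_key, all_responses)
    pq = sum(pi * (1 - pi) for pi in p)
    return (k / (k - 1)) * (1 - pq / variance)


# Hypothetical four-item test and five examinees' answer sheets.
key = ["B", "D", "A", "C"]
sheets = [
    ["B", "D", "A", "C"],
    ["B", "D", "A", "A"],
    ["B", "C", "A", "B"],
    ["B", "C", "B", "A"],
    ["A", "C", "B", "D"],
]

scores = [score_test(key, sheet) for sheet in sheets]
difficulty = item_difficulty(key, sheets)
reliability = kr20(key, sheets)
print(scores, difficulty, round(reliability, 3))
```

Items with difficulty near 0 or 1 discriminate poorly between examinees and would be candidates for review during pilot testing; a real system would use far larger samples than this five-person illustration.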
The Level 4 component could collect measurement
data from organizations which have achieved a critical mass
of individuals completing the Level 2 and Level 3 stage gates.
A Level 5 component could record
the differences in value of various organizational performance indicators,
both before and after the measured training event,
and convert these differences to dollar values, to determine the
worth of the training, by Organizational Performance Indicator.