Introduction to Training Evaluation

Why should organizations evaluate TD activities?  There are several reasons.

First, training is expensive. According to Training Magazine’s 2017 Training Industry Report, U.S. companies spent approximately $91 billion on training-related expenses. As with all organizational expenditures, there is a need to know that the money was spent wisely. Second, managers and trainers need to know that the training objectives were achieved. Third, organizations want to know if their processes are effective and efficient. Regarding TD processes, they want to know: a) if the training needs were diagnosed correctly in the TNA, and b) if the training design and implementation processes met employee needs. Like other organizational leaders, TD leaders are expected to manage their processes effectively and efficiently.

Organizations gather data and measure activities and processes that are important to their success, including sales, expenses, inventory, headcount, customer complaints, and much more. Measurement is an expectation of doing business. If TD professionals want to be business partners, they need to be able to measure the impact of their activities on the business, just as their line department counterparts do.

There are several ways that training departments evaluate and report training information.  They include metrics like:

· Number of people trained in a given period

· Hours per year that people spend in training

· Training courses completed

· Participant satisfaction with the training (end of course surveys)

· Individual mastery of the material (end of course test scores)

· Training budget and expenses

However, most of this data focuses on training activity rather than training effectiveness. Establishing metrics and gathering meaningful data that link training to increased productivity or return on investment (ROI) has traditionally been a roadblock for trainers. Increasing use of the Kirkpatrick Four-Level and Phillips Five-Level evaluation models has helped trainers become more evaluation- and impact-focused in organizations.

Formative and Summative Evaluation:

Trainers can conduct two types of evaluation – formative and summative. Formative evaluation takes place during the design and development of the training. The most commonly used type of formative evaluation is pilot testing. ‘Piloting’ the training means reviewing it with a group of potential trainees to get their input on the program. The trainees share feedback on the program with the trainer, and the trainer adjusts the training based on that feedback. Other stakeholders, such as managers, customers, and subject matter experts (SMEs), may also be asked to participate in formative evaluations to assess the content of and methods used in training programs.

Summative evaluations are usually conducted during the implementation phase or after the training concludes. They involve collecting data to determine if the trainees are learning, or if they are transferring their learning by changing their behavior back on the job. As we mentioned last week, the transfer of learning is an important consideration when trainers design, develop, and present the training. Everything is linked in the ADDIE process. The TNA (needs assessment) drives the objectives for the training. The objectives drive the design, development, and implementation of the training. The evaluation asks – did we achieve our objectives or satisfy the individual or organizational need for training?

We will focus the rest of the lecture on summative evaluation models.   

Summative Evaluation Models: Kirkpatrick and Phillips

Perhaps the best-known evaluation model was created by Donald Kirkpatrick. Kirkpatrick, a professor at the University of Wisconsin, originally published his work in 1959 in a series of ATD magazine (formerly ASTD magazine) articles. He is best known for his 1994 work entitled Evaluating Training Programs.

The book outlines his model, which is known as the Kirkpatrick Four-level Model of evaluation.  The four levels of evaluation are:

Level 1: Reaction to Training – what was the participant’s level of satisfaction with the training experience?

Level 2: Learning – what did the participants learn? Did they increase their knowledge, skills, or capabilities? This could be shown using an assessment.

Level 3: Behavior Change – what changes in behavior resulted from the participant’s application of the learning back on the job?

Level 4: Results or Impact – what was the impact on the business of the participant’s performance back on the job?

Along with Kirkpatrick, another legend in the training evaluation area is Jack J. Phillips, PhD. Phillips published a 1997 work entitled Return on Investment. In this work, Phillips described a fifth level called return on investment, or ROI. Phillips attempted to attach a monetary value to training. His fifth level involves a cost-benefit analysis of a training program. The five levels of the Phillips model are:

1. Reaction & Planned Action – Measures participants’ reaction to the program and outlines specific plans for implementation.

2. Learning – Measures skills, knowledge, or attitude changes of the participants.

3. Application and Implementation – Measures participants’ changes in behavior on-the-job and specific application and implementation.

4. Business Impact – Measures business impact of the program.

5. Return on Investment – Measures the financial return on the organization’s investment in a training program, or the net benefits of the program divided by cost of program.
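The Level 5 calculation can be illustrated with a short Python sketch. It follows the definition above (net benefits divided by program cost) and expresses the result as a percentage. The function name and the dollar figures are hypothetical and chosen only for illustration; in practice, the hard part is estimating the monetary benefits, not the arithmetic.

```python
def training_roi_percent(program_benefits: float, program_costs: float) -> float:
    """Return ROI as a percentage: net benefits divided by program costs, times 100."""
    if program_costs <= 0:
        raise ValueError("program_costs must be positive")
    net_benefits = program_benefits - program_costs
    return net_benefits / program_costs * 100


# Hypothetical figures: a program that cost $50,000 and produced
# an estimated $80,000 in monetary benefits.
roi = training_roi_percent(program_benefits=80_000, program_costs=50_000)
print(f"ROI: {roi:.0f}%")  # net benefits of $30,000 / cost of $50,000 = 60%
```

An ROI of 0% means the program exactly paid for itself; a negative value means the estimated benefits did not cover the costs.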

Because training in organizations is so diverse, ranging from short individual experiences to large classes, Phillips did not advocate evaluating all training.  His criteria for evaluation included programs that were:

· Highly expensive

· Strategic

· Operationally focused

· Highly visible in the organization

· Aimed at large target audiences

Evaluation Designs: 

Training evaluation involves collecting data to ascertain how effectively the training program met the needs identified in the TNA. The Kirkpatrick and Phillips models are the ones used most frequently by training practitioners, and other evaluation designs can be used in conjunction with them. For example, the posttest is a way of measuring whether the participant group achieved the learning outcomes. It is a level two evaluation that is often done with safety or some other type of compliance training where participants need to understand terminology or laws. It is also frequently used in certification training where a certain score is necessary to obtain a certificate.

A variation of the posttest design is the pretest and posttest.  In this design, the participant is given a test before and after the training.  The trainer sometimes uses this method to determine how much knowledge was acquired during the training program.
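As a minimal sketch of how a trainer might summarize a pretest and posttest, the Python snippet below computes each participant's gain and the group average. The scores are invented for illustration, and the approach assumes that both tests are scored on the same scale and that the two lists are in the same participant order.

```python
from statistics import mean

# Hypothetical scores (percent correct) for the same five participants,
# listed in the same order before and after the training.
pretest_scores = [55, 60, 48, 70, 62]
posttest_scores = [78, 85, 70, 88, 80]

# Each participant's gain is the posttest score minus the pretest score.
gains = [post - pre for pre, post in zip(pretest_scores, posttest_scores)]
average_gain = mean(gains)

print("Individual gains:", gains)
print(f"Average gain: {average_gain:.1f} points")
```

A positive average gain suggests that knowledge increased over the course of the program, although this design alone cannot show that the training caused the increase.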

A second variation of the posttest is the pretest/posttest with a comparison group. In this variation, outcomes are measured before and after the training and compared with those of a comparison group. This type of evaluation design is often used to compare the effectiveness of different types of training methodologies. For example, we could have one group attend leadership training in a blended format with an online and a classroom component. A comparison group could attend solely online. We could compare the test scores or reaction outcomes of the two groups after a certain time period to determine the impact of each methodology on retention.
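The blended versus online comparison described above could be summarized along the following lines. This is only a sketch: the group scores are made up, the helper function `average_gain` is a hypothetical name, and a real evaluation would also need to consider group size and whether the difference is statistically meaningful.

```python
from statistics import mean


def average_gain(pretest, posttest):
    """Average of (posttest - pretest) across participants in one group."""
    return mean(post - pre for pre, post in zip(pretest, posttest))


# Hypothetical test scores for a blended group and an online-only comparison group.
blended_pre = [50, 58, 62, 54]
blended_post = [80, 84, 88, 80]
online_pre = [52, 56, 60, 56]
online_post = [72, 78, 80, 74]

blended_gain = average_gain(blended_pre, blended_post)
online_gain = average_gain(online_pre, online_post)

# The difference between the two average gains estimates how much more
# (or less) one methodology improved scores than the other.
difference = blended_gain - online_gain

print(f"Blended group average gain: {blended_gain:.1f}")
print(f"Online group average gain: {online_gain:.1f}")
print(f"Difference in gains: {difference:.1f}")
```

Comparing gains rather than raw posttest scores helps account for groups that started at different knowledge levels.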


Like other parts of the organization, TD has had to develop ways to justify its value to the organization. TD professionals have become accustomed to using both data and the evaluation models developed by Donald Kirkpatrick and Jack Phillips to show that value. These models address the variety of training experiences that people encounter in organizations and help organizations understand the value that training adds to the individual and the organization.

