- Introduction to Standard Setting
- How to Set up a Standard Setting Meeting
- Different Standard Setting Techniques: Absolute and Compromise Methods
- Associated Legal Issues
Here we revisit our cyclical model of examination development. Once an examination has been assembled, we need to set a standard to determine the appropriate level of difficulty.
Recall your days in school when teachers set the universal "70%" as the minimum score needed to pass an exam. Although this percentage may be sufficient for a grade school science class, using this or any passing score without sound, research-backed methodology is the easiest way to invite legal challenges to your exam program. For all intents and purposes, 70% is just an arbitrary number. It is unlikely that your teachers were psychometrically determining that each test administered was equal in difficulty.
Standard Setting: The Basics
The purpose of standard setting is to determine the point that separates candidates who know the material from those who do not. We want to be confident that good performance on an examination will correlate with high competence on a job, for example. This is a difficult process and is open to much legal scrutiny. In fact, the passing standard is one of the most common targets of legal challenges to employment exams.
Norm-Referenced vs. Criterion-Referenced Examinations
Tests can be categorized into two major groups: norm-referenced tests and criterion-referenced tests. These two types differ in their intended purposes, the way in which content is selected, and the scoring process which defines how the test results must be interpreted.
A norm-referenced test is a type of test, assessment, or evaluation which yields an estimate of the position of the tested individual in a predefined population, with respect to the trait being measured. This estimate is derived from the analysis of test scores and possibly other relevant data from a sample drawn from the population. The term normative assessment refers to the process of comparing one test-taker to his or her peers. Norm-referenced examinations are designed to highlight achievement differences between and among students to produce a dependable rank order across a continuum of achievement from high achievers to low achievers. Schools might want to classify students in this way so that they can be properly placed in remedial or gifted programs. Norm-referenced examinations suit that purpose, but they are not appropriate for employment examinations.
In contrast, a criterion-referenced test is one that provides for translating test scores into a statement about the behavior to be expected of a person with that score or their relationship to a specified subject matter. The objective is simply to see whether the student has learned the material. Criterion-referenced exams report how well candidates are doing relative to a pre-determined performance level on a specified set of educational goals or outcomes included in the school, district, or state curriculum.
Norm-referenced methods are based on a comparison among the performances of examinees. Using this method, a set proportion of candidates fails regardless of how well they perform. For example, the top 70% pass. Here is a sample test score distribution that applies the norm-referenced method. The red shaded portion, which represents the lowest 30% of candidates, falls below the pre-determined cutoff point.
Criterion-referenced exams are used to determine whether or not candidates know the subject matter. Here is a test score distribution using the criterion-referenced method. You can see the intersection of the group that performed well and the group that performed poorly. The passing score is placed at this intersection, which is designed to be the point of minimum competence.
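The difference between the two approaches can be made concrete with a small calculation. The following is a minimal sketch in Python, not a prescribed procedure: the scores, the 30% failure share, and the fixed cut score of 65 are all hypothetical values chosen for illustration, and the function names are invented here.

```python
def norm_referenced_cut(scores, fail_percent):
    """Lowest passing score when a fixed percentage of candidates must fail.

    Assumes 0 <= fail_percent < 100. With tied scores at the cut, the
    realized failure share can be smaller than fail_percent.
    """
    ranked = sorted(scores)
    n_fail = len(ranked) * fail_percent // 100  # integer count of failing candidates
    return ranked[n_fail]


def pass_rate(scores, cut):
    """Share of candidates scoring at or above the cut score."""
    return sum(score >= cut for score in scores) / len(scores)


# Hypothetical score sets: a typical cohort and a stronger one (every score 20 points higher).
typical = [52, 58, 61, 64, 67, 70, 73, 77, 82, 90]
stronger = [score + 20 for score in typical]

CRITERION_CUT = 65  # hypothetical fixed standard, set independently of any cohort

for label, cohort in (("typical", typical), ("stronger", stronger)):
    norm_cut = norm_referenced_cut(cohort, 30)
    print(label,
          "norm cut:", norm_cut,
          "norm pass rate:", pass_rate(cohort, norm_cut),
          "criterion pass rate:", pass_rate(cohort, CRITERION_CUT))
```

In this sketch the norm-referenced cut moves with the cohort, so 30% of candidates fail even when every candidate in the stronger group scores above the fixed standard. The criterion-referenced pass rate, by contrast, rises when the cohort actually knows more.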
When organizations set a passing standard for their exam, there are several important factors they consider:
What is the difficulty of the exam?
Subject matter experts developing a high-stakes exam usually write questions with a wide range of difficulty, from very easy to very difficult. As such, it is advisable to document a quantitative evaluation of the exam difficulty, even if it is based on opinions from subject matter experts.
What is the minimum level of performance that can be considered "competent"?
Observations of employees on the job will typically yield a range of performances, from exceptional (the superstars) to inadequate (the incompetent). The goal is to establish the minimum set of knowledge and skills someone would need to possess in order to be labeled "competent" with regard to the job.
What is the experience level of the target population with the material being tested?
Another consideration is how much experience exam takers have with the exam content. If the exam contains concepts recently identified as important due to a change in job role or expectations, the target population may not have had the experience to effectively master them yet. As a result, the passing rate may be unreasonably low until proper training has occurred. Therefore, it is important to be aware of any new knowledge or skills being tested that would limit the performance of the target population due to experience or exposure.
What is the market demand for practitioners in the organization, industry, or public?
A passing standard set too high will produce too few practitioners to adequately meet the needs of the organization, industry, or public. If too few people are meeting the performance standard, it may be advisable to revisit its appropriateness with regard to the exam’s objectives.
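As an illustration of the first factor, a quantitative record of exam difficulty can be as simple as averaging subject matter expert ratings. The sketch below is one possible approach, not an established method: the 1-to-5 rating scale, the ratings themselves, and the function names are assumptions made for this example.

```python
from statistics import mean

# Hypothetical ratings: each row is one subject matter expert, each column
# one exam question, on an assumed scale of 1 (very easy) to 5 (very difficult).
sme_ratings = [
    [1, 3, 4, 2, 5],
    [2, 3, 5, 2, 4],
    [1, 2, 4, 3, 5],
]


def item_difficulty(ratings):
    """Average the expert ratings for each question (column)."""
    return [mean(column) for column in zip(*ratings)]


def exam_difficulty(ratings):
    """Average the per-question difficulties into one exam-level figure."""
    return mean(item_difficulty(ratings))


per_item = item_difficulty(sme_ratings)
overall = exam_difficulty(sme_ratings)

for number, difficulty in enumerate(per_item, start=1):
    print(f"Question {number}: mean rated difficulty {difficulty:.2f}")
print(f"Overall exam difficulty: {overall:.2f}")
```

Keeping the per-question figures alongside the overall figure documents the range of difficulty as well as the average, which is the kind of record the first factor calls for.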