Validity and reliability are two important ideas that must be considered when thinking about assessment in any context, and especially in learning environments. An assessment is considered valid if it directly measures the stated outcomes, or, put another way, if it measures whether learners have learned what we wanted them to learn. For example, after completing a course on painting with watercolours, a valid assessment might ask learners to demonstrate different ways of layering colour to create texture. It would not be a valid assessment of the watercolour students to have them arm wrestle! This is why writing clear outcomes is so important. Conrad and Openo (2018, p. 65) make note of Yogi Berra’s famous quote here:
If you don’t know where you’re going, you might end up somewhere else.
More specifically, Finch and French (2019) explain that validity is about whether the evidence we have gathered supports a particular inference about what a learner has achieved. For instance, if we discover that learners who perform well on a licensing assessment for accountants make fewer mistakes in their accounting practice than learners who perform poorly on that assessment, then we can say that the assessment is valid. Our inference (that the learner is qualified to be an accountant) is supported by the evidence we gathered (their performance on the assessment); so it is not so much that the assessment itself is valid as that our inference is valid.
An assessment is considered reliable if its results are consistent. This usually means that, given the same learning conditions and activities, repeating an assessment would produce the same results. Finch and French (2019) describe this as consistency of results between repeated assessments, assuming the learners do not remember the previous assessments. Repeated assessments are often impractical for determining reliability because learners will remember their mistakes from previous attempts and correct them on subsequent ones, and they may also become fatigued (these two factors would have opposite effects on the results).
Reliability and validity can be complicated by many different factors. If learning conditions change in some way, or there are external factors impacting your learners, it will be difficult to know if your assessment is reliable unless you provide opportunities for retesting. Validity of assessments can be impacted by the assessment tool you use, how you design your assessment, and even the way in which any questions are asked.
Our assessment protocols tend to speak loudly about what we value most. If work habits and compliance are held as the highest standards, we tend to include marks for meeting due dates or demerits for missing them.
Effort is another area that appears at times in assessment data. How is this assessed? Everyone can learn, but effort may look different for each learner. How will you adjust your assessment to reflect the data you collect? Do multiple attempts at mastery (more effort) mean lower marks because it took longer to achieve mastery? Is a task completed quickly, with very little effort, rewarded because it arrived early (valuing work habits)? Effort is not something that can readily be assessed. A designer may instead include reflective questions that ask learners to consider their work over the course of the learning experience and to set goals for their next one.
If a designer chooses to include a pre-assessment before embarking on their planned instructional approach, they may gain a better understanding of their learners’ prior knowledge. This could reveal misconceptions to address, outcomes that need adjusting, and outcomes that are already met. The designer may also face the conundrum of discovering that their learners are in ten different places in their understanding, and of managing all of that data!
Types of Assessment
There are almost unlimited options for how we assess learning. The important consideration is whether the assessment aligns with the outcomes. If the outcome is to “be able to” do a particular task, the assessment should test the learner’s ability to do that task. Match the language of the assessment with the verbs used in the outcomes; using language from Bloom’s Taxonomy can assist a designer with this.