Wiliam (1996) offers a model that starts from Messick’s four-facet model (1) of validity (subsequently (1996) enhanced to six facets) and applies it to the National Curriculum. Wiliam’s analysis has much to offer when looking at assessment at 16. He takes Messick’s distinction between the evidential and the consequential in assessment and adds Moss’s (1992) interpretative basis to the former. Assessment validity needs to be looked at through the evidence, the interpretation and the impact (consequence). For each of these two bases – evidential/interpretive and consequential – Wiliam then builds on Messick’s other dimension of within- and beyond-domain inferences.
Wiliam then examines each of the four zones in turn.
In regard to within-domain inferences, Wiliam explains the work of Popham and others in trying to establish valid tests that test all, and only, the domain that is intended to be tested. The concluding criticism of the validity of NC tests may well apply to any external traditional examination – they are unrepresentative of the domain because of their length compared to the length/volume of the learning.
For beyond-domain inferences Wiliam cites the predictive nature of the use of test results: high performance in X predicts high performance in Y. He cites Guilford in saying that it doesn’t matter how this correlation is arrived at, merely that it is reliable. The test might not be valid, though, as it may not be in the same domain. For ICT at 16 there may be aspects of the achievement that are given far greater importance than perhaps they should be. A learner gets Key Skills level 2 in ICT (2), therefore s/he is functionally literate in ICT – it doesn’t matter how the level 2 was achieved.
Within-domain impact is of particular importance to the design of ICT assessments, I believe. Hence the move towards onscreen testing – it’s ICT, so the technology must be used to assess the capability. In Wiliam’s words, it “must look right” (p. 132).
Finally, Wiliam considers beyond-domain impact, or consequence. In looking at National Curriculum testing, Wiliam argues, some of the validity is driven (or driven away) by beyond-domain impacts such as league tables – these are much higher stakes for schools than for learners, and so the validity of the assessment is corrupted.
(1) Messick, “Validity,” 20; Lorrie A. Shepard, “Evaluating Test Validity,” in Review of Educational Research, ed. Linda Darling-Hammond (Washington, DC: AERA, 1993), 405–50; cited in Orton (1994).
(2) The functional/key skill component of ICT learning is referred to as IT.
10/01/07 Post on Embretson (1983)
11/01/07 Post on Moss (1992)