There is still an argument about whether school league tables, despite their well-known side effects, actually improve the performance of pupils by ‘holding schools to account’. This is despite careful analysis of the extensive data collected by Government on all pupils in England via the National Pupil Database (NPD); details can be found at http://cmm/team/hg/full-publications/2011/predicting-probabilities-school-choice.pdf. This research, the latest in a sequence, shows not only that the uncertainty surrounding value added scores for schools is so large that most schools cannot reliably be separated, but also that such scores are of hardly any use to parents choosing schools for their children, since they barely predict future school performance. Yet still there are reports that claim to
demonstrate that public league tables do in fact raise pupil
performance. So, despite the fact that they appear to be unreliable
measures of real school performance, they nevertheless, somehow, manage
to improve overall results. We need, therefore, to take the claims
seriously, especially when they are published in peer-reviewed journals.
One of the most recent, by Professor Simon Burgess and colleagues (http://www.sciencedirect.com/science/article/pii/S0047272713001291), compares trends in public examination scores over time in Wales, where league tables were dropped in 2001, and in England, which has continued to
publish them. The authors conclude that “If uniform national test
results exist, publishing these in a locally comparative format appears
to be an extremely cost-effective policy for raising attainment and
reducing inequalities in attainment.” Of course, schools are about much
more than test and exam scores, and this does need to be borne in mind.
Essentially, what the authors do is compare the difference in GCSE results between Wales and England over the period 2002–2008 and show that the difference between England and Wales increases steadily over time, in contrast to the period before 2002, when there were no differential trends. The authors are careful to try to rule out causes other than the abolition of league tables in Wales for this trend, testing a series of assumptions against carefully constructed and complex statistical models, but are left concluding that it is indeed the abolition of the league tables that has placed Wales at an increasing disadvantage compared to England.
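To make the design concrete: the comparison is in essence a difference-in-differences analysis with a trend interaction. The following is a minimal sketch in Python, assuming hypothetical pupil-level data (the file name and variables are my own illustrative assumptions, not the authors’ actual specification), of the kind of model at stake; the coefficient on the Wales-by-trend interaction is what carries the claim of a steadily widening gap.

    # A minimal illustrative sketch, not the authors' actual model.
    # Assumed data: a hypothetical file "gcse.csv" with one row per
    # pupil and columns score (GCSE points), wales (1 = Wales,
    # 0 = England) and year (examination year).
    import pandas as pd
    import statsmodels.formula.api as smf

    df = pd.read_csv("gcse.csv")
    post = df[df["year"] >= 2002].copy()   # post-abolition period
    post["trend"] = post["year"] - 2002    # years since league tables were dropped in Wales

    # score ~ country gap + common trend + differential Welsh trend.
    # A steadily widening gap appears as a negative wales:trend coefficient.
    fit = smf.ols("score ~ wales + trend + wales:trend", data=post).fit()
    print(fit.summary())

Even in this stripped-down form, the causal reading of the interaction term rests entirely on the assumption that nothing else was changing differentially between the two countries over the period, and that assumption is precisely what is at issue in what follows.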
As the authors recognise, a major problem with studies of this kind, sometimes rather misleadingly (as is done here) referred to as ‘natural experiments’, which attempt to infer causation from correlated trends across time, is that so many other things are also changing over the period that one can never be sure that important alternative explanations have been ruled out. In this note I want to suggest that there are indeed
alternative explanations other than the abolition of league tables for
this increasing difference and that the authors’ conclusion as quoted
above simply is not justified by the evidence. First let me deal with a
few ‘technical’ issues that have been overlooked by the authors. A basic
assumption when public examination scores are used for comparisons is
that there is an equivalence of marking and grading standards across
different examination boards, or at least in this case that any
differences are constant over time. Yet this is problematic. When the league tables and the associated key stage testing were abolished in Wales, there was no longer any satisfactory way in which such common tests could be used to establish and monitor examination standards in Wales, where most pupils sat Welsh Joint Education Committee (WJEC) examinations, as opposed to England, where only a small minority took WJEC exams. There is therefore some concern that comparability may have changed over time.
The authors of the paper, unfortunately, do not take this problem very seriously, merely stating that “The National Qualifications Framework ensured that qualifications attained by pupils across the countries were comparable during this period”.
One of the ways in which the authors might have tested their ‘causal’ hypothesis would have been to divide England into regions and to study the comparative trends in each of these, in order to see whether Wales really was different, but this does not seem to have occurred to them.
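Such a regional comparison is easy to sketch. Assuming, hypothetically, region-level mean scores by year (again, the file and variable names are my own illustrative assumptions), one could refit the same trend model treating each English region in turn as a placebo ‘treated’ unit; if untreated regions routinely show differential trends as large as the Welsh one, the causal interpretation is undermined.

    # A hedged sketch of the regional placebo comparison suggested above.
    # Assumed data: a hypothetical file "regions.csv" with columns
    # region, year and score (mean GCSE points) for each English
    # region and for Wales.
    import pandas as pd
    import statsmodels.formula.api as smf

    df = pd.read_csv("regions.csv")
    df = df[df["year"] >= 2002].copy()
    df["trend"] = df["year"] - 2002

    for unit in df["region"].unique():
        df["treated"] = (df["region"] == unit).astype(int)
        fit = smf.ols("score ~ treated + trend + treated:trend", data=df).fit()
        # The interaction term is this unit's trend relative to everyone else.
        print(f"{unit}: differential trend = {fit.params['treated:trend']:.3f}")

If the Welsh estimate does not stand out from this distribution of placebo estimates, the league-table explanation loses much of its force.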
The major omission in the paper, however, is that the authors fail to mention that, at the same time as the league tables were stopped, so was the testing; because pupils became less exposed to tests, they were arguably less well equipped for the examinations too. This is, admittedly, somewhat speculative, but we do know that the ability to do well on tests is in part strongly related to the amount of practice that pupils have been given, and it would be somewhat surprising if this did not also extend to the public exams. Interestingly, when piloting for the reintroduction
of regular testing in Wales took place in 2012, there was evidence that
the performance of pupils had deteriorated as a result of not being
tested intensively during their schooling. It has also been suggested
that this lack of exposure to testing is associated with a relative
decline in PISA test scores. So here we have a very plausible mechanism,
ignored by the authors, that, if you believe it to be real, explains
the relative Welsh decline in exam results. It may have nothing to do with the publishing of league tables, but rather with the lack of test practice. Of
course, if this is in fact the case it may have useful implications for
schools in terms of how they prepare pupils for exams. I would argue,
therefore, that we should not take seriously these claims for the
efficacy of league tables. I believe that there is no strong evidence
that their publication in any way enhances pupil performance and
furthermore that their well-understood drawbacks remain a powerful
argument against their use.