For over a decade the OECD has been promoting its Programme for International Student Assessment (PISA) as a robust means of comparing the performance of educational systems across a range of countries. The latest results of tests on 15-year-olds were published in early December and the British government, along with many others in Europe and elsewhere, reacted as usual with shock and horror. Welsh politicians immediately, but with no evidence, put the blame for their ‘slide down the international league table’ on their abandonment of testing eight years ago. Both Labour and Coalition spokespeople predictably blamed their rivals’ policies for England’s ‘mediocre’ performance – again with no possible evidence. What has often been termed ‘PISA shock’, or more accurately ‘PISA panic’, has accompanied past releases, and politicians of all persuasions, in many countries, have used the ‘evidence’ of movements up and down the tables to justify changes to their own curricula or assessment systems. Thus Finland, which consistently comes towards the top, has been held up as a model to follow: if you come from the right you are likely to emphasise the ‘formality’ of the Finnish curriculum to justify ‘traditional’ approaches, while if you hail from the left you can find yourself pointing to the comprehensive nature of the Finnish system to justify reinstating comprehensivisation in England. The reality, of course, is that we simply do not know which characteristics of the Finnish system may be responsible for its performance, nor indeed whether we should take much notice of these comparisons at all, given the weaknesses that I shall point out. Similar remarks apply to Shanghai, whose performance is scarcely believable. I don’t want to go into detail about the technical controversies that surround the PISA data. 
Suffice it to say that there is a growing literature pointing out that PISA takes a vastly oversimplified view of what counts as performance in reading, maths and science. Research shows that countries cannot be ranked unequivocally along a single scale and that they differ along many dimensions. Thus, in a comparison of France and England, my colleagues and I were able to show that different factors were at work in each system. This is further complicated by the way the systems are differently structured, with up to a third of pupils in French schools repeating a year at some stage, compared with very few in England. There is good evidence that the process of translating the PISA tests from one language to another is problematic, so there is no assurance that the ‘same things’ are being assessed in different educational systems. Detailed analysis of the International Adult Literacy Survey has shown how much translation can depend on context, and that in many cases it is virtually impossible to achieve comparable difficulty for translated items. PISA does in fact attempt to eliminate items that appear very discrepant in terms of how pupils in different countries respond to them. The problem, however, is that this tends to leave a ‘lowest common denominator’ set of items that fails to reflect the unique characteristics of different educational systems. Most importantly, PISA is a one-off ‘cross-sectional’ snapshot in which each 15-year-old pupil in the sample is tested at a single point in time. No attempt is made (except in a few isolated countries) to relate pupils’ test scores to earlier scores so that progress through the educational system can be studied. 
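For readers who want to see what this kind of item screening amounts to in practice, here is a deliberately crude sketch. The items, figures and threshold below are all invented, and PISA’s actual procedure is more sophisticated (it rests on item-response-theory scaling rather than raw proportions correct), but the underlying logic – drop any item whose difficulty varies too much from country to country – is the same:

```python
# Crude sketch of cross-country item screening (NOT PISA's real method, which
# uses item-response-theory fit statistics). We flag any item whose
# proportion-correct varies too much across countries and drop it.
from statistics import pstdev

# Hypothetical data: per-country proportion correct for each item.
item_stats = {
    "item_A": {"England": 0.62, "France": 0.60, "Finland": 0.65},
    "item_B": {"England": 0.70, "France": 0.30, "Finland": 0.85},  # discrepant
    "item_C": {"England": 0.48, "France": 0.52, "Finland": 0.50},
}

def screen_items(stats, max_sd=0.10):
    """Keep items whose cross-country spread in difficulty is below max_sd."""
    kept = []
    for item, by_country in stats.items():
        if pstdev(by_country.values()) <= max_sd:
            kept.append(item)
    return kept

kept = screen_items(item_stats)
# item_B is removed: its spread across countries far exceeds the threshold,
# yet it may be precisely the item that reflects a distinctive national
# curriculum - which is the 'lowest common denominator' worry.
```

The point of the toy example is that the surviving items are, by construction, the ones on which countries look most alike.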
This is a severe handicap when it comes to making any ‘causal’ inferences about the reasons for country differences, and in particular to comparing educational systems in terms of how much progress pupils make given their attainments when they start school. Often known as ‘value added’ analysis, this provides a much more secure basis for any kind of causal attribution. The OECD has in the past refused to implement any ‘longitudinal’ linking of pupils’ data across time, although this may be changing. PISA still talks of using the data to inform policymakers about which educational policies may be best. Yet the OECD itself points out that PISA is designed to measure not merely the results of different curricula but something more general about the performance of 15-year-olds, and that such performance will be influenced by many factors outside the educational system as such, including economic and cultural ones. It is also worth pointing out that researchers interested in evaluating PISA’s claims by reanalysing the data are severely handicapped by the fact that, apart from a small handful, it is impossible to obtain details of the tasks given to the pupils. These are kept ‘secure’ because, the OECD argues, they may be reused to make comparisons across time. This is a rather feeble excuse, and not a procedure adopted in other large-scale repeated surveys of performance. It offends against openness and freedom of information, and obstructs users of the data from properly understanding the nature of the results and what they actually refer to. Again, the OECD has been resistant to moving on this issue. So, given all these caveats, is there anything PISA can tell us that would justify the expense of the studies and the effort that goes into their use? The answer is perhaps a qualified yes. 
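As a rough illustration of what ‘value added’ analysis involves, the sketch below simulates the situation PISA cannot address: each pupil has both an intake score and a later score. We regress later scores on intake scores and compare two hypothetical schools by their average residuals. Everything here is simulated – the schools, the scores and the 5-point effect are invented for illustration, and real value-added models are considerably more elaborate:

```python
# Minimal value-added sketch (illustrative only): regress pupils' later scores
# on their intake scores, then average the residuals per school. A school's
# mean residual is its 'value added' - progress beyond what intake predicts.
import numpy as np

rng = np.random.default_rng(0)
n = 200
school = rng.integers(0, 2, n)            # two hypothetical schools, 0 and 1
intake = rng.normal(50, 10, n)            # score on entry to school
# In this toy model school 1 adds 5 points of genuine progress.
later = 10 + 0.8 * intake + 5 * school + rng.normal(0, 3, n)

# Ordinary least squares of later score on intake score.
slope, intercept = np.polyfit(intake, later, 1)
residual = later - (intercept + slope * intake)

value_added = {s: residual[school == s].mean() for s in (0, 1)}
# School 1's mean residual comes out roughly 5 points above school 0's,
# recovering the built-in effect. A cross-sectional snapshot of 'later'
# alone could not separate school effects from intake differences.
```

The contrast with a single cross-sectional test is the whole point: without the intake scores, a school (or country) full of high-attaining entrants looks ‘effective’ whether or not it adds anything.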
The efforts that have gone into studying translation issues have given insights into their difficulty and provided pointers to the reservations that need to be borne in mind when interpreting the results. This is not something the OECD highlights, since it would somewhat detract from the appeal of simple country rankings, but it could nevertheless be valuable. The extensiveness of the data collected, including the background socio-economic characteristics of the pupils and information about curriculum and schools, is impressive, and with the addition of longitudinal follow-up data could be quite valuable. What is needed, however, is a change of focus by both the OECD and the governments that sign up to PISA. As a suitably enhanced research exercise devoted to understanding how different educational systems function, what the unique characteristics of each one are, and how far it may be legitimate to assign any differences to particular system features, PISA has some justification. If its major function is to produce country league tables, however, it is uninformative, misleading, very expensive and difficult to justify. The best thing policymakers could do when the results are published would be to shrug their shoulders, ignore the simplistic comparisons that the media will undoubtedly make, and work towards making PISA, and similar studies such as TIMSS, more useful and better value for money.