Investigating validity and reliability in GCSE maths exams

Questions in GCSE mathematics examinations are often very short and isolated, because there is a high demand for questions that are easy to mark and give reliable results. However, there is concern that this kind of exam question encourages students to tackle each problem with a familiar step-by-step approach, at the cost of a deeper, conceptual understanding and appreciation of mathematics.

An alternative, the paired comparisons approach, requires examiners to make impressionistic, holistic decisions about which of two pieces of students' work is ‘better’ according to an agreed criterion. Each piece of work is compared with many others until a full ranking is achieved. There is evidence that this approach can improve validity and reliability in subjects other than mathematics.
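As a rough illustration of how many pairwise judgements can be combined into a single ranking, the sketch below fits a Bradley-Terry model, a common statistical choice for paired comparison data. This summary does not say which model the project used, so the model, the function name rank_scripts, the small pseudo-count and the sample judgements are all assumptions made for illustration.

```python
from collections import defaultdict


def rank_scripts(judgements, iterations=200):
    """Rank scripts from (winner, loser) judgements, best first.

    Illustrative Bradley-Terry fit using simple iterative updates.
    Assumes each judgement compares two different scripts.
    """
    wins = defaultdict(int)
    pair_counts = defaultdict(int)
    scripts = set()
    for winner, loser in judgements:
        wins[winner] += 1
        pair_counts[frozenset((winner, loser))] += 1
        scripts.update((winner, loser))

    # Start every script at the same estimated strength.
    strength = {s: 1.0 for s in scripts}
    for _ in range(iterations):
        updated = {}
        for s in scripts:
            denom = sum(
                n / (strength[s] + strength[other])
                for pair, n in pair_counts.items()
                if s in pair
                for other in pair - {s}
            )
            # Assumption: a small pseudo-count keeps scripts that never
            # win at a positive strength (not part of the plain model).
            updated[s] = (wins[s] + 0.1) / denom
        # Rescale so the strengths stay on a fixed overall scale.
        total = sum(updated.values())
        strength = {s: v * len(scripts) / total for s, v in updated.items()}

    return sorted(scripts, key=strength.get, reverse=True)


# Hypothetical judgements: each tuple is (preferred script, other script).
judgements = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "B"), ("B", "C")]
ranking = rank_scripts(judgements)
```

In practice each script would be judged against many others by several examiners, and the fitted strengths would also give a measure of how far apart neighbouring scripts are, not only their order.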

This project tested how robust the paired comparisons approach is for GCSE maths exams compared with traditional exam marking. The researchers conducted two day-long workshops and a small follow-up study. Twenty-three mathematics education professionals (including ten GCSE examiners) attended the workshops and used paired comparisons to rank-order GCSE scripts and scripts from more open-ended tasks.

The outcome was a rank order of scripts that correlated strongly with grades, and the rank orders produced in the two workshops correlated strongly with each other. In summary, the researchers found that the paired comparisons approach worked with existing GCSE scripts, for which it was not designed, and also worked for more open-ended tasks (Bowland, Functional Skills) designed to increase the validity of GCSE mathematics assessments.

Dr Jones is now leading a further Nuffield-funded project, exploring whether this approach can be scaled up cost-effectively and whether it can cope with poor pupil performance.

Project details



Dr Ian Jones, University of Nottingham

Funding programme


Grant amount and duration


1 September 2010 - 31 March 2011