GCSE and A-level exams are cancelled for summer 2020 as a result of the Covid-19 outbreak. They’re being replaced with a form of teacher assessment.
Ofqual have decided this is the fairest system possible under the circumstances. However, there is reason for concern. Evidence shows that teacher assessment disadvantages certain ethnic minority and socio-economic groups compared with exams.
It is vital that we understand the impact of unconscious bias and do everything we can to limit it in the new system, for example by checking grades awarded against the results of national reference tests.
How students who can no longer sit GCSEs and A-levels this summer will get their grades
Schools and colleges are being asked by Ofqual to gather all available evidence of how each pupil would have done in their exams this summer. This includes school and college records, mock exams, and non-exam assessment (class and homework assignments) that a student has done. From this evidence, teachers will predict the grade a pupil would have achieved.
Next, they will rank each pupil relative to the other pupils with the same predicted grade. For example, if eight students were predicted a grade B, the teacher would rank them from 1 (the most likely to have achieved the grade) to 8 (the least likely).
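As a rough illustration of the structure this produces, the sketch below (in Python, with made-up pupil names, grades and a hypothetical 'security' score standing in for the teacher's judgement) groups pupils by predicted grade and orders them within each grade. It is not Ofqual's actual process, just one way of picturing the two steps described above.

```python
# Illustrative sketch only: hypothetical pupils, grades and scores.
# Step 1: each pupil has a teacher-predicted grade.
# Step 2: pupils sharing a predicted grade are ranked from most to
#         least likely to have achieved it in the exam.
from collections import defaultdict

predicted_grades = {
    "Pupil A": "B",
    "Pupil B": "A",
    "Pupil C": "B",
    "Pupil D": "B",
}

# A made-up numeric stand-in for the teacher's judgement of how secure
# each pupil is in their predicted grade (higher = more secure).
security = {"Pupil A": 0.90, "Pupil B": 0.80, "Pupil C": 0.60, "Pupil D": 0.75}

# Group pupils by predicted grade, then rank within each grade.
by_grade = defaultdict(list)
for pupil, grade in predicted_grades.items():
    by_grade[grade].append(pupil)

for grade, pupils in sorted(by_grade.items()):
    ranked = sorted(pupils, key=lambda p: security[p], reverse=True)
    for rank, pupil in enumerate(ranked, start=1):
        print(f"Grade {grade}, rank {rank}: {pupil}")
```

Run on the made-up data above, this would list Pupil B as the only grade A, and rank Pupils A, D and C first to third within grade B.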
Ofqual have opened a two-week consultation exploring issues such as how grades will be standardised and how appeals will work. Following this, Ofqual will release detailed guidance for schools.
This system raises some key issues that will need to be covered in any guidance.
Ranking pupils within grades may be particularly challenging for schools with large cohorts, including Further Education Colleges. It may present more challenges for certain subject teachers e.g. those in the creative arts.
It is very likely that the proposed assessment system will lead to significant workload for teachers, which may limit the time they have available to set work for pupils to do from home and support them in completing that work.
In addition, teachers are likely to be under pressure, both from within school and from pupils and their parents, to justify the grades that they allocate.
Most importantly, the guidance and the system must do everything it can to deal with the biggest problem with teacher-led assessment – unconscious bias.
Unconscious bias in teacher-led assessment
Ofqual note that they have chosen this approach because it is the fairest available in the circumstances and draws on the extensive knowledge that teachers have of their pupils.
But there is evidence that where teachers know the identity of the pupil when marking, some groups of students fare better and some worse.
We describe external marking as 'blind' because the person marking doesn't have any pre-existing expectations of how that student will do in the exam. Teachers who know the students will have those expectations, and they might be influenced by unconscious biases they hold.
Unconscious bias, bias arising from attitudes we don't know we hold, isn't specific to teachers. Most of us have unconscious bias: the majority of people who take a common unconscious bias test, the Implicit Association Test, are shown to hold some sort of bias.
What difference can 'blind' assessment make? In a study by Bristol University's Professor Simon Burgess and Ellen Greaves, the results of 11-year-old pupils in 'non-blind' internal assessment were compared with their outcomes in 'blind' exams. They found that pupils of Black Caribbean, Pakistani or Bangladeshi ethnicity were more likely than white pupils to score comparatively lower in subjective teacher assessment. The study shows a similar pattern for pupils eligible for free school meals (children from households eligible for income-related benefits).
Bias works both ways
In the Burgess and Greaves study, unconscious biases did not play out in the same way across all subjects and were sometimes favourable for certain ethnic minorities.
For example, for Chinese pupils, the difference in scores between teacher and external assessment in English is similar to that of white pupils. However, in mathematics, Chinese pupils are significantly less likely than white pupils to be scored lower by their teacher than by an external test. In this case, it looks like teachers' expectations that Chinese pupils will perform highly in mathematics influence how they mark assessments.
The study finds that when judging performance, teachers use information about the performance of pupils from the same demographic group in that school in previous years.
The authors note that if pupils from a certain group are aware of being undermarked, this may lead to disengagement from learning and reduced 'effort', creating a vicious cycle of academic underachievement. If early assessments are used to inform scores later in a pupil's school career, the performance gap affecting ethnic minority groups – which is already pronounced – is only likely to widen further.
Teacher bias manifests in other ways too. Research on setting and streaming from Professor Paul Connolly, Dr Becky Taylor, Professor Becky Francis and others shows that teachers misallocate pupils from some minority ethnic groups. For example, Black pupils were 2.4 times more likely to be misallocated into lower sets than white pupils. The study also finds gender disparity in setting practices. Girls were 1.5 times more likely to be misallocated to lower sets in maths than boys.
Studies in favour of more teacher-led assessments
These studies demonstrate a strong rationale for using more objective data in pupil assessment. Typically, that objective data comes from exams. However, there has been rising concern about high stakes exams incentivising teaching to the test and negatively affecting pupils’ mental wellbeing.
That’s one of a few reasons that people have been arguing for less exams and more teacher assessments.
Research commissioned by the Department for Education looking at how to curb the teacher recruitment and retention crisis found that teachers want more autonomy, and recommended that they be given more freedom over planning and marking. So, the theory goes, fewer exams and more teacher assessment could help schools keep teachers.
A 2019 study from King's College London argued for fewer exams and more teacher assessment. The authors compared final test scores to teacher predictions and past exam scores. Exam scores were the most accurate way to predict future performance, but the results also showed that teacher assessments were 90% accurate.
The authors argued that the gap was down to exam effects such as nerves and lack of confidence. They argued that teacher assessment would allow teachers to regularly monitor pupils' progress, spot where students encounter difficulties, and adapt their teaching accordingly.
How to limit the impact of unconscious bias when determining grades
There are studies both for and against more teacher assessment, but there is a consensus that teacher assessment introduces some bias compared with external, 'blind' assessment. The evidence also suggests that some groups are systematically disadvantaged.
So, if teacher assessment is going to take the lead this year, it’s vital that we do everything we can to counter biases.
In February and March each year, 14,000 pupils from randomly selected schools participate in the National Reference Test, which is a test of English and maths skills.
If teachers from these schools were to complete their assessments of pupils’ likely performance in GCSEs as soon as possible, these results could be compared to the results these pupils gained in the NRT they sat just before lockdown began, and in previous tests.
An analysis by ethnic group, for example, could reveal whether some groups of pupils are unintentionally undermarked by teachers compared with their national reference test scores. If this were found to be the case, grades could be moderated accordingly to ensure that pupils are not unnecessarily disadvantaged by the unexpected change in assessment system.
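To picture what such a check might involve, here is a minimal sketch in Python (using the pandas library) with entirely made-up numbers and generic group labels. It simply compares teacher-assessed grades with the grades implied by NRT performance and averages the gap by group; a real moderation exercise would of course be far more careful and would use actual pupil-level records.

```python
# Illustrative sketch only: made-up grades and generic group labels.
import pandas as pd

data = pd.DataFrame({
    "group":         ["Group 1", "Group 1", "Group 2", "Group 2"],
    "teacher_grade": [6, 5, 4, 5],   # teacher-assessed GCSE grade (9-1 scale)
    "nrt_expected":  [6, 5, 5, 6],   # hypothetical grade implied by NRT score
})

# Negative gaps mean the teacher assessment sits below the NRT benchmark.
data["gap"] = data["teacher_grade"] - data["nrt_expected"]
group_gaps = data.groupby("group")["gap"].mean()
print(group_gaps)
```

In this toy example, Group 2's average gap is -1 while Group 1's is 0, which is the kind of pattern that would prompt a closer look and, potentially, moderation of grades for the affected group.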
It is crucial that the question of unconscious bias is tackled if we are to limit the long-term knock-on effects on the educational and life chances of the most disadvantaged pupils.
Find out more about the measures the RSA is suggesting to ensure that all children have access to quality learning experiences over the coming months.
Comments
Aside from unconscious bias due to ethnic group, there could be other reasons. The argument that 'teachers know their students so would be a fairer way to allocate grades' is not indisputable. What about cases where a student in year 10 wasn't putting in the effort and hence grades were not great, so the teachers have a 'view' of them? Fast-forward to Year 11 and the student has put in a lot of extra work during lockdown. Is the teacher really going to believe that they can jump 3 or 4 grades? Are they going to be able to put aside any pre-conceived notions and give them a fair grade? That is where independent exams are fairer and fully impartial.
I'd be interested to know whether the ethnicity/gender of the teacher was considered. Is there any difference in their assessments? Also, the IAT has had a lot of criticism as it performs poorly and doesn't correspond to real-life actions.
Here's a video we made about unconscious bias; it covers some of the things you mention above: https://youtu.be/RhqMEiTVICU Hope you like it.