Concerns about UK grade inflation are now such a regular feature in the media that most of us are probably in the terminal stages of grade inflation fatigue. The term itself, however, is interpreted in two different ways, leading to two (intertwined) debates, each requiring a different policy response.
To this end, it might help to distinguish between grade inflation and classification inflation. In the process we can also look at one issue not being debated, namely how fairly (or otherwise) UK universities calculate their degree classifications.
Mark or grade inflation
A US student would call it their “grade” for a module; a UK student would call it their “mark”. Fortunately, the interpretation of inflation is near universal: grade/mark inflation is the tendency of academic grades/marks for work of comparable quality to increase over time. However, what is being reported in the UK as “grade inflation” cannot be interpreted as mark inflation, because we are not talking about changes in marks. The reported issue concerns the degree classifications awarded to students.
Defining classification inflation
Classification inflation, on the other hand, is the tendency for the number of higher degree classifications (1sts and 2:1s in the UK) to increase over time, independent of any increase in the marks underpinning those classifications.
Much like the US Grade Point Average (GPA), the UK degree classification is not a “mark”: it is an average of marks. If a large number of UK universities change their degree algorithms, the proportion of 1sts and 2:1s will change independently of any change in the module marks. That is to say, there can be classification inflation without any grade/mark inflation.
Alternatively, if over time that average is calculated in a consistent manner across all UK HE institutions, then any increase in the number of 1sts and 2:1s awarded could be taken as a reasonable proxy for mark inflation. In the US context, the widespread use of the GPA to summarise a student’s overall achievements generally meets this consistency criterion. When our US counterparts talk of grade inflation they are indeed referring to an inflation in the marks being awarded to students – marks that then inflate the calculated GPAs.
UK degree algorithms
In the UK however there is no such consistency. The autonomy granted to UK universities means that for each university a 1st or 2:1 is what they say it is. The result is significant differences in the degree algorithms used by UK universities. These generally differ in three ways:
- The number of years used in the calculation
- The weightings given to those years (or differential weighting)
- Whether lower marks are ignored (or discounted)
Bear in mind that UK students typically study 120 credits per year, with the credits evenly split between modules; for example, a student might study six 20-credit modules in each year. Therefore, those universities that use discounting will only use the marks for the best 100 credits – or the best 90 credits – in each year of study.
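To make the mechanics concrete, here is a minimal Python sketch of discounting (assuming six equally weighted 20-credit modules per year; the function name is illustrative, not any university’s actual code):

```python
def discounted_average(module_marks, modules_to_drop=1):
    """Average of a year's module marks after discounting the lowest.

    With six 20-credit modules, dropping the lowest one is equivalent
    to using the best 100 of 120 credits for that year.
    """
    kept = sorted(module_marks, reverse=True)[:len(module_marks) - modules_to_drop]
    return sum(kept) / len(kept)

# Student X's second-year marks from the worked example below:
print(discounted_average([43, 47, 50, 56, 60, 60]))  # 54.6, vs 52.7 undiscounted
```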
The table below lists the algorithms for 41 randomly selected universities. While the table cannot be taken as representative of the sector wide distribution it does illustrate this diversity in degree algorithms.
[Table omitted: degree algorithms of 41 UK universities. Source: Allen (2018)]
We can immediately see the complexity of these different algorithms. In truth, most lecturers – let alone students – do not understand their own institution’s degree algorithm. The vast majority of universities use only year two and three marks. The justification for the higher weighting on the final year is that it captures the student’s “exit velocity”: the standard at which the student is performing as they graduate from university.
The impact of this diversity in degree calculations
The impact on the classifications awarded is dramatic and best explained using a worked example. The marks for students X and Y are shown below; both study degrees comprising 18 modules, each worth 20 credits (this makes the arithmetic a little easier).
STUDENT X

| First year | Mark | Second year | Mark | Third year | Mark |
|---|---|---|---|---|---|
| Module 1 | 54 | Module 7 | 43 | Module 13 | 66 |
| Module 2 | 57 | Module 8 | 47 | Module 14 | 71 |
| Module 3 | 63 | Module 9 | 50 | Module 15 | 74 |
| Module 4 | 65 | Module 10 | 56 | Module 16 | 75 |
| Module 5 | 72 | Module 11 | 60 | Module 17 | 75 |
| Module 6 | 80 | Module 12 | 60 | Module 18 | 81 |
| Year average | 65.2 | Year average | 52.7 | Year average | 73.7 |
| | | Best 100 credits | 54.6 | Best 100 credits | 75.2 |

Degree outcome from five different algorithms:

| University | A | B | D | E | F |
|---|---|---|---|---|---|
| Algorithm | 1 | 3 | 8 | 9 | 14 |
| Years used | Y1, Y2, Y3 | Y2, Y3 | Y2, Y3 | Y2, Y3 | Y2, Y3 |
| Total credits | 360 | 240 | 240 | 200 | 200 |
| Discounting | None | None | None | 20 credits per year | 20 credits per year |
| Weighting [Y2 : Y3] | Equal (all years) | 50 : 50 | 20 : 80 | 50 : 50 | 20 : 80 |
| Degree mark | 63.8 | 63.2 | 69.5 | 64.9 | 71.1 |
| Classification | Low 2:1 | Low 2:1 | High 2:1 | Low 2:1 | First |

Difference in degree mark [F − A] = 7.2 percentage points
STUDENT Y

| First year | Mark | Second year | Mark | Third year | Mark |
|---|---|---|---|---|---|
| Module 1 | 54 | Module 7 | 56 | Module 13 | 60 |
| Module 2 | 60 | Module 8 | 58 | Module 14 | 65 |
| Module 3 | 62 | Module 9 | 60 | Module 15 | 67 |
| Module 4 | 65 | Module 10 | 60 | Module 16 | 68 |
| Module 5 | 70 | Module 11 | 61 | Module 17 | 70 |
| Module 6 | 76 | Module 12 | 65 | Module 18 | 74 |
| Year average | 64.5 | Year average | 60.0 | Year average | 67.3 |
| | | Best 100 credits | 60.8 | Best 100 credits | 68.8 |

Degree outcome from five different algorithms:

| University | A | B | D | E | F |
|---|---|---|---|---|---|
| Algorithm | 1 | 3 | 8 | 9 | 14 |
| Years used | Y1, Y2, Y3 | Y2, Y3 | Y2, Y3 | Y2, Y3 | Y2, Y3 |
| Total credits | 360 | 240 | 240 | 200 | 200 |
| Discounting | None | None | None | 20 credits per year | 20 credits per year |
| Weighting [Y2 : Y3] | Equal (all years) | 50 : 50 | 20 : 80 | 50 : 50 | 20 : 80 |
| Degree mark | 63.9 | 63.7 | 65.9 | 64.8 | 67.2 |
| Classification | Low 2:1 | Low 2:1 | Mid 2:1 | Low 2:1 | High 2:1 |

Difference in degree mark [F − A] = 3.3 percentage points
For each student, the second table shows the degree mark and classification they would receive from five different universities (A, B, D, E and F), each using a different algorithm (1, 3, 8, 9 and 14 respectively). The average for each year is shown, along with the average of the best 100 credits (where the marks for modules 7 and 13 are discounted); these discounted averages are the ones used by universities E and F (algorithms 9 and 14).
In this example, the degree marks and classifications are similar under algorithms 1, 3 and 9 – both students would achieve a low 2:1. However, had both students studied at university F, student X would receive a 1st while student Y would receive a high 2:1.
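The numbers in the tables can be reproduced with a few lines of Python – a sketch, assuming equally weighted 20-credit modules and using the university/algorithm labels from the tables above:

```python
def year_avg(marks, drop=0):
    """Average of a year's marks, discounting the lowest `drop` modules."""
    kept = sorted(marks, reverse=True)[:len(marks) - drop]
    return sum(kept) / len(kept)

def degree_mark(years, weights, drop=0):
    """Weighted average of the year averages; `weights` sums to 1."""
    return sum(w * year_avg(y, drop) for w, y in zip(weights, years))

student_x = ([54, 57, 63, 65, 72, 80],   # year 1
             [43, 47, 50, 56, 60, 60],   # year 2
             [66, 71, 74, 75, 75, 81])   # year 3

# university/algorithm label -> (year weights, modules discounted per year)
algorithms = {"A (1)":  ((1/3, 1/3, 1/3), 0),
              "B (3)":  ((0.0, 0.5, 0.5), 0),
              "D (8)":  ((0.0, 0.2, 0.8), 0),
              "E (9)":  ((0.0, 0.5, 0.5), 1),  # best 100 of 120 credits per year
              "F (14)": ((0.0, 0.2, 0.8), 1)}

for label, (weights, drop) in algorithms.items():
    print(label, round(degree_mark(student_x, weights, drop), 1))
# -> A (1) 63.9, B (3) 63.2, D (8) 69.5, E (9) 64.9, F (14) 71.1
# (the table reports A's mark as 63.8; the underlying value is 63.87)
```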
The initial implication is that, for both students, the choice of university can play a big role in the degree classification they eventually achieve (all other things being equal). The ancillary concern is whether the 1st that student X would receive from university F is representative of what most people think a 1st would ‘look like’.
But is the current problem one of classification inflation?
Concerns for equity apart (we’ll return to these below), if over the years a significant proportion of UK universities gradually changed their degree algorithms – for example, moving from that used by University B to that of University F – then we could expect to see a rise in the number of 1sts and 2:1s (irrespective of any increase in the marks used by the algorithms).
And this gradual change in UK algorithms has indeed occurred. The Higher Education Academy (HEA) found that nearly half of the institutions it surveyed in 2015 (98 in total) had changed their award algorithms in the previous five years so as not to disadvantage their students in comparison with those at similar institutions. Reporting the HEA findings, HEFCE (2015) commented that those within the sector are rather comfortable with current approaches – adding that there is little evidence of an effective counter-narrative to regular media claims of “grade inflation” in undergraduate degrees. Some commentators are more forthright: Nick Hillman (director of HEPI) believes universities are essentially massaging the figures, changing their algorithms to improve the marks of borderline candidates.
Nevertheless, given that most universities have now changed their degree algorithms, classification inflation will probably become less of a problem in the immediate future, which leaves us with the issue of equity.
The potential for inequality
For UK universities, which cherish their autonomy, the differences in the degree outcomes for students X and Y are irrelevant; the issue of fairness or comparability is not up for discussion.
Notwithstanding this, academics – including Woolf and Turner (1997), Curran and Volpe (2003), Yorke et al. (2004), Yorke et al. (2008) and, more recently, Allen (2018) – have examined the impact of this diversity in UK degree algorithms. The emerging view is that it is difficult to justify a system where the same marks will lead to a different outcome based purely on the student’s choice of university.
Methodologically, this research is straightforward: take the marks of an existing set of students and apply a range of algorithms to see what happens to the distribution of degree classifications. In the process, it is possible to estimate the proportion of students who would achieve a different classification. These proportions range from 15% to 38%, depending on the date of the analysis, the sample size, and the range of algorithms used.
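As a sketch of that method (the classification boundaries are the conventional UK ones and the function names are illustrative assumptions; real studies also have to handle each institution’s borderline rules):

```python
def classify(mark):
    """Map a degree mark to a classification using the conventional UK
    boundaries (an assumption; institutions also apply borderline rules)."""
    if mark >= 70: return "First"
    if mark >= 60: return "2:1"
    if mark >= 50: return "2:2"
    if mark >= 40: return "Third"
    return "Fail"

def proportion_reclassified(cohort, algorithms):
    """Share of students whose classification differs across algorithms.

    `cohort` is a list of per-student mark records; each algorithm is a
    function mapping a record to a degree mark.
    """
    changed = sum(1 for student in cohort
                  if len({classify(alg(student)) for alg in algorithms}) > 1)
    return changed / len(cohort)
```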
So we must ask why some students could receive a different classification under a different algorithm – while others (the majority) would not. The answer relates to consistency in mark attainment, and to how the different algorithms treat that consistency.
Discounting only the lowest marks accommodates those students whose module marks in any one year (for whatever reason) are inconsistent in a negative way; the student with consistent module marks will not benefit from this discounting, as their module marks are all similar. Likewise, a higher weighting on year three marks will be of little consequence for students whose year two and three marks are similar. In the examples above, student Y’s marks could be described as consistent compared with student X’s, which is borne out by the smaller difference in the calculated degree mark between universities F and A (algorithms 14 and 1).
For student X, the potential of a 1st from University F arises solely because discounting has removed the lower marks in both years while the difference in weightings has exaggerated the impact of the year three marks. This is a perverse outcome. While all university lecturers would welcome a student’s recovery in mark attainment (well done, student X!), it is likely they would not want to see this rewarded in preference to other students whose mark attainment is laudable for other reasons.
The policy response?
At one level the problem could be seen as a lack of standardisation in degree algorithms – as we’ve seen, the same set of marks can attract a different classification depending on which university a student attends. At another level, the problem is linked to the design choices within those algorithms – discounting and differential weighting – which have the effect of advantaging one group of students over another.
The HEFCE (2015) consultation paper asked whether guidance should be developed on a sensible range of degree classification algorithms at the pass/fail and 2:1/2:2 borderlines. The inference is clear: the current arrangements with UK degree algorithms are not particularly sensible and a more standardised approach would be desirable for the majority of stakeholders.
In this respect, Andrew Wathey’s (2018) observation (as chair of UKSCQA) that a convergence in algorithms might be a solution to grade inflation is a promising development. Nevertheless, if equity is a meaningful policy aspiration, any such standard algorithm should not involve discounting or differential weighting.
The response made by Universities UK (UUK) to the HEFCE consultation paper was blunt: UUK does not believe that it would be proportionate for the review to explore standardised degree algorithms. Instead, UUK would prefer to ensure that there are not patterns of gearing in degree algorithms that might be introducing grade inflation pressures. How this could be achieved, however, is not explained.
There is one simple solution to this potential (if not long-running) impasse which also allows universities to retain their cherished autonomy. Universities could continue classifying their students’ achievements using the traditional classification, according to their preferred algorithm, but they would also be obliged to publish each student’s GPA score based on all of that student’s marks across all years of study. The benefits are obvious and numerous and – to me – any suggestion that such a dual system might confuse people is disingenuous and patronising.
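As a sketch of what such dual reporting might look like (here the “GPA” is simply the unweighted mean of every module mark on the UK percentage scale – the exact scale and rounding rules a real scheme would use are open design choices):

```python
def simple_gpa(all_module_marks):
    """Unweighted mean of every module mark across all years of study:
    no year weightings, no discounting, nothing hidden."""
    return sum(all_module_marks) / len(all_module_marks)

# Student X from the worked example (18 modules across three years):
marks_x = [54, 57, 63, 65, 72, 80,    # year 1
           43, 47, 50, 56, 60, 60,    # year 2
           66, 71, 74, 75, 75, 81]    # year 3
print(round(simple_gpa(marks_x), 1))  # 63.8, published alongside the
# classification from the university's own preferred algorithm
```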
Comments

Very good post that presents actual data and analysis, includes sources, and presents a balanced assessment.
The final point is pretty unassailable.
Not sure GPA helps us out – how we (universities) arrive at a ‘mark’ for any given module is itself confused. Some universities, for example, allow mitigation to be applied in order for students to improve their mark; others only allow it in the case of failure. Then there’s how many attempts (or not) students are allowed, and how they are capped (or not); the penalties set for malpractice; rounding (of individual module marks, of the final outcome, or both); how trailing modules are treated, capped or counted towards the merit mark. And so on.
As well as weighting, discounting and number of years, I’d include upgrade rules!
I’d also note Bachan’s work (University of Brighton) and his use of stochastic frontier modelling. Generalising (apologies), he shows that 80% of ‘inflation’ is down to evidenced university efficiencies (many of our institutions have been around a while, and you’d hope we had got better at what we do, especially given the money ploughed into teaching quality, i.e. FDTL, NTFS, TQEF, CETLs etc.) – so he argues it is grade improvement. But 20% cannot be easily explained, which may be down to ‘inflation’ – but equally may be down to efficiencies we have yet to identify…
Katie A, thank you for your detailed comments. The GPA would help in that it is a simple average of all modules studied (no weightings, no discounts) and it’s easily understood. With a GPA, if module marks were being inflated (over time) we would see a rise in 1sts and 2:1s.
You are right when you note all the other things universities ‘take into consideration’ when awarding degree classes – the most significant being the treatment of borderline decisions (see Allen 2018 for a discussion).
Thank you also for pointing me in the direction of Bachan (2015) [https://www.tandfonline.com/doi/pdf/10.1080/03075079.2015.1019450]. Looking at the variables used in this paper, it appears to be measuring classification inflation – not grade inflation – simply because module marks are not used in the analysis.
Notwithstanding that, we would hope that some of the rise in 1sts and 2:1s is down to more motivated students and better provision by universities, but I think we have to accept that changes in degree algorithms (over the last 10 years) are the principal driver of recent classification inflation.
Thank you for replying, David – I suspect, unfortunately, we (universities) wouldn’t just ‘do’ GPA without weighting, discounting etc – I think we’d just ‘translate’ our current system into it!
I’d agree very much on borderline decisions – it’s odd we say 70% is the boundary for a First but then give a student with an average of 68% a First, simply because of an upgrade rule…!
Yes, I read (and enjoyed) your piece (and used it for a presentation I did on GI for QAA recently).
I’m not sure what the solution is… But I am sure I know what it isn’t – which starts with the Reform report and Tom Richmond’s curious proposals!
Agreed – the Reform report is rather odd! I think if people really want a HE classification system that is more robust, transparent, fair and simple, then they will have to impose it on universities … over to the NUS, OfS and Sam Gyimah …
Hoopla!
I’m surprised that there is no reference in this very interesting and valuable article to the work of the Burgess Group and the development of the Higher Education Achievement Report (www.hear.ac.uk). The original intention of this extensive exercise was to create an alternative approach to degree classification which would provide objective and validated information about what a student has achieved on a degree programme. It was hoped by some of us involved that degree classes would wither on the vine. According to the HEAR website, nearly half a million HEARs have been awarded already. The five-point degree classification is the ultimate dead parrot of HE, which institutions wilfully refuse to recognise as a hopelessly outdated and misleading way of sorting sheep from goats. There is no point in spending billions of pounds to get disadvantaged young people into HE if they are going to be re-disadvantaged on exit by the discredited classification system. The Government seems wilfully blind to the inherent deficiencies of the system, not just its operation, and needs to wake up to this shocking anomaly.
Unfortunately, Peter, the word limit prevented me from including any additional discussion of the HEAR – which, given its purpose and scope relative to the traditional classification, is indeed a great shame. Hopefully this work might help to bolster the wider adoption and acceptance of the HEAR as the definitive statement of a student’s achievements.
A fascinating and thought-provoking article – thank you.
The description of the LSE degree-awarding algorithm seems out of date, though; the current rules are:
“The classification of each student shall be based on nine ’classification marks’, comprising:
– the marks achieved in all eight second and third year papers
– a ninth mark being the average of the best three marks in first year papers”
This indicates that each first-year paper carries a third of the weight of a second- or third-year paper, with the lowest first-year mark discounted. There are then various rules about how these nine marks combine into a classification, found here: https://info.lse.ac.uk/Staff/Divisions/Academic-Registrars-Division/Teaching-Quality-Assurance-and-Review-Office/Assets/Documents/Calendar/SchemeBA-BSC-InOrAfter2007-08-OtherThanFourYear.pdf
Laura: Thank you for this correction… it is a tricky task sorting out all the variation in HE algorithms; nevertheless, I apologise for any misunderstanding.