The work to build a more efficient and effective higher education information landscape, which kicked off with the 2011 White Paper, has been an especially interesting journey for me. As time has gone on I have come to realise that it is less about data, systems and processes and actually more about people, culture and mindset; it is a question of changing behaviours and relationships across the sector.
This includes changing attitudes to the value of data within institutions, changing behaviours on the part of data collectors, and having a better-informed conversation about burden.
The Data Landscape Steering Group
When HEDIIP – the change programme – concluded in 2016, we set to work implementing the elements of the architecture it defined. That work includes the HESA Data Futures programme, which will provide a platform for the rationalisation of data flows across the sector, and the implementation of the new HECoS subject coding system. The HEDIIP architecture emphasises the need for oversight and leadership of the data landscape in order to drive the changes in behaviours and improvements in data capabilities. The Data Landscape Steering Group has been established to fulfil that function. The group is made up of leading figures from providers and from the world of data collection, and it is supported by a broader Advisory Panel which focuses on more detailed and operational issues.
These groups have recently been working on the development of a code of practice for data collectors, which will complement the code of practice that higher education providers already have to adhere to when making data returns. The fruits of that work are now out for review and consultation, and this is an opportunity to shape and improve the relationship between data-collecting organisations and higher education providers.
Understanding burden
A key element of the proposed new code of practice is an approach to assessing the burden of data collections. This has been one of the most interesting aspects of this work and follows many years of discussion about the burden of data collection and a number of studies into the broader issues of regulatory burden in universities.
While there is general consensus that the burden of data collection should be reduced – even to the point of writing it into sections 64 and 65 of the Higher Education and Research Act 2017 – we do not have a common understanding of exactly what drives the cost of responding to requests for data, or of how to quantify that burden. The consultation sets out a model of burden assessment which attempts to address both of these points.
We think that there are two broad issues that drive the cost of responding to requests for data. The first driver is the burden associated with the data request. This will include things like the extent to which institutions need to capture new data or translate existing data to meet the demand; things that are additional to what they would otherwise be doing to satisfy their own business requirements.
The second driver is a gearing factor that reflects the efficiency and effectiveness of the provider’s existing data management and governance; the idea that a seemingly simple data request can sometimes be a painful experience for providers if data is not well managed within the institution. Previous work on data capabilities tells us that capability levels across the sector are variable; there are examples of good practice and there are areas with room for improvement.
The proposed approach to assessing burden includes improving the dialogue between data collectors and higher education providers, in order to develop a shared understanding of burden and an assessment of how it falls between the two drivers described above. The dialogue should also include an assessment of the value of the data that is collected, so that a more nuanced understanding of net burden can be reached.
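To make the shape of that model a little more concrete, here is a minimal illustrative sketch. The function name, the numbers, the multiplicative treatment of the gearing factor and the subtraction of value are all my own assumptions for the purposes of illustration; they are not part of the proposed code of practice or of any agreed methodology.

```python
def assess_net_burden(request_burden: float,
                      gearing_factor: float,
                      data_value: float) -> float:
    """Illustrative estimate of the net burden of a data collection.

    request_burden: effort driven by the request itself, e.g. capturing new
        data or translating existing data beyond normal business needs.
    gearing_factor: multiplier reflecting the provider's data management
        maturity (1.0 = well managed; higher values = more internal friction).
    data_value: the value the collected data returns to the provider/sector.
    """
    gross_burden = request_burden * gearing_factor
    return gross_burden - data_value

# Example: a modest request landing on a provider with weaker data governance
print(assess_net_burden(request_burden=10.0, gearing_factor=2.5, data_value=5.0))
```

The point of the sketch is simply that the same request can produce very different costs depending on the provider's own data capability, and that any honest conversation about burden also has to weigh the value of the data being collected.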
This consultation offers an opportunity to enhance and refine an approach to the behaviour of data collectors and the assessment of burden. If we are serious about reducing the burden of data collections then we need a shared understanding of what causes it, and how we might measure it.
I think this is right about the need to improve the dialogue between data collectors and institutions. A clearer idea of why something is being requested and the purposes it will be put to would lead to a far more productive discussion about how it should be collected and the issues that will arise. Historically, things have appeared in the HESA return without that clarity, and so the first year’s data is of debatable value. We all then sit in the post-collection seminar, Dan tells us what we should have been doing, and the data improves.
Far better to have the discussion in advance, agree coding frames and validation rules, and then implement. That would lead to much better data quality, as it is often the users in an institution who can identify the issues with a proposal.
Having said that, the idea of a code of practice for data collectors is a very positive step in what increasingly seems like a journey to a better, if scarier, place.