This is a recording of the Introduction to Research Data Management workshop which is delivered as part of the Doctoral College's Researcher Development Programme and the RKE Development Framework offered by Organisational Development.
In this session the terms 'research data', 'open data' and 'research data management' are defined. The rest of the session is divided into three parts:
Everything covered in this workshop is available to read or is linked to from this guide.
Most funders will have their own definition of research data, reflecting disciplinary differences. A cross-disciplinary definition would be the evidence collected or created as part of an academic study. It is the data that would be necessary to validate research conclusions, or which may be of interest to future researchers seeking answers to new questions.
The definition of research data includes primary, secondary, derived and bibliographic data.
Primary data is data generated by researchers. Examples include direct measurements, survey responses and interview transcripts.
Secondary data is produced when primary data is processed for subsequent analysis. For example, it would include datasets which have been anonymised or 'cleaned-up'.
This would also include any software or source code used to process the primary data.
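As an illustration only, the following minimal Python sketch shows how primary survey responses might be turned into a secondary dataset by removing direct identifiers and tidying the values. The field names, the salt value and the helper functions are hypothetical and do not come from any BU system. Replacing identifiers with a salted hash is pseudonymisation rather than full anonymisation, so a real project would need to assess re-identification risk separately.

```python
import hashlib

# Hypothetical primary data: raw survey responses containing direct identifiers.
primary_responses = [
    {"name": "Alice Example", "email": "alice@example.org", "age": "34", "answer": " Agree "},
    {"name": "Bob Example", "email": "bob@example.org", "age": "", "answer": "disagree"},
]

# Fields assumed (for this sketch) to identify a participant directly.
DIRECT_IDENTIFIERS = {"name", "email"}


def participant_id(record, salt):
    """Derive a stable pseudonymous ID from the email address and a secret salt."""
    digest = hashlib.sha256((salt + record["email"]).encode("utf-8")).hexdigest()
    return "P-" + digest[:8]


def to_secondary(record, salt):
    """Return a cleaned copy of the record without direct identifiers."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    cleaned["participant_id"] = participant_id(record, salt)
    # Convert age to an integer, using None where the response was left blank.
    cleaned["age"] = int(cleaned["age"]) if cleaned["age"] else None
    # Normalise free-text answers so that " Agree " and "agree" compare equal.
    cleaned["answer"] = cleaned["answer"].strip().lower()
    return cleaned


secondary_responses = [to_secondary(r, salt="project-secret") for r in primary_responses]
```

Under this definition, both the resulting `secondary_responses` dataset and the script that produced it would count as research data.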
Derived or compiled data
These are datasets produced from existing sources, for example a new dataset produced by combining multiple third-party datasets. As with secondary data, this category includes any software or source code used to produce them.
Information derived from multiple secondary sources also falls into this category, e.g. a database of information extracted from historical manuscripts or literary works.
This is data which has been published and curated, usually as part of a managed collection. It may include any of the above or archival collections of items such as historical documents, images, audio-visual recordings etc.
Data collected as part of the research project for governance or auditing purposes. For example, completed participant consent forms.
Acknowledgements - This definition was inspired by the following sources:
University of Cambridge, 2019. FAQ [online]. Cambridge: University of Cambridge. Available from: https://www.data.cam.ac.uk/faq [Accessed 01 November 2019].
University of Reading, 2019. Research data defined [online]. Reading: University of Reading. Available from: http://www.reading.ac.uk/RES/rdm/about/res-research-data-defined.aspx [Accessed 08 November 2019].
Research Data Management (RDM) is concerned with how the data generated during research is handled throughout the research life-cycle and preserved for future reuse. The aim is to encourage good practice and maximise the value and impact of the data.
The research life-cycle model can be used to explain how good data management applies at every stage of a research project and the benefits to different stakeholders of doing so.
Each segment or slice represents a stage in the research process, from planning for data collection through to making it available for discovery and re-use. Questions that need answering to maximise value and impact might include:
The different stakeholders are written along the outer edge. They benefit from good research data practice in a number of ways:
The FAIR principles of data management ensure that data is Findable, Accessible, Interoperable and Reusable. Together, they aim to maximise the utility and value of research data, and they apply throughout the research life-cycle. All research data produced by researchers at BU should align with the FAIR principles, and the guidance on our Managing Data and Sharing Data pages aligns with them. If you are interested in finding out more about FAIR, the principles are explained in greater detail by the GO FAIR Initiative.
Open Data is data which has been made publicly available with few or no restrictions. BU is committed to the UKRI 'Common principles on data policy' to make publicly funded research data 'openly available with as few restrictions as possible in a timely and responsible manner'. However, FAIR data does not necessarily mean the data has to be open. Degrees of 'FAIR-ness' are recognised (Wilkinson et al. 2016). For example, highly sensitive and personally identifiable data may need access restrictions imposed. The common principles also recognise legal, ethical and commercial constraints. To maximise the benefits, BU operates on the principle "as open as possible, as closed as necessary" (Horizon 2020).
Benefits of FAIR and Open Data
Benefits to researchers include:
Access BU's Research Data Policy, effective from 23 January 2020.
Research data also forms part of the BU Code of Good Research Practice (January 2019). The relevant sections include:
Principles of good practice in research (Chapter 2)
Collaborative working & international research (Chapter 6)
Leadership and supervision (Chapter 7)
Research design (Chapter 9)