Research Data Management: Introduction

Research Data Management (RDM) Library guide

Introduction to Research Data Management workshop

This is a recording of the Introduction to Research Data Management workshop which is delivered as part of the Doctoral College's Researcher Development Programme and the RKE Development Framework offered by Organisational Development.

Outline

In this session the terms 'research data', 'open data' and 'research data management' are defined. The rest of the session is divided into three parts:

  • The benefits of good Research Data Management - why it is needed
  • Data Management Plans (DMPs) - what you need to do to get started
  • Online resources and tools available to you - where to go for support

Everything covered in this workshop is available to read or is linked to from this guide.

What is research data?

Most funders will have their own definition of research data, reflecting disciplinary differences. A cross-disciplinary definition would be the evidence collected or created as part of an academic study. It is the data that would be necessary to validate research conclusions, or which may be of interest to future researchers seeking answers to new questions.

The definition of research data includes primary, secondary, derived (or compiled), bibliographic and administrative data.

Primary data

This is data generated by researchers. For example, it would include direct measurements, survey responses, interview transcripts etc.

Secondary data

Secondary data is produced when primary data is processed for subsequent analysis. For example, it would include datasets which have been anonymised or 'cleaned-up'.

This would also include any software or source code used to process the primary data.

Derived or compiled data

These are datasets produced from existing sources. For example, a new dataset produced by combining multiple third party datasets. Again, it would include any software or source code used to do this.

Information derived from multiple secondary sources also falls into this category e.g. a database of information extracted from historical manuscripts or literary works. 

Bibliographic data

This is data which has been published and curated, usually as part of a managed collection. It may include any of the above or archival collections of items such as historical documents, images, audio-visual recordings etc. 

Administrative data

Data collected as part of the research project for governance or auditing purposes. For example, completed participant consent forms.

 

Acknowledgements - This definition draws on the following sources:

University of Cambridge, 2019. FAQ [online]. Cambridge: University of Cambridge. Available from: https://www.data.cam.ac.uk/faq [Accessed 01 November 2019].
University of Reading, 2019. Research data defined [online]. Reading: University of Reading. Available from: http://www.reading.ac.uk/RES/rdm/about/res-research-data-defined.aspx [Accessed 08 November 2019].

What is Research Data Management (RDM)?

Research Data Management (RDM) is concerned with how the data generated during research is handled throughout the research life-cycle and preserved for future reuse. The aim is to encourage good practice and maximise the value and impact of the data.


The research life-cycle model can be used to explain how good data management applies at every stage of a research project and the benefits to different stakeholders of doing so.

The segments
Each segment or slice represents a stage in the research process, from planning for data collection through to making it available for discovery and re-use. Questions that need answering to maximise value and impact might include:

  • What do I need in place before collecting the data? For example, ethical approval or funding for suitable data storage? (Plan and Design)
  • What collection methods are being used? How will data be captured and recorded? Have these processes been properly documented? (Collect and Capture)
  • Has the methodology or code needed to interpret and analyse the data been properly documented? (Interpret and Analyse)
  • The data that's been collected and analysed needs to be properly managed. For example, have effective file naming conventions been followed? Has the data been backed up? Is the data secure? (Manage and Preserve)
  • Which repository would be most appropriate for publishing the data? What restrictions (if any) are needed? What licence would be most appropriate for the data? (Release and Publish)
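Two of the Manage and Preserve questions above (file naming and backups) can be checked with a short script. The following is a minimal sketch rather than a recommended BU standard: the naming pattern (project, date, description, version) and the function names are assumptions made for illustration, and any real convention should be the one agreed in your Data Management Plan.

```python
import hashlib
import re
from pathlib import Path

# Hypothetical convention, for illustration only:
#   <project>_<YYYYMMDD>_<description>_v<NN>.<ext>
# e.g. survey_20200123_responses-clean_v01.csv
NAME_PATTERN = re.compile(r"[a-z0-9]+_\d{8}_[a-z0-9-]+_v\d{2}\.[a-z0-9]+")


def follows_convention(filename: str) -> bool:
    """Return True if the whole file name matches the assumed pattern."""
    return NAME_PATTERN.fullmatch(filename) is not None


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_matches(original: Path, backup: Path) -> bool:
    """Return True if the backup exists and has the same checksum as the original."""
    return backup.is_file() and sha256_of(original) == sha256_of(backup)
```

For example, follows_convention("final data (2).xlsx") returns False under this assumed pattern, while backup_matches(original, backup) only returns True when the copy is byte-for-byte identical to the original. Comparing checksums rather than file sizes or dates is a design choice: it detects silent corruption as well as missing files.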

The stakeholders
The different stakeholders are written along the outer edge. They benefit from good research data practice in a number of ways:

  • The Academic Community benefits from increased access to quality data for new research and from the ability to validate research findings, which helps to tackle the reproducibility crisis. Data citations could also be used as evidence of impact for the Research Excellence Framework (REF).
  • The Research Councils (now UKRI, formerly RCUK) and other funders benefit from value for money, as their investment goes further if the data is available for other researchers to analyse. Many funders now mandate that data be managed effectively (through production of a Data Management Plan) and that data be made as openly available as possible after the project is completed.
  • Public confidence in academic research may be improved by greater transparency and by the availability of the data which underpin research findings. There is also the argument that publicly funded research should be available for the public to access.

Jones, K. 2011. Research360: managing data across the institutional research lifecycle [model]. Edinburgh: Digital Curation Centre.

Benefits of FAIR and Open Data

The FAIR principles of data management ensure that data is Findable, Accessible, Interoperable and Reusable. Together, they aim to maximise the utility and value of research data, and they are applicable throughout the research life-cycle. All research data produced by researchers at BU should align with the FAIR principles, and the guidance on our Managing Data and Sharing Data pages aligns with them. If you are interested in finding out more about FAIR, the principles are explained in greater detail by the GO FAIR Initiative.

Open Data is data which has been made publicly available with few or no restrictions. BU is committed to the UKRI 'Common principles on data policy' to make publicly funded research data 'openly available with as few restrictions as possible in a timely and responsible manner'. However, FAIR data does not necessarily mean the data has to be open. Degrees of 'FAIR-ness' are recognised (Wilkinson et al. 2016). For example, highly sensitive and personally identifiable data may need access restrictions imposed. The common principles also recognise legal, ethical and commercial constraints. To maximise the benefits, BU operates on the principle "as open as possible, as closed as necessary" (Horizon 2020).   


Benefits to researchers include:

  • Data becomes available for researchers to make new discoveries by either combining datasets or asking new questions. Good data practice avoids the difficulties highlighted in the video!
  • Increase citations! Data deposited in a research data repository is citable, and could potentially increase the profile of your work. Data citations would also demonstrate research impact.
  • It has the potential to make research easier and to help avoid disasters (such as data loss or a data protection breach) by keeping data organised, secure and understandable.
  • Helps to tackle the 'reproducibility crisis' through effective documentation. The data can also help with validating research findings.
  • Research funding. Funders increasingly require researchers, as a condition of funding, to demonstrate good data practice and to make data openly available at the end of the project. Familiarising yourself with the information in this guide could help improve your chances of success when bidding.

BU Research Data Policy

Access BU's Research Data Policy, effective from 23 January 2020.

BU Code of Good Research Practice

Research data also forms part of the BU Code of Good Research Practice (January 2019). The relevant sections include:


Principles of good practice in research (Chapter 2)

  • Ensure research designs, methodologies, data, findings and results are open to scrutiny (subject to appropriate confidentiality...)
  • Ensure the accuracy, security, accessibility and completeness of data and results…
  • Ensure data and results are retained and deleted/destroyed in accordance with all legal, ethical, funding body and University requirements.
  • Contribute to and promote the open exchange of ideas, research methods, data and results and their discussion, scrutiny and debate, subject to any considerations of confidentiality…


Collaborative working & international research (Chapter 6)

  • Try to anticipate any issues that might arise as a result of working collaboratively and agree jointly in advance how they might be addressed, communicating any decisions to all members of the research team (e.g. data management and sharing, intellectual property)
  • Contact RDS (funded research) so that an appropriate agreement can be put in place which clearly outlines the specific roles of the researchers involved in the project and on issues relating to intellectual property, publication, data management and the attribution of authorship.


Leadership and supervision (Chapter 7)

  • Academic members of staff who will undertake doctoral supervision must be clear about their roles and responsibilities which are set out in 8A Code of Practice for Research Degrees. In particular, as Doctoral Supervisors, they must ensure their PGR(s) adheres to all BU's policies, procedures and regulations including but not limited to, research ethics, health and safety, academic misconduct, copyright, research data management, data protection and intellectual property rights.


Research design (Chapter 9)

  • When designing research projects, researchers should ensure that … The design and conduct of the study, including how the data will be gathered, analysed and managed, are set-out in detail in a pre-specified research plan or protocol.