Research Data Management: Selecting data to deposit

Research Data Management (RDM) Library guide

BORDaR is BU's research data repository. Research data is the evidence collected or generated during the course of research with a view to its use as a basis for research findings. Examples include interview transcripts, survey responses, images, observational data and results from experiments. BU staff and postgraduate researchers are required under the Research Data Policy to deposit their data in BORDaR (or another suitable data repository) following the end of the research project.


Selecting data for deposit

The Data Management Plan (DMP) completed at the start of the project should have identified any data selected for long-term preservation. This means any data needed to validate research findings, or which could be valuable to future research. For example, if research findings in a thesis or publication are based on interviews, the transcripts would be needed for anyone to re-run the analysis. In addition, if a researcher decides to conduct similar research in 10 years' time, the transcripts could be useful for comparative analysis.

However, it may not be appropriate to deposit everything:

  • There may be legal or ethical limitations to data sharing which take precedence. For example:
    • It may not be possible to fully anonymise the data.
    • Contractual agreements, trade secrets or commercial sensitivity may prohibit data sharing.
    • The data may be high-risk in some other way.
    • The dataset may include third-party data.
  • The cost of long-term storage means it is necessary to be selective. For example:
    • Some data can be easily reproduced provided all the necessary documentation is made available. Where that is the case, it is not necessary to deposit the data itself. For example, the output from a computer simulation could be reproduced if all the relevant inputs (source data, software requirements, code etc.) are documented and made available instead.
    • Some formats are larger and more costly than others to store, so consider which formats are necessary to preserve. For example, interview data may have been recorded and transcribed. The audio file will be much larger than the text file. Unless the research relies on the recorded sounds (e.g., a study on regional accents), it would be better to deposit only the transcription. This would make anonymising the data easier too!
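
For reproducible outputs such as the computer simulation mentioned above, one way to document the inputs is a small manifest file deposited alongside the code. The following Python sketch is illustrative only and is not a BORDaR requirement: the function names, the manifest fields and the idea of recording checksums are all assumptions made for this example.

```python
import hashlib
import json
import platform
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(input_files, parameters):
    """Describe the inputs someone would need to re-run a simulation.

    The field names here are hypothetical, chosen for illustration.
    """
    return {
        "python_version": platform.python_version(),
        "parameters": parameters,
        "inputs": [
            {
                "file": Path(p).name,
                "bytes": Path(p).stat().st_size,
                "sha256": sha256_of(p),
            }
            for p in input_files
        ],
    }


def write_manifest(manifest, destination):
    """Save the manifest as human-readable JSON."""
    Path(destination).write_text(json.dumps(manifest, indent=2), encoding="utf-8")
```

A researcher could call `build_manifest` with the list of source data files and a dictionary of simulation settings, then deposit the resulting JSON file with the code in place of the much larger simulation output. The checksums let a future user confirm they are starting from the same input files.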