All research projects create or use data, which are necessary to support or validate findings, observations and outputs. Naturally, data types vary across disciplines, as do the techniques used to collect them. The management of research data (regardless of the form or the media in which they exist) is essential for good research practice, ensuring that data are organised, preserved and, in the long term, reusable.
Coventry University recognises that good research data management is fundamental to high quality research and academic integrity. These LibGuides pages provide guidance and tools to support the management, re-use and preservation of research data.
Research Data Lifecycle. Highlights the different stages that research data may encounter in a project, and indicates systems and services that are required to support data management.
©Jisc and Bonner McHardy (page last updated 2/10/17)
Many funding bodies require that their funding recipients create and follow plans for managing data, storing or preserving it in the long term, and sharing some, or all data products with the public. The Digital Curation Centre (DCC) has provided a convenient Overview of Individual Funders' Data Policies.
Individual councils and other funding agencies vary in the degree of planning and explanation they expect at the bidding stage, but some, for example the Wellcome Trust, have very strong expectations:
"All those seeking Wellcome Trust funding should consider their approach for managing and sharing data at the research proposal stage. In cases where the proposed research is likely to generate data outputs that will hold significant value as a resource for the wider research community, applicants will be required to submit a data management and sharing plan to the Wellcome Trust prior to an award being made." (Policy on data management and sharing)
Even where such demands are not made, it is likely that funders will respond more positively to applications which have clear plans for managing, preserving and sharing their data.
The steps you apply to manage your data are also likely to affect the costing of your project. As well as the cost of staff and other resources, you need to think about your short and long-term storage and back-up requirements for the safekeeping of the data you create during your research work. There are likely to be both funder and institutional requirements related to this.
There are many decisions to make about managing your data before you even start creating/collecting it, including choosing hardware and software, and addressing issues with intellectual property rights and ethics. Decisions made at the beginning will affect how you can access, use, or preserve your data in the future.
Research data can exist in many forms, dependent on research area / discipline, including:
In planning a research project, it’s important that you consider which file formats you will use to store your data. In some cases, this will be dictated by the software you’re using or the conventions of your discipline, but in other cases you may have to make a choice between several options:
Popular formats such as those produced by Microsoft Office products (e.g. Word documents or Excel spreadsheets) are likely to have reasonable longevity, but be aware that they are proprietary (owned by someone) and so will not necessarily exist forever or remain easily readable. You may be better off storing important information in open, non-proprietary formats – for example, PDF/A rather than Microsoft Word, CSV rather than Excel, TIFF rather than Photoshop files, or as XML rather than a database.
Some image formats are better for particular purposes than others. For example, TIFFs preserve digital image information well, but most internet browsers cannot display them and they take up a lot of computer storage space. Taking this into consideration, TIFF image files make suitable master copies for archival purposes, particularly if the image content is important. For smaller images which are to be used for web delivery and for embedding in documents, JPEG format is suitable. JPEGs use 'lossy' compression, which keeps the files from being too large. Each time a JPEG image is edited and re-saved, it is compressed again and loses some of its information, so over time the image quality degrades. For this reason, JPEGs are not considered suitable for archival purposes.
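As a hedged illustration of storing tabular data in an open, non-proprietary format, the following Python sketch writes a small table to CSV text using only the standard library and then reads it back. The sample records, column names and function name are hypothetical and stand in for your own data.

```python
import csv
import io

# Hypothetical sample records; in practice these would come from your own dataset.
rows = [
    {"sample_id": "S001", "collected": "2017-10-02", "ph": 7.1},
    {"sample_id": "S002", "collected": "2017-10-03", "ph": 6.8},
]


def to_csv_text(records):
    """Serialise a list of dictionaries to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=list(records[0].keys()),
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


csv_text = to_csv_text(rows)

# Reading the text back shows that the content survives the round trip.
# CSV stores every value as text, so numeric types must be restored on load.
restored = list(csv.DictReader(io.StringIO(csv_text)))
```

To write to disk rather than to memory, the same writer can be given a file opened with `open(path, "w", newline="", encoding="utf-8")`. Because CSV holds plain text only, it is worth recording units and data types in accompanying documentation.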
The link below directs to the Digital Preservation Coalition's Handbook, which provides useful information on all aspects of digital preservation.
Once you create, gather, or start manipulating data and files, they can quickly become disorganised. To save time and prevent errors later on, you and your colleagues should decide how you will name and structure files and folders. Including documentation (or 'metadata') will allow you to add context to your data so that you and others can understand it in the short, medium, and long term. Good metadata should be both computer and human-readable.
Agreeing on a logical and consistent naming convention at the beginning of your project will make it easier to find and correctly identify your files, prevent version control problems when working on files collaboratively, and generally prevent errors in research. Organising your files carefully will save you time and frustration and prevent duplication or errors by helping you and your colleagues find what you need when you need it.
It is useful if your department/project agrees on the following elements of a file name:
Very few documents are drafted by one person in one sitting. More often there will be several people involved in the process and it will occur over an extended period of time. Without proper controls this can quickly lead to confusion as to which version is the most recent. Here is a suggestion of one way to avoid this happening:
Use a 'revision' numbering system. Any major changes to a file can be indicated by whole numbers, for example, v1 would be the first version, v2 the second version. Minor changes can be indicated by increasing the decimal figure for example, v1.01 indicates a minor change has been made to the first version, and v3.01 a minor change has been made to the third version.
When draft documents are sent out for amendment, they should carry additional information when returned to identify the individual who has made the amendments. Example: a file with the name 20100816_dataman_v1_sj indicates that a colleague (sj) has made amendments to the first version on the 16th August 2010. The lead author would then add those amendments to version v1 and rename the file following the revision numbering system.
Include a 'version control table' in each important document, noting changes and their dates alongside the appropriate version number of the document. If helpful, you can include the file names themselves along with (or instead of) the version number.
Agree who will finalise documents, marking them as 'final.'
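The naming and revision-numbering suggestions above could be sketched in Python as follows. The exact pattern (date, short name, version, optional reviewer initials, separated by underscores) is an assumption based on the example file name 20100816_dataman_v1_sj. The function names are hypothetical, and your project may agree a different convention.

```python
import re
from datetime import date

# Assumed pattern: YYYYMMDD_name_vMAJOR[.MINOR][_initials]
FILENAME_PATTERN = re.compile(
    r"^(?P<date>\d{8})_(?P<name>[a-z0-9]+)_v(?P<major>\d+)"
    r"(?:\.(?P<minor>\d{2}))?(?:_(?P<initials>[a-z]+))?$"
)


def build_name(day, name, major, minor=0, initials=None):
    """Build a file name stem such as 20100816_dataman_v1_sj."""
    version = f"v{major}" if minor == 0 else f"v{major}.{minor:02d}"
    parts = [day.strftime("%Y%m%d"), name, version]
    if initials:
        parts.append(initials)
    return "_".join(parts)


def parse_name(stem):
    """Split a file name stem into its agreed elements, or raise ValueError."""
    match = FILENAME_PATTERN.match(stem)
    if match is None:
        raise ValueError(f"File name does not follow the convention: {stem}")
    fields = match.groupdict()
    raw = fields["date"]
    return {
        "date": date(int(raw[:4]), int(raw[4:6]), int(raw[6:])),
        "name": fields["name"],
        "major": int(fields["major"]),
        "minor": int(fields["minor"] or 0),
        "initials": fields["initials"],
    }


# A reviewer (sj) returns amendments to version 1.
returned = build_name(date(2010, 8, 16), "dataman", 1, initials="sj")
# The lead author merges them and records a minor revision.
merged = build_name(date(2010, 8, 17), "dataman", 1, minor=1)
```

Checking names with a parser like this makes it easier to spot files that drift from the agreed convention before they cause version confusion.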
Many researchers fear that by sharing their data they will lose their competitive edge, that others will misinterpret or misuse their data or that their research methods will be open to scrutiny. However, there are also benefits to be gained through sharing your data. For example, it:
If you plan for data sharing from the beginning of your project, you can decide on a method of providing access that you are comfortable with.
Issues of intellectual property rights, commercial potential or of privacy can all affect whether you can or should share your data.
Sensitive and confidential data can, however, often be shared ethically if informed consent for data sharing has been given, subjects' identities are anonymised (if needed) or consideration is given to access restrictions.
These measures should be planned from the beginning of your research to ensure that you are not limiting future opportunities to share your data.
The UK Data Archive has an excellent guide on consent, confidentiality and ethics as part of their Managing and Sharing Data guide, and they provide brief guidance and tool recommendations for Anonymisation.
Please note: The University does not authorise or approve the use of Dropbox. It should never be used for confidential, personal or sensitive data.
The Centralised Research Data Storage and Collaboration (CRDSC) is a University-hosted web platform that provides a central storage and collaboration space for documents, information and ideas. For example, a CRDSC site can help you:
Check out the IT Support tab above, or contact IT Services via the Centralised Research Data Storage and Collaboration website for advice on using the service to safely collaborate with external partners.
Your data are only truly open if other people can access them, understand them and reuse them. It is recognised good research practice by researchers and institutions to manage and retain data, fulfilling any legal requirements that may exist following the conclusion of research projects. This requires active preservation to ensure that the files continue to be readable over the long term, making this an important feature of the research data lifecycle. You should ensure that the repository you choose has active preservation procedures for digital data curation. The Digital Curation Centre highlight data preservation as a key aspect to consider when planning a new research project, particularly with data that are unique and irreplaceable if destroyed or lost. Without the ability to refer to verifiable data, your research may not be judged as sound.
There are a growing number of Digital Repositories and data centres with varying content types (e.g. articles, data sets, images, etc) and disciplinary foci. The majority of them share data openly with the public, or the research community.
OpenDOAR (Directory of Open Access Repositories) maintains an online list of open access digital repositories, and has a content search tool.
re3data.org is the Registry of Research Data Repositories, providing a global registry of data repositories from different academic disciplines, and its use is particularly recommended in the European Commission’s “Guidelines on Open Access to Scientific Publications and Research Data in Horizon 2020”.
Online stores of discipline or subject-specific data ('data centres') abound, but there is currently no definitive list of these.
Some examples of popular data centres include:
Raise the impact of your research. Digital repositories allow you to make data easily accessible to more people than ever before. The more people who can use your data, the more public good it can do and the more it can do to enrich your field of research. Open online access makes new collaborations and uses of data possible. In some areas (e.g. archaeological excavation data), the data are often unique and many researchers feel a moral obligation to make them available to others (and, of course, to ensure their long-term preservation).
Raise your research profile. The more other researchers cite your data, the more they will know and admire your work. As the trend toward online open access rises, the prestige associated with data citations is growing. In addition, making some data available can increase the credibility of your analyses.
Keep your data safe and readable in the long term. Many researchers hold on to an old computer from a decade or two ago because it is the only way to access their old files, created in formats that are now obsolete. Once these computers break, the files are essentially lost. Many repositories store and back up your treasured research products and will, if appropriate file formats are used, attempt to move the data into new file formats as the original formats become obsolete. So long as the repository exists, your materials will remain readable and usable.
Your funder may require it. This is increasingly common. You can find a summary of funders’ open access requirements using the SHERPA/JULIET database. Even if your funder does not require that you deposit your data, a plan to deposit your data may strengthen your bid.
If I published my paper/data in a peer-reviewed journal, can I still deposit it in an open digital repository?
This depends on the journal (especially for papers), but the majority do allow it. Contact your journal for more information, or find summary information on journals’ copyright policies using the SHERPA/RoMEO database.
You can also ask your repository support team for help with this. Coventry University's Research and Scholarly Publications Team is happy to help you find an answer, regardless of your target repository.
Choosing what to keep and what can be disposed of or deleted is always going to involve a subjective judgement, as nobody knows exactly what information is going to be wanted in the future.
All we can do is think the matter through carefully, abide by the policies we need to (e.g. from funders) and document decisions made and the reasons for them. It won’t be a perfect process, but should at least be a sensible one.
There are some good reasons why selection is worth doing:
The following questions, based on material devised by the Digital Curation Centre, can help you decide what you should keep and what can be deleted:
Once you've sorted through your files and asked these questions you then need to:
To ensure that you understand your own data and that others may find, use and properly cite your data, it helps to add 'documentation' or 'metadata' (data about data) to the documents and datasets you create. This encompasses all the information necessary to interpret, understand and use a given dataset or set of documents.
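One way to keep documentation that is both human and computer-readable is a small metadata file stored alongside the dataset. The Python sketch below builds such a record as JSON. The field names, the example values and the function name are illustrative assumptions and do not follow any particular metadata standard.

```python
import json


def make_metadata(title, creator, description, created, file_format, licence):
    """Collect descriptive details about a dataset into a dictionary."""
    return {
        "title": title,
        "creator": creator,
        "description": description,
        "created": created,          # ISO 8601 date, e.g. 2017-10-02
        "file_format": file_format,
        "licence": licence,
    }


# Hypothetical example values for a small survey dataset.
record = make_metadata(
    title="Example survey responses",
    creator="A. Researcher",
    description="Anonymised responses to a ten-question survey.",
    created="2017-10-02",
    file_format="CSV",
    licence="CC BY 4.0",
)

# Indented, sorted JSON stays easy for people to read and for software to parse.
metadata_json = json.dumps(record, indent=2, sort_keys=True)
```

The resulting text could be saved as, for example, a file sitting next to the data file it describes, and updated as the project progresses.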
It is good practice to begin to document your data at the very beginning of your research project and continue to add information as the project progresses. Include procedures for documentation in your data planning. There are a number of ways you can add documentation to your data:
Information about a file or dataset can be included within the data or document itself. For digital data sets, this means that the documentation can sit in separate files (for example text files) or be integrated into the data file(s), as a header or at specified locations in the file. Examples include:
This is information in separate files that accompanies data in order to provide context, explanation, or instructions on confidentiality and data use or reuse. Examples include:
This is structured information which can be used to identify and locate the data that meet the user's requirements via a web browser or web based catalogue. Catalogue metadata is usually structured according to an international standard and associated with the data by repositories or data centres when materials are deposited with them. Examples are:
A Digital Object Identifier (DOI) is a string of alphanumeric characters which gives outputs such as articles, conference papers, datasets and book chapters a unique online identity. DOIs are usually found on the publication itself or on the publisher's webpage, and this unique string of characters provides a stable, persistent identification for the lifetime of the output. Even when the content of an object is updated or its web address changes, the DOI record is updated but the DOI link itself remains the same. Because a DOI is a stable point of reference for an output, the link does not change once created and should continue to resolve.
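To illustrate how a DOI acts as a persistent link, the sketch below turns a DOI into a resolver address by prefixing it with https://doi.org/. The validation pattern is a deliberately simplified assumption rather than the full DOI syntax, the function name is hypothetical, and the example uses the DOI of the FAIR principles paper (Wilkinson et al. 2016).

```python
import re

# Simplified check: DOIs start with "10.", a registrant code, a slash and a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# Common ways a DOI is written that should be reduced to the bare identifier.
KNOWN_PREFIXES = ("https://doi.org/", "http://dx.doi.org/", "doi:")


def doi_to_url(doi):
    """Return the https://doi.org/ link for a DOI, or raise ValueError."""
    cleaned = doi.strip()
    for prefix in KNOWN_PREFIXES:
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    if not DOI_PATTERN.match(cleaned):
        raise ValueError(f"Not a recognisable DOI: {doi!r}")
    return "https://doi.org/" + cleaned


fair_paper_url = doi_to_url("doi:10.1038/sdata.2016.18")
```

Citing the resolver link rather than a publisher's page address means the citation keeps working if the publisher reorganises its website.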
If you are wondering whether you need a DOI, consider that adding a persistent identifier to your outputs will help increase the reach and impact of your work. DOIs identify specific objects and works accurately, which helps connect outputs to their creators / authors and to associated metadata and documentation.
The Research and Scholarly Publications Team help and encourage researchers to share and register data, and to make them easier to locate and cite online, by obtaining a DOI. DOIs can be assigned to reports, theses, datasets, online toolkits, and working / technical papers.
Please ask for further guidance by contacting the Team.
Coventry University Group is committed to processing personal/sensitive personal data in accordance with:
GDPR places greater emphasis on compliance with data protection legislation by organisations acting as a Data Controller, that is, the organisation that determines the purposes for which and the manner in which personal data are collected and processed (e.g. Coventry University is a Data Controller if your research is in connection with the University). The new data protection regime also provides new and enhanced rights for individuals, such as the Right to Data Portability, although not all rights will apply where personal data are processed for a research project.
Please contact the University’s Information Governance Unit (IGU), which oversees governance and compliance on data protection and privacy matters. Contact the IGU as early as possible in your project if you require advice or assistance on any data protection queries.
Information Governance Unit - firstname.lastname@example.org
Further information can also be found on the Reporting Information Security Incidents or Data Breaches page, along with other Coventry University Group policies.
JISC Research Data Management Toolkit - Data protection regulation
Funders expect data plans to cover how data will be collected or created, managed, shared and preserved. Plans should include information on
UK Research and Innovation (UKRI) have set expectations on the routine management and sharing of research data, known as the Common Principles. These Common Principles provide a framework for the individual Research Council policies on data.
The Digital Curation Centre (DCC) provide an overview of the coverage of individual funders' policies for publication and data, and the support that they provide for researchers. Full details are available directly from the individual funders' pages:
Arts and Humanities Research Council: Data Management Plan - Text for Funding Guide
Biotechnology and Biological Sciences Research Council: Data Management Plan Application Guidance
British Heart Foundation: How to apply for a research grant
Cancer Research: Practical guidance for researchers on writing data sharing plans
European Commission Horizon 2020: Data Management, Guidelines on FAIR Data Management in Horizon 2020
MRC: Data Sharing
Wellcome Trust: How to complete an outputs management plan
A good DMP will usually cover the following themes, but will vary in exact details and requirements dependent on the funder that is being applied to:
Much of data management is simply good research practice that you will be doing already. Data plans are just a way of articulating or evidencing that you've thought about how to create, store, back up, share and preserve your data. The DCC has produced an interactive online tool to help researchers create data management plans: DMPonline. The website has a record of major UK/European funder requirements, so it can also tailor the template to your particular funder.
F.A.I.R. Data Principles are a set of principles to guide researchers in making their research data findable, accessible, interoperable and reusable (Wilkinson et al. 2016), directing data producers and publishers to promote maximum use of research data. The Principles also highlight the importance of data being machine-readable, as humans increasingly rely on computers to search for and handle growing volumes of complex data.
Following the FAIR Principles is likely to be seen as good research practice by research funders, and is particularly relevant to beneficiaries of Horizon 2020 funding. Data Management Plans for European Commission projects must address how datasets will be created, whether these data can be made accessible and how they will be curated, stored and preserved. Further details can be found in the following documents:
GO FAIR expand on the granular details of the F.A.I.R. Principles (CC-BY 4.0):
Findable. The first step in (re)using data is to find them. Metadata and data should be easy to find for both humans and computers. Machine-readable metadata are essential for automatic discovery of datasets and services, so this is an essential component of the FAIRification process.
Accessible. Once the user finds the required data, they need to know how the data can be accessed, possibly including authentication and authorisation.
Interoperable. The data usually need to be integrated with other data. In addition, the data need to interoperate with applications or workflows for analysis, storage, and processing.
Reusable. The ultimate goal of FAIR is to optimise the reuse of data. To achieve this, metadata and data should be well-described so that they can be replicated and/or combined in different settings.
The principles refer to three types of entities: data (or any digital object), metadata (information about that digital object), and infrastructure. For instance, principle F4 defines that both metadata and data are registered or indexed in a searchable resource (the infrastructure component).
Coventry University Information Technology Service has processes in place to ensure: the completeness and accuracy of the backed up data; backup copies of data are stored in a secure manner; data can be restored from a required backup within a reasonable time from authorisation from a data custodian; control of media rotation; secure storage; copies of data are taken on schedule; the secure disposal of data and magnetic media when they are no longer required.
Many departments and research groups provide networked storage and SharePoint spaces for collaborative work. File size limitations, back-up capabilities, and remote access can vary. Further details are available via the File Storage & Sharing page in the Digital Services Catalogue. Find this page under the Communication and Collaboration menu.
Portable storage media such as CDs, DVDs and memory sticks (also known as USB sticks, flash drives, thumb drives or memory keys) are riskier because they are vulnerable to loss and damage. It is important not to rely on them as your only copy of important data.
They are very convenient though, and useful for:
Data sticks or CD/DVDs must be encrypted before posting them to external collaborators.
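When data are copied to or from portable media, one common way to confirm that a copy is complete and undamaged is to compare checksums of the original and the copy. The Python sketch below is an illustration under stated assumptions rather than a University procedure: the file names and function names are hypothetical, and it uses SHA-256 from the standard library. It does not encrypt anything, so it complements rather than replaces encryption.

```python
import hashlib
import os
import shutil
import tempfile


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 checksum of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(original, copy):
    """True when both files have identical contents according to their checksums."""
    return sha256_of(original) == sha256_of(copy)


# Demonstration in a temporary folder with a hypothetical data file.
with tempfile.TemporaryDirectory() as workdir:
    original = os.path.join(workdir, "results.csv")
    backup = os.path.join(workdir, "results_backup.csv")
    with open(original, "w", encoding="utf-8") as handle:
        handle.write("sample_id,ph\nS001,7.1\n")

    shutil.copyfile(original, backup)
    intact = copies_match(original, backup)

    # Simulate damage to the copy and check again.
    with open(backup, "a", encoding="utf-8") as handle:
        handle.write("corrupted")
    damaged = copies_match(original, backup)
```

Recording the checksum alongside the data also lets a collaborator confirm that the files they receive are the ones that were sent.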
Digital Services host a centralised storage facility for research data in on-going research projects, Centralised Research Data Storage and Collaboration (CRDSC). Storage is mirrored to a secure ISO27001 off-campus site and subject to a Disaster Recovery Plan with secure nightly backups taken. CRDSC project spaces provide the following features for researchers and their collaborators:
For further details and to request a project space, navigate to the CRDSC pages.
You may be creating and collecting entirely new data for your project, but you can often draw on a wealth of data already available to complement or enrich your own research. Given the proper attention to Intellectual Property Rights (IPR), Data Protection and ethics, you may be able to process existing raw data to create entirely new research outputs.
There are many reasons why you may wish to reuse data:
When reusing existing data, you must ensure that you use the data in the way that the data owner has specified. Creative Commons licenses give authors and creators a clear, standardised way of granting permission for their works to be reused. Those interested in reusing the data then follow the license conditions.
The following table visually explains Creative Commons Licenses and their permissions.
The following links will help you locate data repositories that may be of use to you.
Intellectual Property Rights (IPR) (e.g. copyright, patents, etc) affect the way both you and others can use your research outputs.
Failure to clarify rights at the start of the research process can lead to unexpected limitations to:
It can also cause you legal trouble.
Further information on IPR can be found on the University's IPR webpages
Frequently Asked Questions
Are research data or data derivatives protected by copyright law?
Copyright law sometimes protects data and other research products (provided that you share them with the proper copyright statement or end-user agreement), but it depends on the nature of your data or files.
The University has a Copyright webpage, which provides information and contacts for who to consult on copyright questions in various situations (e.g. research grants and funding, commercialisation and intellectual property, etc).
A seminar was held in 2011 (hosted by CRASSH and the Incremental Project). Andrew Charlesworth (Centre for IT & Law, University of Bristol) gave a presentation that addressed some copyright issues:
'Intellectual Property Rights and Research Data - Focus on copyright' [32 mins 6 secs].
He also participated in a short interview on the same subject [2 mins 34 secs].
What are my intellectual property rights with regard to research data at Coventry University?
This depends on whether you are a student, post-doc or PI/project director, and on your relationship with the University, your role in the project, and your agreements with other parties (funders, study participants, corporate partners, etc). Advice can be sought via the University's webpages and by contacting the colleagues listed below, under the 'Who can help me with Copyright and IPR?' question.
Can I use materials that I find online?
It depends on how those materials are licensed. IPR is usually in play, even if you don't see a "©" or 'all rights reserved' notice. When in doubt, contact the University Copyright Officer (contact information in FAQ below) for advice, or ask the website administrator or publisher who distributed the content for permission directly.
How can I make it easier for others to re-use the materials that I produce?
One relatively simple way to make it easier for others to re-use tools, data, or other content that you produce is to add a Creative Commons license.
For example, ‘By-Attribution, Non-Commercial’ is a common Creative Commons license: when you mark your file, image, or information with this, it means that anyone can use your information in any way they like, so long as they attribute it to you and don’t use it for commercial purposes. Creative Commons licenses are often used for materials released online, but you can also include these in printed materials if you don't have a publisher who owns the rights. For additional information and Creative Commons license options, visit www.creativecommons.org.
To license something with a Creative Commons license, you don't need to file any paperwork -- just publish (in print or on the web) your materials along with a notification that you are using a particular license.
IMPORTANT NOTE: Creative Commons licenses are 'irrevocable', so don't add a Creative Commons license unless you are sure that (1) you have the right to publish this information, and (2) you won't want to revoke it later on for any reason.
Who can help me with Copyright and IPR?
For information on Copyright, please contact:
University Librarian and Group Director of Learning Resources, LIB
Telephone: 024 7688 7519
For general questions on IPR or to discuss Intellectual Property Disclosure Forms, contact:
Business Development Support Office
Mobile: 07974 98 4387
Director of IP Services
Mobile: 07974 98 4928
For questions touching on commercialisation, contact:
IPR Commercialisation Manager
Mobile: 07557 42 5047
What rights do other people have to request my work - i.e. Freedom of Information Act (FOI)?
The Freedom of Information Act of 2000 (FOIA) gives all members of the public the right to request any information produced with public money, but there are some exemptions.
For information about FOI at Coventry, see the CU FOI page.
Web2Rights IPR & Legal Issues Toolkit: information on intellectual property rights pertaining to Web 2.0 internet resources.
Alex Ball has created a presentation for the Digital Curation Centre/University of Bath on Derestricting Datasets: How to License Research Data
As members of a publicly-funded university, you may receive requests for information under the Freedom of Information Act 2000 (FOI) or Environmental Information Regulations 2004 (EIR).
Once the University has received an information request, it has 20 working days to respond to an FOI request and up to 40 for an EIR request. Both FOI and EIR include a number of exemptions and exceptions respectively against disclosure. This is because the legislation recognises that not all official information ought to be disclosed, for example in order to protect confidential, sensitive or personal information. If you are unsure about disclosure, consult the University's FOI officer email@example.com.
Article Processing Charge (APC) - Fee which may be payable to the publisher to publish via the gold open access route. When an article is published in a traditional subscription journal, the author pays an APC to make their individual article freely available from the journal website, without restriction or charge to the reader.
CC-BY Licence - Creative Commons Attribution Licence. This is the most liberal of the CC licences. As long as the original author(s) receive attribution, this allows anyone to copy, distribute or transmit the research, adapt the research and make commercial use of the research. RCUK requires that this licence be used if the gold open access route is selected. The Wellcome Trust encourages its use, and will cover the costs of any APC where an article is published under this licence.
Corresponding Author - The author responsible for manuscript correction, correspondence during submission, handling of revisions and re-submission of the revised manuscript. On acceptance of the manuscript, the corresponding author is responsible for co-ordinating any application for payment of a Gold Open Access Article Processing Charge (APC).
Creative Commons Licences - Creative Commons licences can be used in open access publishing to help authors retain copyright while allowing others to copy, distribute, and make use of their work. There are several different Creative Commons licences, which allow different types of re-use. See the Creative Commons website.
CURVE Open - CURVE Open is the University's repository for educational resources and open access items other than research publications. The aim of this open access institutional repository is to showcase University research and teaching, increasing accessibility to, and raising the visibility of, our authors' work.
Open Access - Open access is the practice of providing free, unlimited online access to scholarly works and research outputs in a digital format, with limited restrictions on re-use. A key driver behind OA has been to make publicly-funded research accessible to tax-payers.
Research & Scholarly Publications
FL320, Lanchester Library
Frederick Lanchester Building
Coventry, United Kingdom
Telephone: 024 7765 7568
Open Access and Institutional Repository - firstname.lastname@example.org
Research Data Management - email@example.com