There are many decisions to make about managing your data before you even start creating/collecting it, including choosing hardware and software, and addressing issues with intellectual property rights and ethics. Decisions made at the beginning will affect how you can access, use, or preserve your data in the future.
Research data can exist in many forms, depending on research area or discipline, including:
(list adapted from Leeds University)
In planning a research project, it’s important that you consider which file formats you will use to store your data. In some cases, this will be dictated by the software you’re using or the conventions of your discipline, but in other cases you may have to make a choice between several options:
What formats are best for preserving files in the long term?
Popular formats such as those produced by Microsoft Office products (e.g. Word documents or Excel spreadsheets) are likely to have reasonable longevity, but be aware that they are proprietary (owned by a company) and so will not necessarily exist forever or remain easily readable. You may be better off storing important information in open, non-proprietary formats: for example, PDF/A rather than Microsoft Word, CSV rather than Excel, TIFF rather than Photoshop files, or XML rather than a database.
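As a minimal sketch of the "CSV rather than Excel" suggestion, the Python example below writes a small table to CSV text and reads it back using only the standard library. The column names and values are hypothetical sample data, and the helper names `rows_to_csv_text` and `csv_text_to_rows` are illustrative, not part of any University tool.

```python
import csv
import io


def rows_to_csv_text(rows, fieldnames):
    """Serialise a list of dicts to CSV text, an open plain-text format."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def csv_text_to_rows(text):
    """Read CSV text back into a list of dicts.

    CSV stores no type information, so every value comes back as a string.
    """
    return list(csv.DictReader(io.StringIO(text, newline="")))


# Hypothetical sample data, for illustration only.
sample_rows = [
    {"sample_id": "S001", "temperature_c": 21.5},
    {"sample_id": "S002", "temperature_c": 19.8},
]

csv_text = rows_to_csv_text(sample_rows, ["sample_id", "temperature_c"])
restored = csv_text_to_rows(csv_text)
```

Because CSV does not record data types, units, or formatting, it is worth documenting these separately alongside the file.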
What image format should I use?
Some image formats are better suited to particular purposes than others. For example, TIFFs preserve digital image information well, but most web browsers cannot display them and they take up a lot of storage space. TIFF files therefore make suitable master copies for archival purposes, particularly if the image content is important. For smaller images intended for web delivery or for embedding in documents, JPEG is suitable. JPEGs use 'lossy' compression, which keeps file sizes small. Each time a JPEG is edited and re-saved, it is compressed again and loses some of its information, so image quality degrades over time. For this reason, JPEGs are not considered suitable for archival purposes.
The link below directs to the Digital Preservation Coalition's Handbook, which provides useful information on all aspects of digital preservation.
Once you create, gather, or start manipulating data and files, they can quickly become disorganised. To save time and prevent errors later on, you and your colleagues should decide how you will name and structure files and folders. Including documentation (or 'metadata') will allow you to add context to your data so that you and others can understand it in the short, medium, and long term. Good metadata should be both computer-readable and human-readable.
Agreeing on a logical and consistent naming convention at the beginning of your project will make it easier to find and correctly identify your files, prevent version control problems when working on files collaboratively, and generally prevent errors in research. Organising your files carefully will save you time and frustration and prevent duplication or errors by helping you and your colleagues find what you need when you need it.
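Once a convention is agreed, file names can be checked against it automatically. The Python sketch below assumes a hypothetical convention of `YYYYMMDD_description_vN`, with optional reviewer initials and file extension, modelled on the example file name `20100816_dataman_v1_sj` used later in this guide. The pattern and the function name `check_filename` are illustrative assumptions, so adapt them to whatever your project agrees.

```python
import re
from datetime import datetime

# Assumed convention: date, short description, version, optional initials,
# optional extension, e.g. 20100816_dataman_v1_sj or 20100816_dataman_v1.01.csv
FILENAME_PATTERN = re.compile(
    r"^(?P<date>\d{8})"
    r"_(?P<description>[a-z0-9-]+)"
    r"_v(?P<version>\d+(?:\.\d+)?)"
    r"(?:_(?P<initials>[a-z]+))?"
    r"(?:\.(?P<ext>[a-z0-9]+))?$"
)


def check_filename(name):
    """Return True if the name follows the assumed convention."""
    match = FILENAME_PATTERN.match(name)
    if match is None:
        return False
    try:
        # Reject impossible dates such as month 13.
        datetime.strptime(match.group("date"), "%Y%m%d")
    except ValueError:
        return False
    return True


ok = check_filename("20100816_dataman_v1_sj")
bad = check_filename("final report NEW.docx")
```

Putting the date first in `YYYYMMDD` form means that an alphabetical sort of the folder is also a chronological sort.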
It is useful if your department/project agrees on the following elements of a file name:
Very few documents are drafted by one person in one sitting. More often there will be several people involved in the process and it will occur over an extended period of time. Without proper controls this can quickly lead to confusion as to which version is the most recent. Here is a suggestion of one way to avoid this happening:
Use a 'revision' numbering system. Major changes to a file can be indicated by whole numbers: for example, v1 would be the first version and v2 the second version. Minor changes can be indicated by increasing the decimal figure: for example, v1.01 indicates that a minor change has been made to the first version, and v3.01 that a minor change has been made to the third version.
When draft documents are sent out for amendment, they should carry additional information on their return to identify the individual who made the amendments. Example: a file with the name 20100816_dataman_v1_sj indicates that a colleague (sj) made amendments to the first version on 16 August 2010. The lead author would then add those amendments to version v1 and rename the file following the revision numbering system.
Include a 'version control table' in each important document, noting changes and their dates alongside the appropriate version number of the document. If helpful, you can include the file names themselves along with (or instead of) the version number.
Agree who will finalise documents, marking them as 'final.'
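The revision numbering described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than a prescribed tool: the function names are hypothetical, and the rule that a major change resets the minor figure (so v1.01 becomes v2) is an assumption, since the guidance does not spell it out.

```python
import re
from datetime import date

# Labels look like v1, v2, v1.01, v3.01 (two-digit minor figure).
VERSION_PATTERN = re.compile(r"^v(\d+)(?:\.(\d{2}))?$")


def parse_version(label):
    """Split a label such as 'v1.01' into (major, minor) integers."""
    match = VERSION_PATTERN.match(label)
    if match is None:
        raise ValueError(f"Unrecognised version label: {label!r}")
    major = int(match.group(1))
    minor = int(match.group(2)) if match.group(2) else 0
    return major, minor


def format_version(major, minor):
    """Format (1, 0) as 'v1' and (1, 1) as 'v1.01'."""
    return f"v{major}" if minor == 0 else f"v{major}.{minor:02d}"


def bump_minor(label):
    """Record a minor change: v1 -> v1.01, v1.01 -> v1.02."""
    major, minor = parse_version(label)
    if minor >= 99:
        raise ValueError("Minor figure exhausted; record a major change instead")
    return format_version(major, minor + 1)


def bump_major(label):
    """Record a major change: v1.01 -> v2 (assumed to reset the minor figure)."""
    major, _ = parse_version(label)
    return format_version(major + 1, 0)


def reviewer_copy_name(when, description, version, initials):
    """Build the name of a draft returned by a reviewer."""
    return f"{when.strftime('%Y%m%d')}_{description}_{version}_{initials}"


returned = reviewer_copy_name(date(2010, 8, 16), "dataman", "v1", "sj")
```

Here `returned` reproduces the example file name from the guidance, and the lead author could then call `bump_minor("v1")` or `bump_major("v1")` depending on the scale of the amendments.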
Many researchers fear that by sharing their data they will lose their competitive edge, that others will misinterpret or misuse their data, or that their research methods will be open to scrutiny. However, there are also benefits to be gained through sharing your data. For example, it:
If you plan for data sharing from the beginning of your project, you can decide on a method of providing access that you are comfortable with.
Issues of intellectual property rights, commercial potential or of privacy can all affect whether you can or should share your data.
Sensitive and confidential data can, however, often be shared ethically if informed consent for data sharing has been given, subjects' identities are anonymised (if needed) or consideration is given to access restrictions.
These measures should be planned from the beginning of your research to ensure that you are not limiting future opportunities to share your data.
The UK Data Archive has an excellent guide on consent, confidentiality and ethics as part of their Managing and Sharing Data guide, and they provide brief guidance and tool recommendations for anonymisation.
Please note: The University does not authorise or approve the use of Dropbox. It should never be used for confidential, personal or sensitive data.
Digital Services recommend Microsoft Teams as a secure space for data(set) and document storage, and as an online location to interact with colleagues. For example, a Teams site can help you:
Contact Digital Services for advice on using the service to safely collaborate with research partners.
Contact Us
📍 Where to find us: FL320, Lanchester Library
✉️ Email: oa.lib@coventry.ac.uk