"“A crucial part of ensuring that research data can be shared and reused by a wide range of researchers for a variety of purposes is by taking care that those data are accessible, understandable and (re)usable.” (Documenting Data, UK Data Service)"
Documentation is best created alongside the data, while the project is under way: it is easier to capture details as they arise than to reconstruct them from memory at a later stage. Make sure that there are strong links between your data and the associated documentation, e.g.:
- Include information within the data or document itself, e.g. in the document properties function of a file or the file header;
- Keep a database of metadata with links to files;
- Store a readme.txt file alongside the data which provides basic explanatory details;
- Record relevant context in lab notebooks or associated papers and reports;
- Link to websites or web pages which explain the context of the research.
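As an illustration of the readme.txt approach in the list above, here is a minimal Python sketch that writes a short readme next to a data file. The function name `write_readme` and the fields it records (title, creator, description) are assumptions chosen for the example, not a prescribed template; adapt them to whatever your project or discipline expects.

```python
from datetime import date
from pathlib import Path


def write_readme(data_file: Path, title: str, creator: str, description: str) -> Path:
    """Write a readme.txt in the same folder as data_file and return its path.

    The field names below are illustrative only, not a required standard.
    """
    readme = data_file.parent / "readme.txt"
    lines = [
        f"Title: {title}",
        f"Creator: {creator}",
        f"Data file: {data_file.name}",
        f"Date documented: {date.today().isoformat()}",
        "",
        "Description:",
        description,
    ]
    # Keeping the readme in the data file's own folder means the two
    # travel together when the folder is copied or shared.
    readme.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return readme
```

Calling `write_readme(Path("survey.csv"), "Pilot survey", "A. Researcher", "Responses from a pilot survey.")` would create readme.txt in the same folder as survey.csv. Note that this sketch overwrites any existing readme.txt in that folder.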
Embedded documentation
Information about a file or dataset can be included within the data or document itself. For digital data sets, this means that the documentation can sit in separate files (for example text files) or be integrated into the data file(s), as a header or at specified locations in the file. Examples of embedded documentation include:
- Code, field and label descriptions
- Descriptive headers or summaries
- Transcripts
- Information recorded in the Document Properties function of a file (Microsoft)
You can use the fields in Document Properties to add contextual information to your MS Office documents. This not only helps you keep your files organised and interpretable, but also lets you sort folders by the properties you have added and search for documents with particular properties.
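For plain-text data files, a descriptive header can be embedded directly at the top of the file. The following Python sketch shows one way this might be done for a CSV file. It assumes a convention in which header lines begin with "#"; this is a common habit but not part of any CSV standard, so other software may need to be told to skip those lines. The function names are hypothetical.

```python
import csv

COMMENT_PREFIX = "# "  # assumed convention; not part of any CSV standard


def write_with_header(path, header_notes, fieldnames, rows):
    """Write documentation lines at the top of a CSV file, then the data."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        for note in header_notes:
            f.write(f"{COMMENT_PREFIX}{note}\n")
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


def read_with_header(path):
    """Return (notes, rows), separating embedded documentation from data.

    Limitation of this sketch: any line starting with '#' is treated as
    documentation, and quoted fields containing line breaks are not supported.
    """
    notes, data_lines = [], []
    with open(path, newline="", encoding="utf-8") as f:
        for line in f:
            if line.startswith("#"):
                notes.append(line[1:].strip())
            else:
                data_lines.append(line)
    rows = list(csv.DictReader(data_lines))
    return notes, rows
```

Because the description travels inside the data file, it cannot be separated from the data by accident; the trade-off is that every tool reading the file has to know about the header lines.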
Supporting documentation
This is information in separate files that accompanies data in order to provide context, explanation, or instructions on confidentiality and data use or reuse. Examples of supporting documentation include:
- Information about the project and data creators
- Working papers or laboratory notebooks
- Blank questionnaires and interviewer guidelines
- Codebooks
- Details on how the data were created, analysed, anonymised, etc.
- Final project reports and publications
The documentation described here is meant to be read and used by people, and is sometimes called human-readable metadata. There is another kind, machine-readable metadata, which is meant to be used by computers and search engines.
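To make the contrast concrete, the Python sketch below stores a small metadata record as a JSON file next to a data file, and then searches a folder of such records by keyword. The field names ("file", "title", "keywords") and the ".json" sidecar naming are assumptions made for this example; they do not follow any particular metadata standard, and a real project would normally use the schema expected by its repository or discipline.

```python
import json
from pathlib import Path


def write_metadata(data_file: Path, metadata: dict) -> Path:
    """Save metadata as JSON beside the data file (e.g. survey.csv.json)."""
    sidecar = data_file.parent / (data_file.name + ".json")
    record = {"file": data_file.name, **metadata}
    sidecar.write_text(json.dumps(record, indent=2, sort_keys=True), encoding="utf-8")
    return sidecar


def find_by_keyword(folder: Path, keyword: str) -> list:
    """Return the names of data files whose metadata lists the keyword."""
    matches = []
    for sidecar in sorted(folder.glob("*.json")):
        record = json.loads(sidecar.read_text(encoding="utf-8"))
        # "keywords" is a hypothetical field used only in this sketch.
        if keyword in record.get("keywords", []):
            matches.append(record["file"])
    return matches
```

A record written this way can still be opened and read by a person, but its fixed structure is what allows a program to search or index it without human help.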
Laboratory notebooks
Lab notebooks, whether in print or electronic form, are a critical component of tracking and recording research. Consistent documentation of your research methods, calculations, and results is important not only for your own use, but also when you publish or otherwise share research, and when others want to reproduce what you have done.
Listed below are links to two guides. Please let us know if there are other guidelines that are used in your lab, School, or Research Institute.
Electronic Lab Notebooks
An article by Jim Giles in Nature highlighted some of the benefits of going paperless in the lab: more detailed records, easier data sharing (no hard-to-read handwriting), improved efficiency, and even the potential to extract results from data through improved methods of searching and analysing.
Reference Manager Software
- Reference management software can be used to store details of all the articles, books, and other sources you make use of in your research, and to automatically generate citations in written work.
- You can also use reference management software to store copies of articles (usually as PDFs), and to record your own notes. Some software packages offer additional features, such as the ability to annotate PDFs.
- Popular reference managers include EndNote, RefWorks, Mendeley, Zotero, and Colwiz. For EndNote, HKU has a site license, and training and support are available.