The HKU Policy on the Management of Research Data and Records sets several conditions for the retention and storage of research data. Among these are:

  • Research data and records should be retained for as long as they are of continuing value to the researcher and the wider research community, and as long as specified by research funder, patent law, legislative and other regulatory requirements.  The minimum retention period for research data and records is three years after publication or public release of the work of the research.  In many instances, researchers will resolve to retain research data and records for a longer period than the minimum requirement.
  • Researchers are responsible for [..] planning for the ongoing custodianship (at the University or using third-party services) of their data after the completion of the research or, in the event of their departure or retirement from the University, reaching agreement with the head of department/faculty (or his/her nominee) as to where such data will be located and how this will be stored;

To fulfill these requirements, HKU research data can be deposited into the HKU Scholars Hub, or other repositories, some of which are described below.

What to Deposit?

The emphasis of the HKU RDM initiative is on "research integrity". Research results claimed in publications must be reproducible. Replication datasets must be preserved to enable this later reproducibility. All data, scripts, questionnaires, codebooks etc. necessary for a third party to arrive at the same research results claimed must be preserved.

As part of the data deposit, please indicate which datafiles are raw data (i.e. data that indicate the original data collection process such as questionnaires) and which are processed data (i.e. data ready for analysis in publications) – both are needed eventually, but raw data files are essential for any completion report.

Raw data may contain personal identifiers, and therefore must be stored under "restricted access". If the data contain sensitive, confidential or restricted data per the HKU Policy on Research Ethics, the researcher may choose to prepare an additional, anonymized version of the data for open access, with the approval of the relevant IRBs or ethics committees.

Essential,
  • Data Management Plan (DMP)
  • Dataset(s): quantitative and/or qualitative, raw and/or processed
  • Metadata about all the data files, including file formats (please use open formats wherever possible), codebook (i.e. description of variables), etc.
  • Readme file, giving particulars of data
  • Grant funder, name of grant, and number
  • Publication(s) if any, DOI, etc.
  • PI(s) and Co-I(s), identifiers (ORCID or Hub ID)
If data includes personal data, the data should be put under restricted access,
  • Personal data from clinical research (i.e. Institutional Review Board (IRB) approved)
    • provide the approval code, consent forms, and ethics application form when available. Please state the risk of re-identification from the different datafiles and how the risk has been minimised for any dataset intended for sharing.
  • Personal data from non-clinical research (i.e. Human Research Ethics Committee (HREC) approved)
    • provide the approval code, consent forms, and ethics application form. Please state the risk of re-identification from the different datafiles and how the risk has been minimised for any dataset intended for sharing.
If data includes interviews,
  • Interview transcripts
  • Blank questionnaire & interviewer guidelines
If field research data,
  • provide a copy of the field research notebook in digital format, preferably machine readable.
If lab research data,
  • copy of working papers and/or lab research notebooks in digital format, preferably machine readable.
If simulated data,
  • how was it generated? Please either explain or provide a link.
If other types of data, such as image or video data, or creative or design data,
  • please explain what type of data it is and how it was collected/generated.
If software is needed to read or analyze any of the datafiles,
  • please provide full details of the software name, the version needed, and any instructions necessary to obtain the software. If you have written your own script for analyzing the data, please also include this script in the final deposit.
When ready,
  • final project reports and publications
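
One way to prepare the file-level metadata asked for above (file formats, and which files are raw or processed) is to generate a manifest before depositing. The following is only a minimal sketch in Python, not a Hub requirement or Hub tool. It assumes a hypothetical folder layout in which raw data files sit in a top-level "raw" folder and everything else counts as processed; the function names and CSV columns are illustrative.

```python
import csv
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so that large datafiles are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(deposit_dir, raw_dirname="raw"):
    """Return one row per file: relative path, format, size, checksum, raw/processed."""
    root = Path(deposit_dir)
    rows = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(root)
        # Assumption: only files inside the top-level "raw" folder are raw data.
        is_raw = len(rel.parts) > 1 and rel.parts[0] == raw_dirname
        rows.append({
            "file": rel.as_posix(),
            "format": path.suffix.lstrip(".").lower() or "unknown",
            "bytes": path.stat().st_size,
            "sha256": sha256_of(path),
            "category": "raw" if is_raw else "processed",
        })
    return rows


def write_manifest(rows, out_path):
    """Write the manifest rows to a CSV file that can accompany the Readme."""
    fields = ["file", "format", "bytes", "sha256", "category"]
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fields)
        writer.writeheader()
        writer.writerows(rows)
```

For example, write_manifest(build_manifest("my_deposit"), "manifest.csv") would list every file under a hypothetical my_deposit folder. The checksum column is an addition not mentioned in this guide; it simply lets a later reader confirm that a file has not changed since deposit.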

The HKU Scholars Hub

HKU has designated the Hub for the long-term storage and preservation of HKU research data. HKU researchers and research postgraduate students may deposit datasets to the Hub. These may be replication datasets for journal articles or theses, or they may be standalone datasets produced by research projects. The Hub will accept datasets that are static and not to be changed again, i.e., after the research project has finished. However, it will accept new "versions" of datasets, while retaining the metadata item record of the previous version. The Hub can store data in a "dark archive" in perpetuity, or under an embargo period of no access for a duration of your choice.

Datasets will be linked to relevant authors, publications, patents, and grants, thereby raising visibility and the chance for discovery of all linked items. The Hub will mint a DOI from CrossRef, to make your dataset more easily citable. Files can be uploaded separately, or zipped into a single file and uploaded. Please use this page to submit a deposit if it is smaller than 5 GB. For larger deposits, please write to hub@lib.hku.hk for other options.
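
If you choose to zip your files into one, a short script can bundle them and tell you whether the result falls under the 5 GB threshold. This is only a sketch in Python under stated assumptions: the function names are hypothetical, and "5 GB" is read here as 5 × 1000³ bytes, which the Hub may or may not count in the same way.

```python
import zipfile
from pathlib import Path

# Assumption: "5 GB" is interpreted as decimal gigabytes.
HUB_FORM_LIMIT_BYTES = 5 * 1000**3


def zip_deposit(deposit_dir, zip_path):
    """Zip every file under deposit_dir into zip_path; return the archive size in bytes."""
    root = Path(deposit_dir)
    target = Path(zip_path)
    with zipfile.ZipFile(target, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(root.rglob("*")):
            # Skip the archive itself in case it is written inside deposit_dir.
            if path.is_file() and path.resolve() != target.resolve():
                archive.write(path, arcname=path.relative_to(root).as_posix())
    return target.stat().st_size


def fits_online_form(size_bytes, limit_bytes=HUB_FORM_LIMIT_BYTES):
    """True if the deposit is strictly smaller than the limit."""
    return size_bytes < limit_bytes
```

For example, fits_online_form(zip_deposit("my_deposit", "my_deposit.zip")) returning False would suggest writing to the Hub about other options rather than using the online form.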

Other Options, External

There are many options for data storage external to HKU. If your research team includes PIs/Co-Is from other institutions, those institutions may have recommendations or requirements for where to store and how to share research data from your research project. Your funder(s) and intended journal(s) for publication may have similar recommendations or requirements. The DCC has produced a guide to evaluating repositories for long-term data preservation. Some options are listed below. If you do deposit to an external repository, please also deposit the data, or the metadata only, in the HKU Scholars Hub.

  • The Registry of Research Data Repositories (re3data.org) lists 1,500 research data repositories. It can be searched by subject, content type, and country, and filtered by many more categories.
  • The Open Access Directory's list of Disciplinary Repositories.
  • The University of Minnesota Libraries webpage lists some of the more popular subject-specific repositories by subject area.
  • Figshare will allow individuals to freely deposit up to 20 GB (single file size limit of 5 GB) of data, create a collaborative work space with other members of your research team, and mint a DOI for citing your dataset.
  • Dryad welcomes data files associated with any published article in the sciences or medicine, as well as software scripts and other files important to the article.
  • Zenodo