“Before data obtained from research with people can be shared with other researchers or archived, you may need to anonymise them so that individuals, organisations or businesses cannot be identified… Re-users of data have the same legal and ethical obligation to NOT disclose confidential information as primary users.” (UK Data Archive)
Procedures to anonymise data should always be considered alongside obtaining informed consent for data sharing or imposing access restrictions.
A person's identity can be disclosed from direct identifiers such as names, addresses, telephone numbers or pictures. These can be easily redacted. More problematic are disclosive data, or indirect identifiers, which could identify someone when linked to other publicly available information sources. These include information on workplace, occupation, salary, age, etc. In recent years, re-identification of subjects by combining DNA sequences with publicly available sources has gained notoriety.
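As a rough illustration of the difference between the two kinds of identifier, the minimal Python sketch below drops direct identifiers from a record and coarsens one indirect identifier (age) into a ten-year band. The field names, the sample record and the helper functions `age_band` and `anonymise` are hypothetical, chosen only for this example. Coarsening a single field does not by itself guarantee that a person cannot be re-identified.

```python
# Hypothetical field names treated as direct identifiers in this sketch.
DIRECT_IDENTIFIERS = {"name", "address", "telephone"}


def age_band(age, width=10):
    """Replace an exact age with a band such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def anonymise(record):
    """Return a new record without direct identifiers and with age coarsened."""
    # Build a copy so the original record is left untouched.
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    if "age" in cleaned:
        cleaned["age"] = age_band(cleaned["age"])
    return cleaned


# Made-up sample record, not real data.
record = {
    "name": "A. Example",
    "telephone": "0000 0000",
    "occupation": "nurse",
    "age": 34,
}
print(anonymise(record))  # {'occupation': 'nurse', 'age': '30-39'}
```

Other indirect identifiers such as occupation or workplace are left as they are here. In practice they may also need to be generalised or removed, depending on how easily they could be linked to other sources.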
Anonymising research data can be time-consuming and therefore costly. Early planning can help reduce the costs. The HKU Policy on Research Integrity describes ethical data collection and storage. In some cases the researcher may wish to store two copies of the data: the original held in a dark archive, with or without an embargo period, and the other redacted or anonymised for sharing.