Sensitive Research Data is information that must be safeguarded against unwarranted access or disclosure.
Sensitive data requires careful handling and protection, and is often not suitable for sharing. However, there may be ways to share sensitive data legally and ethically, such as anonymizing, aggregating, or restricting access to the data. The FRDR Controlled Access Management Program is an example of a Canadian project for sharing sensitive research data.
The Sensitive Data Toolkit helps you assess the risk level of your data and, based on that level, manage it throughout the research lifecycle. It also offers sample language for informed consent.
Sensitive Data Toolkit for Researchers Part 2: Human Participant Research Data Risk Matrix
As a university dedicated to Indigenous post-secondary education, we want to make special note of data sovereignty for Indigenous data. We recognize the First Nations Information Governance Centre’s principles of OCAP (Ownership, Control, Access and Possession), which assert that data collected from First Nations communities are owned and controlled by those First Nations. (https://fnigc.ca/ocap-training/)
The Tri-Agencies have specific language in the Research Data Management Policy relating to Indigenous data. CBU recommends that researchers follow these practices for all research by and with First Nations, Métis and Inuit communities, collectives and organizations, as well as TCPS 2, Chapter 9: Research Involving the First Nations, Inuit and Métis Peoples of Canada.
Regarding Data Management plans: “For research conducted by and with First Nations, Métis and Inuit communities, collectives and organizations, DMPs must be co-developed with these communities, collectives and organizations, in accordance with RDM principles or DMP formats that they accept. DMPs in the context of research by and with First Nations, Métis and Inuit communities, collectives and organizations should recognize Indigenous data sovereignty and include options for renegotiation of the DMP.” (https://www.science.gc.ca/eic/site/063.nsf/eng/h_97610.html )
De-identification is the process of removing or modifying information in a dataset that might be used to identify someone. By doing this, researchers can share their data without disclosing sensitive information. However, de-identification is not foolproof; re-identification always remains a possibility, so researchers need to be aware of the risks and challenges involved.
Methods of de-identification (adapted from UBC Library)
Method of de-identification | Description | Pros | Cons
---|---|---|---
Anonymization | the strictest form, in which all identifying information is removed from the dataset and cannot be restored | ensures a high level of privacy protection | may reduce the usefulness and quality of the data |
Pseudonymization | identifying information is replaced with artificial identifiers, such as codes or numbers | allows the data to be linked across different sources/datasets or over time | increases the risk of re-identification if the codes are exposed or cracked |
Aggregation | individual data points are grouped together into categories or ranges | preserves some statistical properties and patterns | reduces the level of detail and variability in the data |
Masking | identifying information is hidden or obscured by using techniques such as encryption, hashing, blurring, or noise addition | makes the data harder to read or interpret | introduces errors or distortions in the data |
Generalization | identifying information is replaced with more general or vague terms. For example, dates can be replaced with years, addresses can be replaced with regions, or names can be replaced with initials | preserves some semantic meaning and context | makes the data less specific and more ambiguous |
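To make these methods concrete, here is a minimal sketch of pseudonymization and generalization applied to toy records. All field names, values, and the salt are illustrative assumptions, not part of any real dataset or the toolkit itself:

```python
import hashlib

# Toy participant records; field names and values are purely illustrative.
records = [
    {"name": "Alice Smith", "birth_date": "1985-03-14", "city": "Sydney, NS"},
    {"name": "Bob Jones", "birth_date": "1991-07-02", "city": "Halifax, NS"},
]

SALT = "keep-this-secret"  # stored separately from the shared dataset

def pseudonymize(value: str) -> str:
    """Replace an identifier with a short salted hash (pseudonymization)."""
    return hashlib.sha256((SALT + value).encode()).hexdigest()[:12]

def deidentify(record: dict) -> dict:
    """Build a shareable row: pseudonymized ID plus generalized fields."""
    return {
        "participant_id": pseudonymize(record["name"]),  # pseudonymization
        "birth_year": record["birth_date"][:4],          # generalization: date -> year
        "region": record["city"].split(", ")[-1],        # generalization: city -> province
    }

shared = [deidentify(r) for r in records]
```

Note the trade-off the table describes: the salted codes let rows be linked across datasets, but anyone who obtains the salt could re-identify participants, so the salt must be protected or destroyed.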