Private Matters: The Role of Anonymization and Deidentification in Ethical Data Practices

By Arundati Dandapani, MLitt, CAIP, CIPP/C, CIPM; Founder and CEO, Generation1.ca; Professor, Humber College Institute of Technology and Advanced Learning, Canada; Certification Advisory Board Member, IAPP Global; Vice Chair, Algonquin College Marketing Research Analysis Program, School of Business, Canada.

This article highlights the critical role of privacy in shaping the skills and expertise required for excelling as top-tier data and AI leaders.

Privacy in the AI Era: The Challenge Brands Cannot Ignore

When you speak with any major brand or leading public institution focused on data today, privacy emerges as the top challenge they face. And right behind it? AI governance. According to the latest results of an International Association of Privacy Professionals (IAPP)-EY survey, data governance is rapidly evolving, with privacy at the top. Their research reveals that roughly 60% of privacy departments have already taken on additional AI governance responsibilities, underscoring how AI has turbocharged privacy concerns.

In an era dominated by AI, managing privacy is more complex than ever. As our data inventories expand and diversify with the adoption of new technologies, tools and techniques, the pressure to uphold privacy regulations and standards intensifies. Privacy isn’t just a compliance checkbox; it is a challenge every organization must navigate with precision and foresight.

Navigating the High Stakes of Personally Identifiable Information (PII) 

Personal information is at the heart of every privacy regulation, and its proper handling is crucial for maintaining trust with research participants. As organizations rely on good data leadership, the responsibility to protect respondents, and by extension client trust, becomes even more critical, especially when working with third parties.

The growing value of human data has made it important to explore anonymization and deidentification techniques among strategies for balancing data utility with privacy in both business and society. I speak more about this first in the context of multicultural immigrant research and then in a broader industry context at ESOMAR’s 2024 annual Congress in Greece.

But what happens when you anonymize or de-identify personal information? Let’s dive into some implications of these data protection practices and see what they might mean for your business’ growth, reputation and social impact.

The Current Landscape of Anonymization and Deidentification Techniques

Anonymization modifies data so it can’t be traced back to a specific person, while deidentification removes identifying features. The key difference lies in the risk of re-identification: ideally, anonymization eliminates this risk, whereas deidentified data may still carry some. This is why anonymized data does not, in theory, fall within the privacy regime. However, jurisdictions have generally defined their approaches in terms that remain ambiguous to users. Privacy reform should tackle such ambiguities with improved guidance on definitions, best practices and standards.
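To make the distinction concrete, here is a minimal Python sketch of deidentification with pseudonymization on a hypothetical tabular dataset. The column names, salt handling, and records are illustrative assumptions, not a standard drawn from any of the regulations discussed below:

```python
import hashlib

# Hypothetical records; column names are illustrative assumptions.
records = [
    {"name": "A. Smith", "email": "a.smith@example.com", "age": 34, "postal": "M5V"},
    {"name": "B. Jones", "email": "b.jones@example.com", "age": 51, "postal": "K1A"},
]

DIRECT_IDENTIFIERS = {"name", "email"}
SALT = "rotate-and-store-separately"  # a real system would manage this secret carefully

def pseudonym(value: str) -> str:
    """Replace a direct identifier with a salted one-way hash (pseudonymization)."""
    return hashlib.sha256((SALT + value).encode()).hexdigest()[:12]

def deidentify(record: dict) -> dict:
    """Strip direct identifiers, keeping a pseudonymous key for linkage.

    Note: the result is deidentified, NOT anonymized. Quasi-identifiers
    such as age plus postal code may still allow re-identification, which
    is exactly the residual risk described above.
    """
    clean = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    clean["pid"] = pseudonym(record["email"])
    return clean

print([deidentify(r) for r in records])
```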

In Canada, where I reside, PIPEDA, the federal private-sector law, defines personal information as data about an identifiable individual, and it governs all data within its regime by this definition. But contrary to what one might think, anonymized data doesn’t clearly fall under PIPEDA’s scope, because the law lacks clear standards or definitions for how to anonymize, deidentify, or pseudonymize personal information. Similar gaps exist in Alberta’s and British Columbia’s legislation. The upcoming Bill C-27 aims to establish these standards and definitions, although, as critics argue, not as clearly as its predecessor Bill C-11.

Quebec’s Bill 64, sometimes believed to be stricter than the EU’s GDPR, defines deidentified and anonymized data, yet questions remain. Can anonymization or deidentification legally replace data destruction after its intended use? Upcoming privacy reform could address these issues, though not without sparking debate about whether anonymized data should remain outside the privacy regime amid unclear “best practices”. In the US, the CCPA sets definitions for deidentified, pseudonymized and aggregated data, each a technique for reducing (but not destroying) the identifiability of individuals in a dataset. In Europe, anonymized information does not fall under the scope of the EU’s GDPR.

Use Cases for Anonymization and Deidentification

The US Census Bureau has leveraged high-fidelity synthetic data built on a linked underlying dataset that combines real-world census data with administrative tax and benefit data. Although it has been leveraging this technology since 1993, for the 2020 US Census the Bureau also incorporated ‘differential privacy,’ an advanced technique that adds controlled ‘noise’ to the dataset, allowing agencies to precisely quantify the likelihood of identifying an individual while further minimizing this risk. Synthetic data, also termed “non-disclosive” data, are privacy-preserving by design.
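For a sense of the mechanics, here is a minimal sketch of differential privacy’s Laplace mechanism applied to a counting query. It illustrates the general technique of adding calibrated noise, not the Census Bureau’s actual, far more elaborate implementation:

```python
import numpy as np

def dp_count(true_count: int, epsilon: float, rng=np.random.default_rng()) -> float:
    """Laplace mechanism for a counting query.

    A count changes by at most 1 when one person is added or removed
    (sensitivity = 1), so noise drawn from Laplace(scale = 1/epsilon)
    gives epsilon-differential privacy for this single query.
    """
    return true_count + rng.laplace(loc=0.0, scale=1.0 / epsilon)

# Smaller epsilon = more noise = stronger privacy, weaker accuracy.
print(dp_count(1_000, epsilon=0.1))   # very noisy
print(dp_count(1_000, epsilon=10.0))  # close to the true count
```

The privacy parameter epsilon makes the trade-off quantifiable: agencies can state precisely how much any one individual’s presence can shift the published statistics.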

In 2021, Health City and the Institute of Health Economics (IHE) partnered with Merck Canada, Alberta Innovates, and the University of Alberta in Canada to explore synthetic health data’s potential in improving patient outcomes without linking to individuals. The application of synthetic data to clinical trials enabled rapid, secure data sharing for AI and machine learning, potentially driving economic growth and attracting multinational investments to Alberta.
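As a toy illustration of the underlying idea (not the method the Alberta partnership used), synthetic tabular data can be generated by fitting a statistical model to real data and sampling artificial records from it:

```python
import numpy as np

rng = np.random.default_rng(seed=7)

# Stand-in for a real numeric health dataset (rows = patients;
# columns are hypothetical: age, systolic blood pressure).
real = rng.normal(loc=[60, 120], scale=[12, 15], size=(500, 2))

# Fit a multivariate Gaussian to the real data...
mean, cov = real.mean(axis=0), np.cov(real, rowvar=False)

# ...and sample synthetic "patients" that preserve aggregate structure
# (means, variances, correlations) without copying any real record.
synthetic = rng.multivariate_normal(mean, cov, size=500)

print(real.mean(axis=0), synthetic.mean(axis=0))  # similar aggregates
```

Note that naive model-fit-and-sample offers no formal privacy guarantee on its own; production systems pair synthetic generation with protections such as differential privacy.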

Other Emerging Privacy Enhancing Techniques and Technologies

Advanced anonymization techniques are disrupting how we protect personal data. Approaches like differential privacy, federated learning and synthetic data generation offer innovative ways to anonymize data while preserving its utility. The global federated learning market is projected to grow at a CAGR of 10.5% to reach roughly $311 million by 2032, nearly three times its 2023 size. AI and machine learning are playing a crucial role in enhancing deidentification processes, enabling more sophisticated and scalable solutions that maintain the balance between privacy and data usefulness.
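To show what “federated” means in practice, here is a minimal sketch of federated averaging (FedAvg) for a linear model: each client trains locally on data that never leaves its device, and only model weights are shared and averaged. The data shapes, learning rate, and simulated clients are illustrative assumptions:

```python
import numpy as np

def local_update(w, X, y, lr=0.05, steps=20):
    """Gradient steps on one client's private data; raw data stays local."""
    w = w.copy()
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)  # least-squares gradient
        w -= lr * grad
    return w

def fed_avg(w_global, clients):
    """Average locally trained weights, weighted by client dataset size."""
    total = sum(len(y) for _, y in clients)
    return sum(len(y) / total * local_update(w_global, X, y) for X, y in clients)

# Two simulated clients whose private data follow the same true model.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
clients = []
for n in (100, 300):
    X = rng.normal(size=(n, 2))
    clients.append((X, X @ true_w + rng.normal(scale=0.1, size=n)))

w = np.zeros(2)
for _ in range(10):  # communication rounds
    w = fed_avg(w, clients)
print(w)  # approaches true_w without ever pooling the raw data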

Blockchain is also a powerful tool for privacy-preserving data sharing, offering a decentralized and secure framework that could transform the way sensitive information is exchanged and managed. All these technologies are heralding new standards for privacy in the digital age. Preserving privacy empowers users. Ethical considerations must always guide such trust- and privacy-preserving activities, and must be prioritized, especially when dealing with underserved, sensitive, or vulnerable populations.

Data Privacy is Highly Interdisciplinary: Join a Global Movement to Upskill and Govern 

Data privacy demands diverse skillsets and expertise across law, regulation, technology, science, business, and ethics. As AI governance surges in priority for global data organizations, we need relevant skills to advance anonymization and deidentification techniques that support ethical decision-making for sustainable business and social impact. Be on the lookout for the 2025 wave of Generation1.ca’s Global Industry Skills Study to see how all this impacts skills of the future. Also, if you are looking for top data and AI talent, come join our Virtual Insights Career Fair and Case Competition on September 27.
