Is AI Compatible with Privacy Principles?
This seven-part series explores, from a Canadian perspective, options for effective privacy regulation in an AI context.
Many experts on privacy and artificial intelligence (AI) have questioned whether AI technologies such as machine learning, predictive analytics, and deep learning are compatible with basic privacy principles. It is not difficult to see why; while privacy is primarily concerned with restricting the collection, use, retention and sharing of personal information, AI is all about linking and analyzing massive volumes of data in order to discover new information.
The Office of the Privacy Commissioner (OPC) of Canada recently stated that, “AI presents fundamental challenges to all foundational privacy principles as formulated in PIPEDA [Canada’s Personal Information Protection and Electronic Documents Act].” The OPC notes that AI systems require large amounts of data to train and test algorithms, and that this conflicts with the principle of limiting collection of personal data.  In addition, organizations that use AI often do not know ahead of time how they will use data or what insights they will find. This certainly appears to contradict the PIPEDA principles of identifying the purposes of data collection in advance (purpose specification), and collecting, using, retaining, and sharing data only for these purposes (data minimization).
So, is it realistic to expect that AI systems respect the privacy principles of purpose specification and data minimization?
I will begin by stating clearly that I believe that people have the right to control their personal data. To abandon the principles of purpose specification and data minimization would be to allow organizations to collect, use, and share personal data for their own purposes, without individuals’ informed consent. These principles are at the core of any definition of privacy, and must be protected. Doing so in an AI context, however, will require creative new approaches to data governance.
I have two suggestions towards implementing purpose specification and data minimization in an AI context:
- Require internal and third-party auditing
Data minimization – the restriction of data collection, use, retention and disclosure to specified purposes – can be enforced by adding to legal requirements regular internal auditing and third-party auditability.
As currently formulated, the Ten Fair Information Principles upon which PIPEDA is based do not specifically include auditing and auditability. The first principle, Accountability, should be amended to include requirements for auditing and auditability. Any company utilizing AI technologies – machine learning, predictive analytics, and deep learning – should be required to perform technical audits to ensure that all data collection, retention, use, and disclosure complies with privacy principles. AI systems should be designed in such a way that third party auditors can perform white box assessments to verify compliance.
2. Tie accountability to purpose of collection
The core of the concept of data minimization is that personal data should only be collected for purposes specified at the time of collection, to which data subjects have given consent. While in AI contexts, data is increasingly unstructured and more likely to be used and shared for multiple purposes, data use and disclosure can still be limited to specified purposes. Data minimization can be enforced by implementing purpose-based systems that link data to specific purposes and capture event sequences – that is, the internal uses of the data in question.
To that end, I suggest the following:
i) Canadian privacy law very clearly states that the collection, retention, use, and disclosure of personal data must be for a specified purpose. As I mentioned above, the fair information principle of accountability should be revised to require audits that demonstrate that all collection, use, retention and disclosure is tied to a specified purpose, and otherwise complies with all other fair information principles.
ii) Organizations should be required to prove and document that the sequences of events involved in data processing are tied to a specified purpose.
To continue with the example from my previous post on legislating AI:
“As part of our partnership with Aeroplan, we may share the data we collect on you with Aeroplan, including your demographic data (your age and address, for example), and the frequency of your visits to our various club locations.
Aeroplan will provide us with information about you, including your income class metrics (your approximate gross earnings per year, and the band of your gross annual earnings) and information regarding your online activities and affinities; for example, your preferred gas station brand and favourite online stores, combined with the volume of your purchases.”
AI will require new approaches to enforcing the data protection principles of data minimization and purpose specification. While AI systems have the capacity greatly to increase the scope of data collection, use, retention and sharing, they also have the capacity to track the purposes of these data processing activities. Maintaining the link between data and specified purposes is the key to enforcing privacy principles in a big data environment.
 Office of the Privacy Commissioner of Canada, Consultation on the OPC’s Proposals for ensuring appropriate regulation of artificial intelligence, 2020.
 Centre for Information Policy Leadership, First Report: Artificial Intelligence and Data Protection in Tension, Oct 2018, pg. 12-13. The Office of the Victorian Information Commissioner, Artificial Intelligence and Privacy, 2018.
 The Office of the Victorian Information Commissioner, Artificial Intelligence and Privacy, 2018. See blog post from lawyer, Doug Garnett, AI & Big Data Question: What happened to the distinction between primary and secondary research? Mar 22 2019.