Bringing Privacy Regulation into an AI World, Part 6
In my last few posts, I have discussed the crucial difference between access control and use control as tools for protecting privacy in an AI context. Access control has been highly effective in managing security threats – so effective, in fact, that purely external hacks are now rare. Recent major data breaches have typically involved an insider bought by outside interests, or a compromised third party. Access control regulates who can access personal data, but not what they can do with it. Use control, on the other hand, would prevent insider-linked data leaks by regulating what can be done with information, whether the access itself is authorized or not. Use control regulates how personal data can be used, ensuring that it is analyzed or shared only for specified purposes. Implemented effectively, this offers the possibility of regulating data analytics, which has so far defeated efforts to enforce privacy principles. Use control could be the tool needed to police the new frontier of privacy and security.
Say that a privacy engineer is setting up a use control system for a pharmacy chain’s database. The pharmacy wants to use data analytics to improve its distribution system by predicting purchasing trends. The intent is to analyze aggregate customer purchase data, without using identifying information such as credit card numbers. One of the major privacy risks of data analytics systems, though, is that they are designed to link data; combining purportedly anonymized information from multiple sources can re-identify individuals. To minimize this risk, context-specific restrictions could be built in. Filters can block access to sensitive information such as health conditions: when Jean goes to the pharmacy, his purchases of grocery and stationery items could be used in data analytics, but not his prescription medications. To avoid results that discriminate based on gender, ethnicity, race or age, the database could be configured to block column-based searches on these characteristics. And to avoid results that identify a small group of people with unique characteristics, filters can block any query returning fewer than 500 people.
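To make this concrete, here is a minimal sketch of how those three filters might look as a policy check. Everything here is an illustrative assumption – the function name, the column lists, and the idea of enforcing the rules in application code rather than inside the database engine – not a description of any real product:

```python
# Hypothetical use-control filters for the pharmacy example.
# A production system would enforce these rules in the database
# engine itself; this sketch only shows the logic of the three filters.

BLOCKED_COLUMNS = {"credit_card_number", "prescription_items"}   # identifying / health data
PROTECTED_ATTRIBUTES = {"gender", "ethnicity", "race", "age"}    # no column-based searches
MIN_RESULT_SIZE = 500                                            # block small, re-identifiable groups

def check_query(selected_columns, filtered_columns, result_size):
    """Return (allowed, reason) for a query under the use-control rules."""
    if BLOCKED_COLUMNS & set(selected_columns):
        return False, "query touches blocked sensitive columns"
    if PROTECTED_ATTRIBUTES & (set(selected_columns) | set(filtered_columns)):
        return False, "column-based search on a protected attribute"
    if result_size < MIN_RESULT_SIZE:
        return False, f"result smaller than {MIN_RESULT_SIZE} people; re-identification risk"
    return True, "allowed"

# Jean's grocery and stationery purchases can feed the trend analysis...
print(check_query(["store_id", "product_category", "purchase_date"], [], 12_000))
# ...but his prescription medications cannot.
print(check_query(["store_id", "prescription_items"], [], 12_000))
```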
When it comes to privacy audits, use control offers a major benefit. Auditing compliance with privacy norms in an access control system is a considerable challenge. If someone claims that there has been unauthorized access to their personal information, the only recourse is to audit every data transaction involving their data – an immensely time-consuming task. With use control, the database is designed to enable data analytics while blocking unauthorized access and use, as the pharmacy example illustrates. To audit compliance, there is no need to look at the data; all you need to do is review the database controls to see whether they would allow the suspected unauthorized access.
Faced with a data breach in an access control system, an auditor would need to review transactional data to confirm wrongdoing. The auditor would ask: is there a trace of Jorge performing transactions A, B, and E, in that sequence? (In the case of a banking review by a securities commission, for example, this could involve millions of transactions across multiple banks.) In a use control review, the auditor would simply examine the system’s policies or rules. The query would be: does Jorge have individualized access to data, and do the rules allow him to uncover a single individual’s identity, or a certain property? If he does not need this access for his work, he should not be able to view this data. If he does need individualized access to data to complete his work, the system can be structured to limit that access to the specific datasets and fields he needs. Thus, use control can bring data use practices more in line with the legal requirements of privacy protection.
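The contrast between the two audits can be sketched in a few lines of code. The log and rule structures below are hypothetical, invented for this post; what matters is what the auditor has to read in each case:

```python
# Access-control audit: scan a (potentially enormous) transaction log
# for a suspect pattern of actions by one user.
def audit_transactions(log, user, pattern):
    """Check whether `user` performed `pattern` (e.g. ["A", "B", "E"]) in sequence."""
    actions = iter(entry["action"] for entry in log if entry["user"] == user)
    # True only if the pattern occurs as an ordered subsequence of the user's actions
    return all(step in actions for step in pattern)

# Use-control audit: read the rules, not the data.
def audit_rules(rules, user):
    """Answer the auditor's question directly from the policy."""
    user_rules = rules.get(user, {})
    return {
        "has_individualized_access": user_rules.get("individualized_access", False),
        "datasets_permitted": user_rules.get("datasets", []),
    }

rules = {"jorge": {"individualized_access": False, "datasets": ["aggregate_sales"]}}
print(audit_rules(rules, "jorge"))
# Jorge cannot single out an individual, whatever the transaction log says.
```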
Use control deserves further study and development. We need a conceptual model, including a canonical set of instructions that define a use control vocabulary – the technical language to describe and implement database controls. Ideally, such a language would be crafted with enough clarity that the same terminology could be used across the board in legal, technical, and practical contexts: by lawmakers, privacy commissioners, privacy engineers, and data protection officers.
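To make the idea tangible, here is one possible shape for such a vocabulary: a handful of canonical instructions expressed as plain data, readable by a regulator as ordinary language and by a policy engine as input. The instruction names are invented for illustration; they are not an existing standard:

```python
# A hypothetical use-control vocabulary, expressed as data. Each entry
# names one canonical instruction; an engine would compile each into a
# concrete database control, while a lawyer can read the list as written.
policy = [
    {"instruction": "PERMIT_PURPOSE", "purpose": "distribution_forecasting"},
    {"instruction": "DENY_COLUMN",    "column": "prescription_items"},
    {"instruction": "DENY_SEARCH_BY", "columns": ["gender", "ethnicity", "race", "age"]},
    {"instruction": "MIN_GROUP_SIZE", "threshold": 500},
]

for rule in policy:
    print(rule["instruction"], {k: v for k, v in rule.items() if k != "instruction"})
```

The design choice worth noting is that the policy is declarative: because the rules are data rather than code, the same artifact can serve the privacy commissioner reviewing compliance and the engineer implementing the controls.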
I believe use control systems can be the next generation of data protection. Use control has the potential to protect data currently at risk, and to ensure that AI and data processing can flourish alongside robust privacy rights. By promoting a culture of transparency in database use and design, and by defining a privacy language that can be used uniformly across all the diverse aspects of the privacy field, use control can strengthen and simplify data privacy and security.
Bringing Privacy Regulation into an AI World:
PART ONE: Do We Need to Legislate AI?
PART TWO: Is AI Compatible with Privacy Principles?
PART THREE: Big data’s big privacy leak – metadata and data lakes
PART FOUR: Access control in a big data context
PART FIVE: Moving from access control to use control
PART SIX: Implementing use control – the next generation of data protection
PART SEVEN: Why Canadian privacy enforcement needs teeth