Over the past decade, privacy has become an increasing public concern as data analytics have expanded exponentially in scope. Big data has become part of our everyday lives in ways that most people are not fully aware of and do not understand. Governments are struggling to keep up with the pace of innovation and to figure out how to regulate a big data sector that transcends national borders.
Different jurisdictions have taken different approaches to privacy regulation in the new context of big data, machine learning, and artificial intelligence (AI). The European Union is in the lead, having updated its privacy legislation, established a “digital single market” across Europe, and resourced a strong enforcement system. In the United States, privacy remains governed by a patchwork of federal and state legislation, largely sector-specific and often referencing outdated technologies. The US Federal Trade Commission is powerful and assertive in punishing corporations that fail to protect data from theft, but has rarely attempted to regulate the big data market. Canada’s principle-based privacy legislation remains relevant, but the Office of the Privacy Commissioner (OPC) acknowledged recently that “PIPEDA [the Personal Information Protection and Electronic Documents Act] falls short in its application to AI systems.” As the OPC states, AI creates new privacy risks with serious human rights implications, including automated bias and discrimination. Given the pace of technological innovation, there may not be much time left to establish a “human-centered approach to AI.”
This series will explore, from a Canadian perspective, options for effective privacy regulation in an AI context. I will discuss the following topics:
Do we need to legislate AI?
Are privacy principles compatible with AI?
Big data’s big privacy leak – metadata and data lakes
Access control in a big data context
Moving from access control to use control
Implementing use control – the next generation of data protection
Bringing Privacy Regulation into an AI World, Part 1
This seven-part series explores, from a Canadian perspective, options for effective privacy regulation in an AI context.
In recent years, artificial intelligence (AI) and big data have subtly changed many aspects of our daily lives in ways that we may not yet fully understand. As the Office of the Privacy Commissioner (OPC) of Canada states, “AI has great potential in improving public and private services, and has helped spur new advances in the medical and energy sectors among others.” The same source, however, notes that AI has created new privacy risks with serious human rights implications, including automated bias and discrimination. Most countries’ privacy legislation, including Canada’s, was not written with these technologies in mind. The privacy principles on which Canada’s privacy laws are based remain relevant, but are sometimes difficult to apply to a complex new situation. Given these difficulties, the OPC has questioned whether Canada should consider defining AI in law and creating specific rules to govern it.
I would argue, in contrast, that the technological neutrality of Canada’s privacy legislation is the reason it has aged better than laws in other jurisdictions, notably the US, that reference specific technologies. The European Union’s recently updated and exemplary privacy legislation deliberately takes a principle-based approach rather than focusing on particular technologies.
I thoroughly support the principle of technological neutrality, and I do not recommend creating specific rules for AI. Technologies are ephemeral and change rapidly; what the concept of “the cloud” meant ten years ago is very different from what exists today, and AI is evolving all the time. Creating a legal definition of AI would make privacy legislation hard to draft, and harder to adjudicate. Doing so could easily turn any court case on privacy and AI into a battle between expert witnesses offering competing interpretations.
AI adds a new element to the classic data lifecycle of collection, use, retention and disclosure: the creation of new information through linking or inference. For privacy principles to be upheld, data created by AI processes needs to be subject to the same limitations of use, retention and disclosure as the raw personal data from which it was generated. It is important to note that, conceptually, AI is not a form of data processing; rather, it is a form of collection. AI’s importance in the privacy domain lies in its impact – which is that it expands on the data collected directly from individuals.
Alex is a client of a robotics club. She provided her personal information on sign-up. The club, which has locations in different cities, offers its patrons a mobile app to locate the nearest venue; Alex signed up for the app. The club’s AI analytics systems can track the stops Alex makes en route to the club and infer her preferred stopping places: the library, a tea shop, a gas station.
The robotics club has a café and wants to know what its patrons like so it can serve them better. The club’s data processing has ascertained that Alex stops frequently at a tea shop on the way to the club; it infers that she likes tea. People share ideas and book recommendations through the app, and the club also makes recommendations to patrons. Alex has recommended Pride and Prejudice; the club’s AI infers that she would also enjoy Jane Eyre, and recommends it to her. The club’s AI system also searches her public Facebook posts to analyze her interests and recommend other books and products she might like.
AI systems go far beyond analyzing data that individuals have voluntarily provided. They frequently collect data indirectly, for example, by collecting public social media posts without individuals’ knowledge or consent. Through linking and inference, AI uses data from various sources to create new data, almost always without the consent of data subjects. This creation of knowledge is a form of data collection. If regulation can deal with privacy issues at the level of collection, it has also dealt with use, since collection is the gateway to use.
Therefore, I recommend changing the legal definition of data collection to include the creation of data through linking or inference, as well as indirect collection. Under Canada’s Personal Information Protection and Electronic Documents Act (PIPEDA), organizations may only collect personal data for purposes identified to individuals before or at the time of collection. Defining the creation of data as a form of data collection would mean that information could be created through AI analytics only for specified purposes to which data subjects have consented.
To summarize, I do not believe that it is advisable or necessary to create specific legislation to govern privacy in the context of AI systems. The creation of new information through data analytics can be governed effectively by the same principles that govern the direct collection of personal data.
Bringing Privacy Regulation into an AI World, Part 2
This seven-part series explores, from a Canadian perspective, options for effective privacy regulation in an AI context.
Many experts on privacy and artificial intelligence (AI) have questioned whether AI technologies such as machine learning, predictive analytics, and deep learning are compatible with basic privacy principles. It is not difficult to see why; while privacy is primarily concerned with restricting the collection, use, retention and sharing of personal information, AI is all about linking and analyzing massive volumes of data in order to discover new information.
The Office of the Privacy Commissioner (OPC) of Canada recently stated that, “AI presents fundamental challenges to all foundational privacy principles as formulated in PIPEDA [Canada’s Personal Information Protection and Electronic Documents Act].” The OPC notes that AI systems require large amounts of data to train and test algorithms, and that this conflicts with the principle of limiting collection of personal data.  In addition, organizations that use AI often do not know ahead of time how they will use data or what insights they will find. This certainly appears to contradict the PIPEDA principles of identifying the purposes of data collection in advance (purpose specification), and collecting, using, retaining, and sharing data only for these purposes (data minimization).
So, is it realistic to expect that AI systems respect the privacy principles of purpose specification and data minimization?
I will begin by stating clearly that I believe that people have the right to control their personal data. To abandon the principles of purpose specification and data minimization would be to allow organizations to collect, use, and share personal data for their own purposes, without individuals’ informed consent. These principles are at the core of any definition of privacy, and must be protected. Doing so in an AI context, however, will require creative new approaches to data governance.
I have two suggestions towards implementing purpose specification and data minimization in an AI context:
1. Require internal and third-party auditing
Data minimization – the restriction of data collection, use, retention and disclosure to specified purposes – can be enforced by adding regular internal auditing and third-party auditability to legal requirements.
As currently formulated, the Ten Fair Information Principles upon which PIPEDA is based do not specifically include auditing and auditability. The first principle, Accountability, should be amended to include requirements for both. Any company using AI technologies – machine learning, predictive analytics, and deep learning – should be required to perform technical audits to ensure that all data collection, retention, use, and disclosure complies with privacy principles. AI systems should be designed so that third-party auditors can perform white-box assessments to verify compliance.
2. Tie accountability to purpose of collection
The core of the concept of data minimization is that personal data should only be collected for purposes specified at the time of collection, to which data subjects have given consent. While in AI contexts, data is increasingly unstructured and more likely to be used and shared for multiple purposes, data use and disclosure can still be limited to specified purposes. Data minimization can be enforced by implementing purpose-based systems that link data to specific purposes and capture event sequences – that is, the internal uses of the data in question.
To that end, I suggest the following:
i) Canadian privacy law very clearly states that the collection, retention, use, and disclosure of personal data must be for a specified purpose. As I mentioned above, the fair information principle of accountability should be revised to require audits that demonstrate that all collection, use, retention and disclosure is tied to a specified purpose, and otherwise complies with all other fair information principles.
ii) Organizations should be required to prove and document that the sequences of events involved in data processing are tied to a specified purpose.
To continue with the example from my previous post on legislating AI:
“As part of our partnership with Aeroplan, we may share the data we collect on you with Aeroplan, including your demographic data (your age and address, for example), and the frequency of your visits to our various club locations.
Aeroplan will provide us with information about you, including your income class metrics (your approximate gross earnings per year, and the band of your gross annual earnings) and information regarding your online activities and affinities; for example, your preferred gas station brand and favourite online stores, combined with the volume of your purchases.”
AI will require new approaches to enforcing the data protection principles of data minimization and purpose specification. While AI systems have the capacity to greatly increase the scope of data collection, use, retention and sharing, they also have the capacity to track the purposes of these data processing activities. Maintaining the link between data and specified purposes is the key to enforcing privacy principles in a big data environment.
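The purpose-linking approach described above can be sketched in a few lines of Python. This is a minimal illustration, not a production design; the `PersonalRecord` class, its field names, and the purpose labels are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class PersonalRecord:
    # Each data element carries the purposes the data subject consented to,
    # plus an event log capturing the sequence of internal uses.
    subject: str
    value: str
    consented_purposes: set = field(default_factory=set)
    event_log: list = field(default_factory=list)

def use_record(record: PersonalRecord, purpose: str) -> bool:
    """Permit a use only if it matches a consented purpose; log every attempt."""
    allowed = purpose in record.consented_purposes
    record.event_log.append((purpose, "allowed" if allowed else "denied"))
    return allowed

# Example: Alex consented to "venue recommendations" but not "marketing".
rec = PersonalRecord("Alex", "stops at tea shop",
                     consented_purposes={"venue recommendations"})
print(use_record(rec, "venue recommendations"))  # True
print(use_record(rec, "marketing"))              # False
print(rec.event_log)  # auditable sequence of events, each tied to a purpose
```

Because every access attempt is logged against a stated purpose, the event log itself becomes the artifact a third-party auditor would inspect.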
This article describes police use of AI-based facial-recognition technology, discusses why it poses a problem, describes the assessment methodology, and proposes a solution.
The CBC reported on March 3 that the federal privacy watchdog in Canada and three of its provincial counterparts will jointly investigate police use of facial-recognition technology supplied by US firm Clearview AI.
Privacy Commissioner Daniel Therrien will be joined in the probe by ombudsmen from British Columbia, Alberta, and Quebec.
Meanwhile, in Ontario, the Information and Privacy Commissioner has requested that any Ontario police service using Clearview AI’s tool stop doing so.
The Privacy Commissioners have acted following media reports raising concerns that the company is collecting and using personal information without consent.
The investigation will check whether the US technology company scrapes photos from the internet without consent. “Clearview can unearth items of personal information — including a person’s name, phone number, address or occupation — based on nothing more than a photo,” reported the CBC. Clearview AI is also under scrutiny in the US, where senators are querying whether its scraping of social media images puts it in violation of online child privacy laws.
In my opinion, there are three factors that could get Clearview AI, and its Canadian clients, in hot water. Here are the issues as I see them:
The first issue: Consent. Media reports suggest that Clearview AI collects images and associated personal information from the internet without individuals’ knowledge or consent, contrary to a core requirement of Canadian privacy law.
The second issue: Not providing evidence of a Privacy Impact Assessment. A Privacy Impact Assessment is used to measure the impact of a technology or updated business process on personal privacy. Governments at all levels go through these assessments when new tools are being introduced. It’s reasonable to expect that Canadian agencies, such as police services, would go through the federal government’s own Harmonized Privacy and Security Assessment before introducing a new technology.
The third issue: Jurisdiction. Transferring data about Canadians into the United States may be a violation of citizens’ privacy, especially if the data contains personal information. Certain provinces, including British Columbia and Nova Scotia, have explicit rules about preventing personal data from going south of the border.
How will Privacy Commissioners decide if this tool is acceptable?
The tool’s impact will be assessed using the four-part test derived from R v. Oakes, which courts and legal advisors use to ascertain whether a law or program can justifiably intrude upon privacy rights. The elements of this test are necessity, proportionality, effectiveness, and minimal intrusiveness; all four requirements must be met.
Necessity: There must be a clearly defined necessity for the use of the measure, in relation to a pressing societal concern (in other words, some substantial, imminent problem that the security measure seeks to treat);
Proportionality: The measure must be carefully targeted and suitably tailored, so as to be viewed as reasonably proportionate to the privacy (or any other rights) of the individual being curtailed;
Effectiveness: The measure must be shown to be empirically effective at treating the issue, and so clearly connected to solving the problem; and
Minimal intrusiveness: The measure must be the least invasive alternative available (in other words, all other less intrusive avenues of investigation have been exhausted).
My assessment of the use of Clearview AI’s technology from the Oakes Test perspective:
Necessity: Policing agencies will have no problem proving that looking for and identifying a suspect is necessary. However …
Proportionality: Identifying all individuals, and exposing their identities to a large group of people, is by no means proportional.
Effectiveness: The tool’s massive database might be effective in catching suspects; however, its reliance on social media images means many suspects will not appear in it.
Minimal intrusiveness: Mass data capture and linking is hardly the least invasive approach available.
The federal Privacy Commissioner publishes its methodology at this link.
Are there any solutions?
Yes, AI-based solutions are available. At KI Design, we are developing a vision application that allows policing agencies to watch surveillance videos with everyone blurred out except the person for whom they have a surveillance warrant. For more information, reach out to us.
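As a simplified illustration of warrant-scoped redaction: a real system would use a computer-vision library to detect faces, but the core idea of keeping only the warranted region visible can be shown on a plain pixel grid. The `redact_frame` function and `warrant_box` parameter are invented names for this sketch.

```python
def redact_frame(frame, warrant_box):
    """Redact every pixel outside the warrant region.

    frame: 2D list of grayscale pixel values.
    warrant_box: (row_min, row_max, col_min, col_max) bounding box for the
    individual named in the surveillance warrant; everything else is zeroed.
    """
    r0, r1, c0, c1 = warrant_box
    out = []
    for r, row in enumerate(frame):
        new_row = []
        for c, px in enumerate(row):
            inside = r0 <= r <= r1 and c0 <= c <= c1
            new_row.append(px if inside else 0)  # 0 = fully redacted pixel
        out.append(new_row)
    return out

frame = [[9, 9, 9], [9, 5, 9], [9, 9, 9]]
# Only the centre pixel falls under the warrant; all others are redacted.
print(redact_frame(frame, (1, 1, 1, 1)))  # [[0, 0, 0], [0, 5, 0], [0, 0, 0]]
```

In a deployed system, the warrant box would be produced per frame by a face-tracking model rather than supplied by hand.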
Sentiment Analysis for Diseases and Pandemics
COVID-19 is a global concern. It has caused over 2,500 deaths to date, and the number of cases continues to climb. Canada has issued a level 3 travel advisory (avoid non-essential travel) for several countries, including China.
Thousands of health professionals continue to face the risk of infection. Some countries are responding with measures ranging from school and factory closures to outright travel bans. This is causing anger, frustration and, more importantly, fear.
While solving the clinical challenge is the top priority, the public safety challenge should also be a concern in case the situation escalates further.
We built an AI tool to help raise awareness of the amount of negativity resulting from the coronavirus.
Global and national health organizations continue to publish guidance and statistics on the numbers of infections, cases, and deaths. While these statistics are very important, sentiment analysis goes beyond the surface.
Using artificial intelligence, KI Design can classify sentiment into seven categories: surprise, disgust, joy, anger, neutral, sadness, and fear.
Fear and anger are two strong sentiments that are often the determinants of public behaviour.
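As a rough sketch of how posts might be sorted into these sentiment categories, consider the keyword-based classifier below. The lexicon is a toy assumption for illustration; KI Design’s actual system would rely on a trained model rather than keyword matching.

```python
# Illustrative keyword lexicon; a production system would use a trained model.
EMOTION_LEXICON = {
    "fear": {"scared", "afraid", "terrified", "worried"},
    "anger": {"angry", "furious", "outraged"},
    "joy": {"happy", "relieved", "great"},
    "sadness": {"sad", "grieving", "loss"},
    "surprise": {"shocked", "unexpected"},
    "disgust": {"disgusting", "gross"},
}

def classify_emotion(post: str) -> str:
    """Return the emotion whose keywords appear most often, else 'neutral'."""
    words = set(post.lower().split())
    best, best_hits = "neutral", 0
    for emotion, keywords in EMOTION_LEXICON.items():
        hits = len(words & keywords)
        if hits > best_hits:
            best, best_hits = emotion, hits
    return best

print(classify_emotion("I am scared and worried about the outbreak"))  # fear
print(classify_emotion("Schools reopened today"))                      # neutral
```

Even this crude approach shows the key design point: posts with no emotional signal default to neutral rather than being forced into a category.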
Duration: January 1, 2019 to March 3, 2020
Number of posts: 553,216,088
Sources: Twitter 79%, forums 10%, news 6%, other 5%
KI Design has released an AI-powered tool, https://outbreaks.info, that analyzes worldwide sentiment on the coronavirus (COVID-19).
The first graph shows the overall data captured related to various communicable diseases.
The basic sentiment bar chart compares negative and positive sentiment over time; it shows an increase in relative negativity. The relative amount of fear is also increasing, as shown in the second chart.
The tool can also present information about volume which, combined with sentiment, gives an idea of the potential impact.
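Combining volume with sentiment could look like the following sketch; the `daily_signal` function, the emotion labels, and the choice of which emotions count as negative are illustrative assumptions, not the tool’s actual implementation.

```python
NEGATIVE = {"fear", "anger", "sadness", "disgust"}

def daily_signal(posts_by_day):
    """posts_by_day maps a day to a list of per-post emotion labels.

    Returns, per day, the post volume and the share of negative sentiment;
    rising volume together with a rising negative share suggests impact."""
    signal = {}
    for day, emotions in posts_by_day.items():
        volume = len(emotions)
        neg = sum(1 for e in emotions if e in NEGATIVE)
        signal[day] = (volume, round(neg / volume, 2) if volume else 0.0)
    return signal

stream = {
    "Mar 1": ["neutral", "fear", "joy", "fear"],
    "Mar 2": ["fear", "anger", "fear", "sadness", "neutral", "fear"],
}
print(daily_signal(stream))  # {'Mar 1': (4, 0.5), 'Mar 2': (6, 0.83)}
```

Tracking the two numbers separately matters: a spike in negative share on a low-volume day is far less significant than the same share on a high-volume day.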
Here are my unfiltered answers to questions surrounding AI.
Beyond the Jargon – what problems can AI solve for me?
AI can solve problems where a judgement call is needed. In essence, it is suited to contexts where an opinion or attribute needs to be determined based on patterns. A prime example is sentiment analysis. If an airport wants to learn what people think of its services, it can train an AI engine to detect sentiment. Here is an example of how AI can compare sentiment between two airports: Vancouver International Airport @yvrairport and Toronto Pearson International Airport @TorontoPearson.
Are there any successful AI implementation models?
AI solutions are not cookie-cutter; a model has to be built on a particular project’s premises. There may be some lessons learned, but models are generally not reusable.
What are a few steps that can ensure success implementing AI?
It is always important to start from the need or opportunity we are trying to address.
It is important to think of AI as a means to an end, not the other way around.
What is an Applicable AI Scenario?
Scenario: People who need to commute before flying are generally unhappy. How can we make their journey and experience better?
Step 1: Identify a problem or an opportunity. In this case, our goal is to improve user experience by identifying pain points; e.g., customers who commute to fly express negative sentiment toward the airport.
Step 2: Interview stakeholders and ask them about possible solutions. It is extremely important to ask the stakeholders involved about their assumptions. Suggestions from stakeholders could include: a parking-assist solution, better signage and public transit information, or a mobile application that provides a holistic experience.
Step 3: Use big data (social/sales/flight/traffic/weather) to understand sentiment. The public is generally vocal about its experiences in the public services domain. Your lead scientist can design an approach to assess sentiment and extrapolate the reasons for negative sentiment. This step ought to include both quantitative and qualitative data analysis.
A) Quantitative data analysis identifies the top challenges statistically; for example, unavailability of parking spaces could contribute to negative sentiment. Using quantitative analysis, a scientist can extract the top 10 reasons for negative sentiment.
B) Qualitative analysis is also needed. In this step, a human trains the AI engine to categorize the challenges travellers report; the AI can then sift through data at scale and provide more refined statistics. Using qualitative analysis, a team of analysts can break down user stories into patterns; once the patterns are defined, the AI can quickly categorize all data into them.
Step 4: Compare assumptions with data. Comparing stakeholder assumptions to data findings is key. Assumptions may remain valid despite a lack of supporting data; the gap could be due to data access or quality.
Step 5: Implement the solution and remeasure in six months or when appropriate. Sentiment may change on the order of days or weeks; once a solution is chosen and implemented, sentiment results will begin to shift.
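Step 3A can be sketched in miniature as follows. The complaint categories and keywords below are hypothetical stand-ins for what the qualitative analysis in Step 3B would actually produce.

```python
from collections import Counter

# Hypothetical complaint categories and keywords; a real project would
# derive these from the qualitative analysis of traveller stories.
CATEGORIES = {
    "parking": {"parking", "lot", "spot"},
    "signage": {"signs", "signage", "lost"},
    "transit": {"bus", "train", "shuttle"},
}

def top_challenges(negative_posts, n=10):
    """Count category hits across negative posts; return the top n."""
    counts = Counter()
    for post in negative_posts:
        words = set(post.lower().split())
        for category, keywords in CATEGORIES.items():
            if words & keywords:
                counts[category] += 1
    return counts.most_common(n)

posts = [
    "no parking spot again",
    "got lost, terrible signs",
    "parking lot was full",
    "shuttle never came",
]
print(top_challenges(posts))  # [('parking', 2), ('signage', 1), ('transit', 1)]
```

The ranked output is exactly what Step 4 compares against stakeholder assumptions: if stakeholders predicted signage but the data ranks parking first, that gap drives the discussion.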
Can AI be applied to all industries?
Indeed. Here is another example of how AI has been used to detect cell phone user sentiment in Canada among @bell, @rogers, and @telus.
Why is there so much confusion about what AI is?
Many of the people explaining AI lack technical and scientific rigour. I am told that companies with multi-billion-dollar budgets are failing at the basics.
On September 13, Dr. Waël Hassan was a panelist at the Innovation Procurement Case Study Seminar on Smart Privacy Auditing, hosted by the Mackenzie Innovation Institute (Mi2) and the Ontario Centres of Excellence (OCE). The seminar attracted leaders from the health care sector, the private information and technology industry, and privacy authorities. It explored the concept of innovative procurement via the avenue of competitive dialogue, and demonstrated the power and benefits of using artificial intelligence to automate the auditing of all personal health information (PHI) accesses within a given hospital or health network.
What are the benefits of participating in an innovative procurement process, particularly competitive dialogue?
An innovative procurement partnership between Mi2, Mackenzie Health, Michael Garron Hospital, and Markham Stouffville Hospital was supported by the OCE’s REACH grant and sought to identify an innovative approach to auditing that could be applicable to the privacy challenges faced by numerous hospitals with different practices, policies, and information systems. Rather than focus on how the solution should operate, the partners collaboratively identified six outcome-based specifications the procured audit tool would be required to meet.
By identifying key priorities and specifying the outcomes a solution should achieve, Competitive Dialogue establishes a clear and mutual understanding of expectations. This can help the private sector narrow down solution options to a model best-suited for the contracting authority’s unique context. The feedback loop provided by the iterative rounds (if used) enables vendors to clarify any confusion and customize proposals to the contracting authority’s unique needs, staff workflows, and policy contexts.
Competitive Dialogue is an opportunity for transparent communication that gives vendors the chance to learn the finer details of what the contracting authority, in this case Mackenzie Health, needs from a solution. Because hospitals are not technology or security experts, they often struggle to accurately identify and define the solutions they need, so a traditional procurement process is rarely ideal: it leaves little to no room for clarification or feedback. Competitive Dialogue is more flexible and thereby allows for more creativity and innovative thinking during initial proposal development. Encouraging creativity, and creating a competitive environment in which vendors can bounce ideas off each other, results in higher-quality proposals and final solutions.
Mackenzie Health Case Study
Mackenzie Health employs over 450 physicians and 2,600 other staff members, processes nearly 55,000 patient medical record accesses every day, and has just one privacy officer to monitor everything. Mackenzie Health’s privacy needs far outweigh its capacity, so they turned to the private sector for an innovative solution.
Section 37(1) of PHIPA outlines the possible uses of personal health information, and these guidelines are based on the purpose underlying the activities. Because the legal framework is centred on purpose, KI Design’s approach is to explain the purpose for accessing a given medical record. The core of this technology is more commonly known as an explanation-based auditing system (EBAS) designed and patented by Dr. Fabbri of Maize Analytics.
To detect unauthorized accesses, the technology identifies an intelligible connection between the patient and the employee accessing the patient’s records. AI changes the fundamental question underlying auditing tools from “Who is accessing patient records without authorization?” to “For what purpose are hospital staff accessing patient records?” Asking this question helps the technology break down staff workflows and identify common and unique purposes for accessing any given medical record; each access is then categorized as either authorized or unexplained, and unexplained accesses may be flagged as potentially unauthorized behaviour. The technology filters out authorized accesses, usually 98% to 99% of the total, so that the privacy officer can focus on the much smaller number of unexplained and flagged accesses.
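A toy version of explanation-based filtering might look like the sketch below. This is not the patented EBAS; the employee names, the appointment table, and the single explanation rule are invented for illustration (a real system draws on many clinical data sources to build explanations).

```python
# An access is "explained" if some clinical connection links the employee
# to the patient; here the only connection modelled is a scheduled appointment.
appointments = {("dr_lee", "patient_42"), ("nurse_kim", "patient_42")}

def explain_access(employee, patient, links=appointments):
    """Return an explanation string, or None if the access is unexplained."""
    if (employee, patient) in links:
        return "treatment: scheduled appointment"
    return None

def triage(access_log):
    """Filter out explained accesses; surface the rest for the privacy officer."""
    flagged = []
    for employee, patient in access_log:
        if explain_access(employee, patient) is None:
            flagged.append((employee, patient))
    return flagged

log = [("dr_lee", "patient_42"), ("clerk_99", "patient_42")]
print(triage(log))  # [('clerk_99', 'patient_42')] -- only this needs review
```

The value is in the ratio: if explanations cover 98–99% of accesses, the privacy officer’s review queue shrinks by two orders of magnitude.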
Why is the private sector interested in health care?
Health care is an extremely complex system operated by the province and service providers. The province specializes in governance and regulation, and the service providers specialize in medicine; neither are experts in privacy or security. Companies such as KI Design are interested in filling this expertise gap by working closely with health care providers and the Information & Privacy Commissioner to adapt privacy and security solutions to their working realities. There is undeniable value in having a privacy and security expert work directly with hospitals and other health service providers to help refine privacy best practices and implement a privacy tool that improves privacy and security outcomes without restricting the workflows of health practitioners.
To learn more about how AI solutions improve auditing, visit https://phipa.ca/