Why artificial intelligence design must prioritize data privacy

By: EBR - Posted: Tuesday, April 5, 2022

Artificial intelligence is integral to developments in healthcare, technology, and other sectors, but there are concerns with how data privacy is regulated.

by Einaras von Gravrock*

Data privacy is often linked with artificial intelligence (AI) models based on consumer data. Understandably, users are wary about automated technologies that obtain and use their data, which may include sensitive information. As AI models depend on data quality to deliver salient results, their continued existence hinges on privacy protection being integral to their design.

More than just a way to dispel customers’ fears and concerns, good privacy and data management practices are closely tied to a company’s core organizational values, business processes, and security management. Privacy issues have been extensively studied and publicized, and data from our privacy perception survey indicates that privacy protection is a crucial concern for consumers.

Addressing these concerns contextually is crucial, and for companies operating with consumer-facing AI, there are several methods and techniques that help solve privacy concerns often linked to artificial intelligence.

Some products and services need data, but they don’t need to invade anyone’s privacy

Companies working with artificial intelligence are already at a disadvantage in the public eye when it comes to privacy. A 2020 survey by the European Consumer Organization found that 45-60% of Europeans agree that AI will lead to more abuse of personal data.

There are many popular online services and products that rely on large datasets to teach and improve their AI algorithms. Some of the data in those datasets might be considered private even by the least privacy-conscious users. Streams of data from networks, social media pages, mobile phones, and other devices contribute to the volume of information that businesses use to train machine learning systems. Thanks to overreaching personal data use and mismanagement by some companies, privacy protection is becoming a public policy issue around the world.

Much of our sensitive data is gathered to improve AI-enabled processes. Much of this data analysis is also driven by machine learning adoption, as sophisticated algorithms need to make decisions in real time based on those datasets. Search algorithms, voice assistants, and recommendation engines are just a few solutions that leverage AI based on large datasets of real-world user data.

Massive databases might encompass a wide range of data, and one of the most pressing problems is that this data could be personally identifiable and sensitive. In reality, teaching algorithms to make decisions does not rely on knowing who the data relates to. Therefore, companies behind such products should focus on making their datasets private, with few, if any, ways to identify users in the source data, as well as creating measures to remove edge cases from their algorithms to avoid reverse-engineering and identification.
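As a rough illustration of removing identifiers and edge cases from source data, the sketch below drops direct identifiers and then suppresses any record whose combination of remaining attributes is too rare to hide in a crowd. The field names, the sample records, and the threshold `k` are assumptions made up for this example; a real dataset would need its own analysis of which attributes can identify someone.

```python
from collections import Counter

# Hypothetical field names, chosen only for illustration.
DIRECT_IDENTIFIERS = {"name", "email"}
QUASI_IDENTIFIERS = ("age_band", "postcode_prefix")


def strip_and_suppress(records, k=2):
    """Drop direct identifiers, then drop rows whose quasi-identifier
    combination occurs fewer than k times in the dataset."""
    stripped = [
        {key: value for key, value in row.items() if key not in DIRECT_IDENTIFIERS}
        for row in records
    ]
    combo_counts = Counter(
        tuple(row[q] for q in QUASI_IDENTIFIERS) for row in stripped
    )
    return [
        row
        for row in stripped
        if combo_counts[tuple(row[q] for q in QUASI_IDENTIFIERS)] >= k
    ]


# Made-up sample records; the third one is an "edge case" with a unique profile.
records = [
    {"name": "A", "email": "a@example.com", "age_band": "30-39", "postcode_prefix": "10", "clicks": 4},
    {"name": "B", "email": "b@example.com", "age_band": "30-39", "postcode_prefix": "10", "clicks": 7},
    {"name": "C", "email": "c@example.com", "age_band": "80-89", "postcode_prefix": "99", "clicks": 1},
]
training_rows = strip_and_suppress(records, k=2)
```

Here the two records sharing a profile survive without names or emails, while the unique third record is left out of training altogether. Suppressing rare rows costs some data, which is a trade-off each team has to weigh against the identification risk.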

The relationship between data privacy and artificial intelligence is quite nuanced. While some algorithms might unavoidably require private data, there are far more secure and less invasive ways to use it. The following methods are just some of the ways companies using private data can become part of the solution.

Designing artificial intelligence with data privacy in mind

We have talked about the issue of reverse engineering, where bad actors discover vulnerabilities in AI models and discern potentially critical information from the model’s outputs. Because of this risk, regularly changing and improving the databases and training data is vital wherever AI is exposed to such attacks.

For instance, combining conflicting datasets in the machine learning process (adversarial learning) is a good option for identifying flaws and biases in the AI algorithm’s output. There are also options for using synthetic datasets that do not contain actual personal data, though their efficacy is still in question.

Healthcare is a leader in the governance of AI and data privacy, especially in handling sensitive private data. It has also done a lot of work on consent, both for medical procedures and for the handling of patient data, where the risks are high and the requirements have been legally enforced.

As for the overall design of AI products and algorithms, de-coupling data from users via anonymization and aggregation is key for any business using user data to train their AI models.
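A minimal sketch of what aggregation can look like in practice: per-user events are collapsed into cohort-level statistics, and the user identifier is used only to count distinct people before being discarded. The event fields (`user_id`, `cohort`, `minutes`) and the `min_users` threshold are hypothetical, picked only to make the example concrete.

```python
from collections import defaultdict
from statistics import mean


def aggregate_by_cohort(events, min_users=2):
    """Collapse per-user events into per-cohort statistics.

    user_id is used only to count distinct users per cohort and never
    appears in the output. Cohorts with fewer than min_users distinct
    users are omitted, since an aggregate over one person is still
    that person's data.
    """
    values = defaultdict(list)
    users = defaultdict(set)
    for event in events:
        values[event["cohort"]].append(event["minutes"])
        users[event["cohort"]].add(event["user_id"])
    return {
        cohort: {"users": len(users[cohort]), "mean_minutes": mean(values[cohort])}
        for cohort in values
        if len(users[cohort]) >= min_users
    }


# Made-up events: the "desktop" cohort contains two events from a single user.
events = [
    {"user_id": "u1", "cohort": "mobile", "minutes": 10},
    {"user_id": "u2", "cohort": "mobile", "minutes": 20},
    {"user_id": "u3", "cohort": "desktop", "minutes": 45},
    {"user_id": "u3", "cohort": "desktop", "minutes": 15},
]
summary = aggregate_by_cohort(events)
```

With this sample, only the "mobile" cohort is reported; the "desktop" cohort is dropped because both of its events come from one person. Counting distinct users rather than rows is the design choice that matters here, as a busy single user could otherwise pass for a group.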

There are many considerations that can strengthen privacy protection in AI companies:

Privacy at the core: put privacy protection on the developer’s radar and find ways to reinforce security effectively

Anonymize and aggregate datasets, remove all personal identifiers and unique data points

Have strict control over who in the company has access to specific datasets and continuously audit how this data is accessed, as poorly controlled access has been the reason behind some data breaches in the past

More data is not always the best solution. Test your algorithms with minimized data to learn the least amount of data you need to gather and process to make your use case viable

It is essential to provide a streamlined way to eliminate personal data at the user’s request. Companies that only pseudonymize user data should then continuously retrain their models with the most up-to-date data

Leverage strong de-identification tactics, e.g., aggregated and synthetic datasets with full anonymization, non-reversible identifiers for algorithm training, auditing, and quality assurance, among others

Safeguard both the autonomy and privacy of users by rethinking ways of obtaining and using critical information from third parties – examine data sources closely and only use those that gather data with clear and informed user consent

Consider the risks: could an attack feasibly jeopardize user privacy from the outputs of your AI system?
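One of the considerations above, non-reversible identifiers, can be sketched with a keyed hash from Python's standard library. This is an illustrative assumption rather than a prescribed method: the function name `pseudonymous_id` and the sample identifiers are invented for the example, and HMAC-SHA256 is simply one widely available choice.

```python
import hashlib
import hmac
import secrets


def pseudonymous_id(user_id: str, secret_key: bytes) -> str:
    """Derive a stable token from user_id using HMAC-SHA256.

    The same user always maps to the same token under the same key,
    so records can still be joined for training, auditing, and QA.
    """
    return hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


# The key must be stored separately from the dataset and tightly access-controlled.
secret_key = secrets.token_bytes(32)

token_a = pseudonymous_id("alice@example.com", secret_key)
token_b = pseudonymous_id("alice@example.com", secret_key)
token_c = pseudonymous_id("bob@example.com", secret_key)
```

The token cannot be turned back into the original identifier by inspection, but anyone holding the key can recompute it from a candidate identifier. On its own this is pseudonymization, not full anonymization, which is why it belongs alongside the access controls and deletion procedures listed above rather than in place of them.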

What is the future of data privacy and AI?

AI systems need lots of data, and some top-rated online services and products could not work without personal data used to train their AI algorithms. Nevertheless, there are many ways to improve the acquisition, management, and use of data, including the algorithms themselves and the overall data management. Privacy-respecting AI requires privacy-respecting companies.

*Chief Executive Officer and Founder, CUJO AI
**first published in: www.weforum.org
