This post is part of a series on privacy-preserving federated learning. The series is a collaboration between the Responsible Technology Adoption Unit (RTA) and the US National Institute of Standards and Technology (NIST). Learn more and read all the posts published to date on our blog or at NIST’s Privacy Engineering Collaboration Space.
Introduction
This guest post features reflections from the UK’s Information Commissioner’s Office (ICO). The ICO is the UK regulator of data protection and other information rights legislation. During the UK-US PETs Prize Challenges, the ICO Innovation Hub mentored participating UK teams, assisting them with understanding how their solutions might interact with UK data protection regulations if applied to real data.
Key Considerations for Implementing Federated Learning
Privacy enhancing technologies (PETs) like federated learning can help your organisation minimise the amount of personal information you use, protect personal information and demonstrate that you are embedding data protection by design and by default.
PETs also open up unprecedented opportunities to harness the power of personal information through innovative and trustworthy applications, for example by allowing people’s personal information to be shared, linked and analysed without the parties involved having to access it directly.
But these technologies are not a silver bullet, and security and privacy risks can remain. Most PETs involve processing personal information. Your processing still needs to be lawful, fair and transparent and you must ensure that PETs you use are implemented appropriately.
The ICO’s PETs guidance sets out factors that organisations should consider when deploying these technologies. You should consider implementing PETs at the design phase of your project, particularly for projects that involve large volumes of information, and especially those involving special category information.
Three key themes emerged from our engagement with participants during the UK-US PETs Prize Challenges:
1. Data protection impact assessments (DPIAs) can help you to determine whether PETs can mitigate the risks to people
Processing activities involving artificial intelligence may require you to complete a DPIA. A DPIA will help you to identify risks to people’s personal information, determine their likelihood and severity, and consequently decide which PETs can mitigate them to an acceptable level. You can use or adapt the ICO’s sample DPIA template to carry out a DPIA, or create your own.
Federated learning does not prevent personal information from being shared through the output. However, it can be combined with output privacy approaches such as differential privacy to hide the use of people's personal information in training tasks.
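As a purely illustrative sketch, and not a recommended configuration, the Python snippet below shows one common way this combination is implemented: each client clips its local model update and adds Gaussian noise before the update leaves the client. The clip norm, noise multiplier and example values are assumptions made for illustration only.

```python
import numpy as np

def privatise_update(update, clip_norm=1.0, noise_multiplier=1.1):
    """Clip the update's L2 norm, then add Gaussian noise calibrated to that norm."""
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / (norm + 1e-12))
    noise = np.random.normal(0.0, noise_multiplier * clip_norm, size=update.shape)
    return clipped + noise

# Hypothetical client update (e.g. weight deltas after local training)
raw_update = np.array([0.8, -0.3, 1.7, 0.05])
shared_update = privatise_update(raw_update)  # only this noised update is shared
```

The strength of any privacy guarantee depends on how the clipping, noise scale and number of training rounds are accounted for; in practice this accounting is normally handled by an established differential privacy library rather than hand-rolled code.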
You should assess the risk that federated learning indirectly exposes identifiable information used for local training of the machine learning model, as well as during the training process between the participating parties. For example, you should assess the risk of an attacker:
- carrying out model inversion of the model updates, or other attacks such as membership inference;
- observing the patterns that those models have identified (known as ‘gradients’), as illustrated in the sketch after this list;
- observing model changes over time;
- observing specific model updates (i.e. a single client update); or
- manipulating the model.
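To make the risk of observing gradients or single client updates concrete, the minimal sketch below assumes a toy single-record linear model with squared-error loss (chosen purely for illustration). It shows that the gradient such a client would share is a scaled copy of the underlying training record, so an observer could recover the record up to a scale factor.

```python
import numpy as np

# Hypothetical personal record (e.g. age, weight in kg, a binary flag) and its label
x = np.array([42.0, 180.5, 1.0])
y = 1.0
w = np.zeros(3)                    # current model weights

residual = w @ x - y               # prediction error
gradient = residual * x            # gradient of 0.5 * (w @ x - y) ** 2 with respect to w

# The gradient is proportional to the record itself:
print(gradient / residual)         # -> [ 42.  180.5   1. ]
```

Real models are more complex, but similar leakage has been demonstrated against shared gradients and updates, which is why the mitigations discussed in the next section matter.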
2. Combine federated learning with other PETs
Information shared as part of federated learning may indirectly expose identifiable information used for training machine learning models. If you identify this risk in your risk assessment, such as a DPIA, you should combine federated learning with other PETs.
Risks of failing to combine federated learning with other PETs include the local machine learning models continuing to use, and potentially expose, the personal information on which they were trained.
Combining federated learning with other PETs in order to mitigate the risks you identify will help you meet the data protection by design obligation under the accountability principle of the UK GDPR, which requires the protection of information across the entire solution lifecycle.
When considering other PETs to combine with federated learning, you should think about your aims and the maturity, scalability and cost of the technology. PETs you could use include:
- Secure communications protocols (e.g. TLS)
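As one illustration of transport-layer protection, the sketch below uses Python's standard ssl module to wrap a client connection to a hypothetical aggregation server before a serialised model update is sent. The host name, port and CA bundle path are assumptions; in practice most federated learning frameworks provide this channel security for you.

```python
import socket
import ssl

# Hypothetical aggregator endpoint and CA certificate bundle
AGGREGATOR_HOST = "aggregator.example.org"
AGGREGATOR_PORT = 8443

context = ssl.create_default_context(cafile="aggregator_ca.pem")

with socket.create_connection((AGGREGATOR_HOST, AGGREGATOR_PORT)) as sock:
    # Wrap the socket so the update is encrypted in transit and the server is authenticated
    with context.wrap_socket(sock, server_hostname=AGGREGATOR_HOST) as tls_sock:
        tls_sock.sendall(b"<serialised model update>")  # placeholder payload
```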
The ICO’s PETs guidance discusses these PETs in more detail and is a great resource to demonstrate how PETs can help with data protection compliance.
3. Assess the identifiability risk
You should assess the risk of identifying people at each stage of the data lifecycle using a motivated intruder test. Consider the means all parties involved could use to identify people from the training data or during the model training stage. You should consider using appropriate input privacy PETs to mitigate this risk.
If your aim is for your output to be effectively anonymised, differential privacy can mitigate the risk of re-identification by anonymising outputs. It adds noise to hide the use of a particular person’s information in a training task, provided an appropriate amount of noise is added and the privacy budget is tuned effectively. However, you shouldn’t automatically assume that it will achieve effective anonymisation.
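As a toy illustration of how the privacy budget controls this trade-off, the sketch below applies the Laplace mechanism to a single aggregate statistic: noise is drawn with scale sensitivity / epsilon, so a smaller epsilon (a tighter privacy budget) means more noise. The count, sensitivity and epsilon values are assumptions for illustration only.

```python
import numpy as np

def laplace_release(true_value, sensitivity, epsilon):
    """Release a differentially private version of an aggregate statistic."""
    scale = sensitivity / epsilon
    return true_value + np.random.laplace(0.0, scale)

true_count = 123                                                    # e.g. records matching a query
print(laplace_release(true_count, sensitivity=1.0, epsilon=1.0))    # moderate noise
print(laplace_release(true_count, sensitivity=1.0, epsilon=0.1))    # much more noise
```

Protecting a trained model, rather than a single statistic, requires accounting for the budget spent across every training step, which is why tuning the privacy budget effectively is emphasised above.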
The ICO’s Innovation Services provide free support to organisations of all sizes, from start-ups to large enterprises, that are running new and novel projects using personal information. If you’re interested in working with one of our Innovation Services, you can find contact details on our website.
Coming Up Next
The next post will conclude this DSIT-NIST blog series with a look ahead at future opportunities for working with PPFL.