When collecting personal information, the GDPR principles state that organisations must act transparently and with consent, collecting data only for the explicit purpose for which it’s needed. They also put strong legal protection on sensitive, identifying information, often referred to as PII or Personally Identifiable Information.
When GDPR was introduced, organisations were quick to meet this legislation’s requirements, with the threat of serious financial repercussions if they failed to do so. But there’s still more they can do to serve consumers ethically and responsibly during data collection.
Before you begin collecting data, have you considered...?
1) Getting consent to collect information
Seeking consent is the most appropriate way to legally collect information, while giving customers genuine control over their data.
While consent isn’t always required (such as in cases of legitimate interest and/or legal obligation), the GDPR suggests that consent be given to collect data for an explicit and stated purpose. Even without consent there still needs to be clear and comprehensive information provided about how personal information is used.
Unfortunately, some companies also resort to manipulative user agreements to get the consent they need, but it is not always consent a participant is happy to give. The value of consent is diminished when it becomes a condition of service.
Do you have permission from users or participants to collect their data?
Have they been made aware that their involvement is voluntary?
Is it clear that participants are free to withdraw from any active data collection programme at any point without pressure or fear of retaliation?
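As a minimal sketch of how the questions above might be reflected in a system, consent can be stored per participant and per stated purpose, with withdrawal available at any time and no reason required. The names `ConsentRecord` and `may_collect` are hypothetical and only illustrate the idea; they are not taken from any particular library or from the GDPR itself.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ConsentRecord:
    """Hypothetical record of one participant's consent for one stated purpose."""
    participant_id: str
    purpose: str
    granted_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    withdrawn_at: Optional[datetime] = None

    def withdraw(self) -> None:
        # Withdrawal is unconditional: no reason is asked for or stored.
        if self.withdrawn_at is None:
            self.withdrawn_at = datetime.now(timezone.utc)

    @property
    def is_active(self) -> bool:
        return self.withdrawn_at is None


def may_collect(records, participant_id, purpose):
    """Allow collection only if an active consent exists for this exact purpose."""
    return any(
        r.participant_id == participant_id and r.purpose == purpose and r.is_active
        for r in records
    )
```

Tying each record to a single purpose means consent given for one use (say, a product survey) is not silently reused for another (say, marketing).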
2) Protecting users’ confidentiality and anonymity when collecting data
Customers will often opt in to data collection under the assumption that the information collected remains confidential and any published findings are anonymised. If you do need to break confidentiality at any point (or suspect that you will do in future) then make it clear at the start of the process.
Where possible, avoid collecting personally identifiable information (PII). Good practice might be to design your data collection methods so that they can’t be reverse engineered to reveal subjects. However, it is also possible to identify people by merging separate datasets that each contain just a few personal pieces of information about them.
Do you really need to collect PII at all?
If yes, have you taken steps to de-identify a dataset by removing all PII data before analysing or sharing the insights?
Have you considered how different data points could be used in conjunction to reverse engineer identity or identifying characteristics?
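The checks above can be sketched roughly as follows: drop the directly identifying fields, then look for combinations of the remaining attributes that are shared by very few people, since those combinations could single someone out when merged with another dataset. The column names, the sample rows and the threshold `k` are all assumptions made for illustration, and this simple count is only a rough indicator of risk, not a guarantee of anonymity.

```python
from collections import Counter

# Assumed column names, for illustration only.
DIRECT_IDENTIFIERS = {"name", "email", "phone"}
QUASI_IDENTIFIERS = ("postcode_area", "birth_year", "gender")


def drop_direct_identifiers(rows):
    """Return copies of the rows without the directly identifying fields."""
    return [
        {key: value for key, value in row.items() if key not in DIRECT_IDENTIFIERS}
        for row in rows
    ]


def risky_groups(rows, k=2):
    """Return quasi-identifier combinations shared by fewer than k rows."""
    counts = Counter(tuple(row.get(q) for q in QUASI_IDENTIFIERS) for row in rows)
    return {combo: n for combo, n in counts.items() if n < k}


# Made-up sample data.
rows = [
    {"name": "A", "email": "a@example.com", "postcode_area": "EH1",
     "birth_year": 1980, "gender": "F", "score": 4},
    {"name": "B", "email": "b@example.com", "postcode_area": "EH1",
     "birth_year": 1980, "gender": "F", "score": 5},
    {"name": "C", "email": "c@example.com", "postcode_area": "G12",
     "birth_year": 1975, "gender": "M", "score": 3},
]

deidentified = drop_direct_identifiers(rows)
unique_combos = risky_groups(deidentified, k=2)
```

Here the third row is the only one with its combination of postcode area, birth year and gender, so it is flagged even though the name and email have been removed. Options at that point include generalising the values (for example, wider age bands) or leaving the row out of anything that is shared.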
3) What do you intend to do with the data you’re collecting?
While it can be hard to know the purpose or value of data in advance, the GDPR supports the practice of purpose limitation. This means organisations shouldn’t operate with an intention of gathering as much as they can, to be used for an undefined purpose, at an undetermined point in the future. Additionally, there will be some information you cannot retain for more than 12 months.
Minimum viable collection is a strategy which relates to the issues of anonymity and intention. This method encourages organisations to only collect the data they absolutely need to ensure a result they want or a trend they aim to understand. This is sometimes referred to as the data minimisation principle.
In practice this can be difficult to implement, as it’s not always possible to know every purpose in advance. Being more responsible, and avoiding over-collection, involves thinking critically about each data point you plan to collect.
How will this data contribute to my overall aim?
Could this data point be used in conjunction with others to reveal PII?
What would the results mean for my overall predictions or aims?
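One way to put these questions into practice, sketched here under assumptions, is to keep a register in which every field you collect has a stated purpose, and to discard anything submitted that is not in the register. The field names, the purposes and the `minimise` helper are hypothetical examples, not a prescribed method.

```python
# Hypothetical register: each field we collect must have a stated purpose.
FIELD_PURPOSES = {
    "age_band": "compare satisfaction across age groups",
    "satisfaction": "measure the overall satisfaction trend",
}


def minimise(submission):
    """Split a submission into fields with a stated purpose and fields to discard."""
    kept = {key: value for key, value in submission.items() if key in FIELD_PURPOSES}
    discarded = sorted(key for key in submission if key not in FIELD_PURPOSES)
    return kept, discarded


# Made-up submission containing two fields nobody has justified collecting.
kept, discarded = minimise(
    {"age_band": "25-34", "satisfaction": 4, "email": "a@example.com", "device_id": "abc123"}
)
```

Only `age_band` and `satisfaction` are kept; `email` and `device_id` are listed as discarded. Adding a new field then requires writing down why it is needed, which is the point of the exercise.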
Are you better at data ethics than Amazon’s Alexa?
There are endless examples of questionable personal data collection practices. As a case in point, Amazon received a rash of negative headlines following a 2021 lawsuit, which accused the company’s Alexa smart speaker of secretly collecting and storing user data.
The extent to which Amazon processes data about its users for purposes such as personalisation has not always been clear. A paper published in 2022 by three US universities suggests Alexa collects sensitive voice and biometric data and shares the insights with as many as 41 ad partners.
Top tips to improve your data collection practices
Only collect the minimum viable information you require for your intended results
This interactive tool may help with determining the lawful basis for gathering data
The next article in our series will focus on the ethics of data storage, with practical tips you can follow. Or you can download the full report containing the full series below.