ICO and other DPAs urge social media companies to protect their users from risks associated with third-party data scraping

Viewpoints
September 15, 2023

The UK Information Commissioner’s Office (ICO) and 11 other data protection and privacy authorities (DPAs), acting under the aegis of the Global Privacy Assembly, have published a joint statement on data scraping and data protection. It sets out what social media companies (SMCs) and operators of websites that host publicly accessible personal data should do to protect personal information on their platforms from unlawful data scraping.

The statement outlines the key privacy risks associated with data scraping and also recommends steps individuals can take to minimise privacy risks when sharing information online.

Data scraping

As the statement explains, data scraping generally involves the automated extraction of data from the internet, particularly from “SMCs and other websites” that host publicly accessible data. The capacity for data scraping technologies to collect and process vast amounts of personal data from the internet raises significant privacy concerns, even when the information being scraped is publicly accessible.

In most jurisdictions, personal information that is “publicly available”, “publicly accessible” or “of a public nature” on the internet is subject to data protection and privacy laws. Whether or not the information is publicly accessible, those laws place obligations on SMCs and other websites not only with respect to the data they host, but also in relation to third-party scraping from their sites. Indeed, mass scraping of personal data can constitute a reportable data breach in many jurisdictions.

Privacy risks

The joint statement says that SMCs and other websites should carefully consider the legality of different types of data scraping in the jurisdictions applicable to them and implement measures to protect against unlawful data scraping. DPAs have seen “increased reports” of mass data scraping from SMCs and other websites, raising a number of privacy concerns, such as the use of scraped data for phishing and other targeted cyberattacks or for identity fraud. Other risks include monitoring, profiling and surveilling individuals (for example, to populate facial recognition databases and provide unauthorised access to authorities); bulk unsolicited marketing; and the use of scraped data for unauthorised political or intelligence-gathering purposes.

Protecting individuals’ personal information from unlawful data scraping

The joint statement sets out what SMCs and other websites should be doing to mitigate the risks associated with unlawful data scraping. An approach based on multi-layered technical and procedural controls is recommended, as no one safeguard can adequately protect against all potential privacy harms associated with data scraping. A combination of these controls should be used that is proportionate to the sensitivity of the information. The statement identifies the following examples of appropriate measures:

  • Designating a team and/or specific roles to identify and implement controls to protect against, monitor for, and respond to scraping activities.
  • Limiting the rate or number of visits per hour or day by one account to other account profiles, and limiting access if unusual activity is detected.
  • Monitoring how quickly and aggressively a new account starts looking for other users. If abnormally high activity is detected, this could be indicative of unacceptable usage.
  • Taking steps to detect scrapers by identifying patterns in bot activity. For example, a group of suspicious IP addresses can be detected by monitoring where the platform is being accessed from: use of the same credentials from multiple locations within a short period of time would be suspicious.
  • Taking steps to detect bots, such as by using CAPTCHAs, and blocking the IP address where data scraping activity is identified.
  • Where data scraping is suspected and/or confirmed, taking appropriate legal action such as the sending of “cease and desist” letters, and other legal action to enforce terms and conditions prohibiting data scraping.
  • In jurisdictions where the data scraping may constitute a data breach, notifying affected individuals and privacy regulators as required.
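
By way of illustration only, two of the technical controls listed above (limiting the number of profile visits per account, and spotting the same credentials being used from multiple locations within a short period) might be sketched as follows. This is a minimal sketch and not something the joint statement prescribes: the class names, the thresholds and the use of IP addresses as a stand-in for “locations” are all assumptions made for the example.

```python
from collections import defaultdict, deque


class ProfileViewLimiter:
    """Sliding-window cap on profile views per account.

    Hypothetical example: the default of 100 views per hour is an
    assumed threshold, not a figure taken from the joint statement.
    """

    def __init__(self, max_views: int = 100, window_seconds: float = 3600.0):
        self.max_views = max_views
        self.window_seconds = window_seconds
        # account_id -> timestamps (in seconds) of recently allowed views
        self._views = defaultdict(deque)

    def allow(self, account_id: str, now: float) -> bool:
        """Return True if the view is within the limit, False otherwise."""
        views = self._views[account_id]
        # Discard views that have fallen outside the window.
        while views and now - views[0] >= self.window_seconds:
            views.popleft()
        if len(views) >= self.max_views:
            return False
        views.append(now)
        return True


class SharedCredentialDetector:
    """Flags one account being accessed from many IP addresses in a short time.

    Hypothetical example: distinct IP addresses are used as a rough
    proxy for "multiple locations"; both thresholds are assumptions.
    """

    def __init__(self, max_distinct_ips: int = 3, window_seconds: float = 600.0):
        self.max_distinct_ips = max_distinct_ips
        self.window_seconds = window_seconds
        # account_id -> (timestamp, ip) pairs for recent accesses
        self._logins = defaultdict(deque)

    def record(self, account_id: str, ip: str, now: float) -> bool:
        """Record an access; return True if the account now looks suspicious."""
        logins = self._logins[account_id]
        logins.append((now, ip))
        # Discard accesses that have fallen outside the window.
        while logins and now - logins[0][0] >= self.window_seconds:
            logins.popleft()
        distinct_ips = {seen_ip for _, seen_ip in logins}
        return len(distinct_ips) > self.max_distinct_ips
```

In practice a platform would call something like `allow()` before serving each profile view and `record()` on each authenticated request, then feed the results into whatever response it considers proportionate, such as a CAPTCHA challenge or a temporary restriction. Appropriate thresholds would depend on the platform and on the sensitivity of the information involved.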

The statement also recommends proactively supporting users so that they can make informed decisions about how they use the platform and what personal information they share, in particular by increasing user awareness and understanding of the privacy settings. Given the dynamic nature of data scraping threats, any controls should be routinely stress-tested and kept up to date.

Comment

As the statement also recognises, any safeguards platforms introduce will only be properly effective with user engagement. As such, it outlines steps that individuals can take to minimise the privacy risks from data scraping such as understanding and managing privacy settings and reading information provided by the SMC or other website about how they share personal information, including the privacy policy.

Ultimately, however, the responsibility lies with the operator, and user empowerment will only be achieved if the safeguards provided by the platform are clearly explained and easily accessible. While the joint statement talks in terms of recommendations and what operators should be doing, it is clear that the measures are based on regulatory responsibilities. Insofar as “the practices outlined in this joint statement reflect common global data protection principles and practices”, many will be mandatory, at least to some extent, in many jurisdictions.

Maintaining the security of personal data is a fundamental accountability obligation in the EU and UK, for example. As such, the statement expressly aims to “set out key areas for SMCs and other websites to focus on … so that they are compliant with data protection and privacy laws around the world”.

From a commercial perspective, the statement makes the inevitable but important point that adhering to these expectations to protect against data scraping “will also support SMCs and other websites in building the trust and confidence of their userbase”. On that score, SMCs operating in the UK may wish to take up the ICO’s invitation to respond to the statement and demonstrate how they protect people from unlawful scraping and publicise measures they implement on the platform.

One additional point of interest is that data protection is not the only area of law in which data scraping raises concerns. The practice has been challenged in both the EU and US courts, with differing outcomes. In the EU, the CJEU held, in a claim brought by Ryanair under the Database Directive, that website operators can set contractual restrictions that prohibit “scraping” from their sites. In the US, by contrast, the Ninth Circuit Court of Appeals held in hiQ Labs v LinkedIn that the scraping of publicly available information on a website will not be unlawful, even where the site’s terms of service prohibit it, provided the scraping activity does not harm the website being scraped.

It remains to be seen whether those enforcing data protection laws will bring the different approaches under civil law closer together, especially as the various US state privacy laws, including the California Consumer Privacy Act, which have recently come into force or are about to do so, offer more GDPR-like protections for personal information.