The Case for Data Regulation

Lessons from history

Every major, and perhaps every minor, advance in technological development has come with drawbacks. Farming gave us food security but forced us to settle in communities, creating low-hygiene living conditions where disease could spread quickly. Antibiotics allowed us to defeat many previously fatal diseases and made many surgical interventions low-risk procedures, yet abusing them has led to resistant strains of bacteria.

As a (future) data scientist, you might be excited about what the stream of personal information can offer. As am I. And so should we all. After all, data science is set on a path of momentous change. From the health sciences to personal management, data science applications can do much to improve our lives. Data availability is at an all-time high and rising. Computational technologies let us process data more easily than ever, and some of the results are nothing short of remarkable.

Yet, as Cathy O’Neil illustrates in Weapons of Math Destruction, a book on the (mis)applications of algorithms, there are many ways to misuse this data. Legislation could help prevent misuse, but legislation is slow to adjust, and it is constrained by the trade-off between safety and innovation. Over-regulate and you constrain innovation; under-regulate and you leave room for misuse.

On May 25th, 2018, the General Data Protection Regulation (GDPR) comes into effect throughout the EU. In a nutshell, the new legislation aims to protect privacy by empowering individuals to request information about the data held on them and about how that data is used by the organisation in question. It also gives individuals the “right to be forgotten”, as long as the information isn’t in “the public interest”. The legislation applies to all organisations operating within EU territory, regardless of their official location, and failure to comply can result in penalties of “up to 4% of annual global turnover or €20 Million (whichever is greater)”. In addition, it requires organisations to keep internal records, to notify subjects of data breaches within 72 hours and, for data-intensive organisations, to appoint a “Data Protection Officer”.
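To make the “whichever is greater” penalty cap concrete, here is a minimal sketch. The 4% and €20 million figures come from the regulation as quoted above; the function name and the example turnover are hypothetical, and a real fine would of course depend on the infringement, not just this upper bound.

```python
def gdpr_max_fine(annual_global_turnover_eur: float) -> float:
    """Upper bound on a GDPR fine for the most serious infringements:
    the greater of 4% of annual global turnover or EUR 20 million."""
    return max(0.04 * annual_global_turnover_eur, 20_000_000)

# A hypothetical firm with EUR 1 billion in turnover faces a cap of EUR 40 million,
# while a smaller firm with EUR 100 million in turnover is still exposed to the
# flat EUR 20 million ceiling.
print(gdpr_max_fine(1_000_000_000))  # → 40000000.0
print(gdpr_max_fine(100_000_000))    # → 20000000
```

Note how the flat €20 million floor means the cap only scales with turnover once turnover exceeds €500 million.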

This is a major step forward. Granting people access to their information, and to how it is being used, should hold organisations not only to the law but to a higher ethical standard. After all, knowing that your customers are watching what you do with their data makes you think twice about how far you’re willing to go.

Or does it? In many respects, this is a similar scenario to the ‘chase-up’ between regulators and financial institutions over the last two centuries or so.

First, as an integrated member of society, you have little choice about whether to give up your data, similar to the historically increasing necessity of consuming financial products in order to function within society. While abstaining from social media is plausible, many social interactions require the use of data-driven products, which implicitly leave a trace of information about you.

Then again, why would you want to? Just as with financial institutions, data-driven societies can make everyone better off. Think of health-related analytics and the ability to cure or prevent medical conditions, or of simpler things, such as the traffic information feed in Google Maps.

A second feature shared with financial products is complexity. Even if people or organisations are granted access to what information is held and how it is used, how likely are they to understand the implications? Cathy O’Neil makes the case that algorithms, though not necessarily by design, pick up and act on societal biases, for example. What’s more, just as with financial products, even experts cannot always predict the outcomes of complex products, which in data-driven applications mostly take the form of algorithms.

Further, even if we fully understood these products, how likely is it that understanding would change behaviour? People are subject to numerous behavioural biases, which can lead individuals, and collectives, to act against their own best interests.

Third, there is the misalignment of incentives. Just as financial institutions have an incentive to use financial products in pursuit of profit, so data-driven organisations have an incentive to use data to derive value. A financial institution does so by constructing complex products with poorly understood, and often hidden, risks (take the 2007/2008 financial crisis, for example). By the same token, a data-driven institution could use complex algorithms with poorly understood risks and outcomes in an effort to produce results.

Lastly, and perhaps most importantly, just as financial institutions play a critical role in the smooth functioning of modern society, so do data-driven institutions and applications. For financial products, systemic risk (the chance that when something goes wrong with one institution, an entire industry or economy is affected) has been a strong reason for regulating financial institutions throughout history.

For data-driven organisations, few systemic threats have materialised yet, but several are on the horizon. Ms. O’Neil illustrated some, from interference with democracy to systematic discrimination against minorities. New technologies are being developed and are set to be deployed on a global scale, and these data-driven applications are likely to influence society accordingly. How they are put to use, and how well they are understood, will shape how well society functions.

Regardless, the new legislation is set to come into effect, and it is a step in the right direction. Just as financial institutions have become increasingly stable over time, so are data-driven institutions and applications likely to do. The question is: what can we learn from how legislators and financial institutions navigated the road to stability, and how can we avoid some of its pitfalls?

Note: I first wrote this article in January 2018 as part of The Outlier publication on Medium.

Dragos Tomescu
Data Trainer (previously, Data Analyst)

A data analytics professional with a passion for understanding society. I write about data-driven applications and their impact on business and society.