In my last post, I introduced the issue of ethical considerations around data collection and analytics. I asked if, along with the significant expansion of technical capabilities over the last five years, there had been a corresponding increase in awareness and advancement of the ethical issues of big data, such as privacy, confidentiality, transparency, protection, and ownership. In this post, I focus on the disclosures provided to users and suggest some elements concerning ethics to include in them.

The complexities of ethics policies

As I've researched the topic of big data, I've come to realize that developing a universal ethics policy will be difficult because of the diversity of data that's collected and the many uses for this data—in the areas of finance, marketing, medicine, law enforcement, social science, and politics, to name just a few.

Privacy and data usage policies are often disclosed to users signing up for particular applications, products, or services. My experience has been that the details about the data being collected are hidden in the customer agreement. Normally, the agreement offers no "opt-out" of any specific elements, so users must either decline the service altogether or begrudgingly accept the conditions wholesale.

But what about the databases that are part of public records? Often these public records are created without any direct communication with the affected individuals. Did you know that in most states, property records at the county level are available online to anyone? You can look up property ownership by name or address and find out the sales history of the property, including prices, square footage, number of bedrooms and baths, often a floor plan, and even the name of the mortgage company—all useful information for performing a pricing analysis for comparable properties, but also useful for a criminal to combine with other socially engineered information for an account takeover or new-account fraud attempt. Doesn't it seem reasonable that I should receive a notification or be able to document when someone makes such an inquiry on my own property record?

Addressing issues in the disclosure

Often, particularly with financial instruments and medical information, those collecting data must comply with regulations that require specific disclosures and ways to handle the data. The following elements together can serve as a good benchmark in the development of an ethical data policy disclosure:

  • Type of data collected and usage. What type of data are being collected and how will that data be used? Will the data be retained at the individual level or aggregated, thereby preventing identification of individuals? Can the data be sold to third parties?
  • Accuracy. Can an individual review the data and submit corrections?
  • Protection. Are people notified how their data will be protected, at least in general terms, from unauthorized access? Are they told how they will be notified if there is a breach?
  • Public versus private system. Is it a private system that usually restricts access, or a public system that usually allows broad access?
  • Open versus closed. Is it a closed system, which prevents sharing, or is it open? If it's open, how will the information will be shared, at what level, and with whom? An example of an open system is one that collects information for a governmental background check and potentially shares that information with other governmental or law enforcement agencies.
  • Optional versus mandatory. Can individuals decline participation in the data collection, or decline specific elements? Or is the individual required to participate such that refusal results in some sort of punitive action?
  • Fixed versus indefinite duration. Will the captured data be deleted or destroyed on a timetable or in response to an event—for example, two years after an account is closed? Or will it be retained indefinitely?
  • Data ownership. Do individuals own and control their own data? Biometric data stored on a mobile phone, for example, are not also stored on a central storage site. On the other hand, institutions may retain ownership. Few programs are under user ownership, although legal rights governing how the data can be used may be made by agreement.

What elements have I missed? Do you have anything to suggest?

In my next post, I will discuss appropriate guiding principles in those circumstance when individuals have no direct interaction with the collection effort.

Photo of David Lott By David Lott, a payments risk expert in the Retail Payments Risk Forum at the Atlanta Fed