Metadata - have we got the ethics right?

Guest post by David Haynes, author of Metadata for Information Management and Retrieval: Understanding Metadata and its Use, Second Edition

Use of metadata by the security services

"Metadata tells you everything about somebody's life.  If you have enough metadata you don't really need content" (Schneier 2015, p.23)

If anyone wondered about the importance of metadata, this quote by Stewart Baker, formerly of the US National Security Agency, should leave no one in any doubt.  The Snowden revelations about the routine gathering of metadata about international telephone calls to or from the United States continue to have repercussions today (Greenwald 2013).  Indeed, Privacy International (2017) has identified the following types of metadata that are gathered or could be gathered by security agencies:

  • Location
  • Device used
  • Date/time
  • Sender
  • Recipient
  • Length of call
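
As a minimal sketch of how much these fields can reveal when combined, the Python below summarises a handful of invented call records that contain no call content at all.  The record layout, the names (`CALLS`, `summarise`) and every value are assumptions made for illustration; they are not taken from Privacy International or from any real system.

```python
from collections import Counter
from datetime import datetime

# Invented call records for illustration only: metadata fields, no call content.
CALLS = [
    {"sender": "caller-1", "recipient": "oncology clinic",
     "when": "2017-03-01T09:05", "seconds": 420, "location": "home"},
    {"sender": "caller-1", "recipient": "oncology clinic",
     "when": "2017-03-08T09:10", "seconds": 600, "location": "home"},
    {"sender": "caller-1", "recipient": "life insurer",
     "when": "2017-03-08T10:02", "seconds": 900, "location": "home"},
    {"sender": "caller-1", "recipient": "helpline",
     "when": "2017-03-09T02:40", "seconds": 1800, "location": "home"},
    {"sender": "caller-2", "recipient": "pizza shop",
     "when": "2017-03-09T19:00", "seconds": 60, "location": "office"},
]


def summarise(records, sender):
    """Aggregate one sender's call metadata into a simple profile."""
    own = [r for r in records if r["sender"] == sender]
    return {
        "calls_per_recipient": Counter(r["recipient"] for r in own),
        "total_seconds": sum(r["seconds"] for r in own),
        # Recipients of calls placed between midnight and 06:00.
        "night_recipients": [
            r["recipient"] for r in own
            if datetime.fromisoformat(r["when"]).hour < 6
        ],
    }


profile = summarise(CALLS, "caller-1")
print(profile)
```

Even in this toy data, repeated calls to a clinic, a long call to an insurer and a late-night call to a helpline suggest a story that no single record tells on its own.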

"Metadata in aggregate is content", as Jacob Appelbaum observed when the WikiLeaks controversy first blew up (Democracy Now 2013).  In other words, when metadata from different sources is aggregated it can be used to reconstruct the information content of individual communications.

Invasion of privacy or personal benefit?

These concerns extend well beyond the use of metadata by governments and the security services.  The social media giants prosper by exploiting personal data and targeting digital advertising.  Personal profiles of targeted individuals are built from metadata about online use and are the basis of online behavioural advertising.  Cookies and other tracking technologies can monitor the online activity of an individual to predict future behaviour.  Metadata about online sessions reveals a great deal about an individual and his or her life.  This may extend to gathering information about friends, family, colleagues and other contacts.

The upside of this is that metadata is a powerful tool to facilitate use of online services, by remembering users' preferences and delivering content that is more likely to be of interest or relevance to them.  This has to be balanced against the risks associated with online disclosure of personal data.


Metadata describes an information object, whether that be raw data or more descriptive information about an individual.  This is important because the treatment of metadata has become a political issue.  Personal data, especially data that reveals opinions, attitudes and beliefs, is potentially very sensitive.  Use of this personal data by service providers or by third parties can expose users to risks such as nuisance from unwanted ads, harassment from internet trolls or fraud through identity theft if the data is not held or transmitted securely.  Many digital advertisers would say that because the data is aggregated it is not possible to identify individuals, i.e. the data is anonymised.  However, this is no protection against privacy breaches, as has been demonstrated by Narayanan and Shmatikov (2009) and others.
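
To illustrate why removing names is weak protection, here is a minimal Python sketch of a linkage attack: records stripped of names are matched against a second, named dataset on the attributes the two share.  This is a deliberately simpler technique than the graph-based method of Narayanan and Shmatikov (2009), and all of the data, field names and the `reidentify` function are invented for the example.

```python
# Invented "anonymised" records: names removed, other attributes kept.
ANONYMISED = [
    {"postcode_area": "AB1", "birth_year": 1975, "device": "phone-x",
     "interest": "debt advice"},
    {"postcode_area": "AB1", "birth_year": 1990, "device": "laptop-y",
     "interest": "travel"},
    {"postcode_area": "CD2", "birth_year": 1990, "device": "laptop-y",
     "interest": "health"},
]

# Invented public records that carry names alongside the same attributes.
PUBLIC = [
    {"name": "Person A", "postcode_area": "AB1", "birth_year": 1975,
     "device": "phone-x"},
    {"name": "Person B", "postcode_area": "CD2", "birth_year": 1990,
     "device": "laptop-y"},
    {"name": "Person C", "postcode_area": "CD2", "birth_year": 1990,
     "device": "laptop-y"},
]

# Attributes shared by both datasets (hypothetical quasi-identifiers).
KEYS = ("postcode_area", "birth_year", "device")


def reidentify(anonymised, public, keys=KEYS):
    """Return (name, interest) pairs where the shared attributes match exactly one person."""
    index = {}
    for row in public:
        index.setdefault(tuple(row[k] for k in keys), []).append(row["name"])

    matches = []
    for row in anonymised:
        names = index.get(tuple(row[k] for k in keys), [])
        if len(names) == 1:  # a unique match re-identifies the record
            matches.append((names[0], row["interest"]))
    return matches


matches = reidentify(ANONYMISED, PUBLIC)
print(matches)
```

In this toy data only one record is re-identified, because the other two either have no counterpart or match more than one person.  The more attributes a dataset retains, the more often a combination becomes unique.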

Fact-free content

Daniel Rosenberg (2013) makes a nice distinction between data, facts and evidence.  Data, if true, may be a fact, but if false it ceases to be a fact.  Samuel Arbesman (2012), in his book 'The Half-Life of Facts', introduced the idea that in a given period half the certainties that we had are shown to be false or are superseded by new understandings, and that they cease to be 'facts'.  Data, whether it is true or not, continues to be data, but is only factual if true.  Perhaps there is some way of recording the reliability of information or data so that it can be exploited appropriately.  Many of the arguments and counter-arguments on climate change, for instance, centre on the quality and veracity of the evidence used by each side of the debate.  This idea is not new, as medical researchers have for some time evaluated the quality of research used to make clinical decisions.  This information about the quality and reliability of data is metadata.
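
One way such reliability metadata might be recorded is sketched below in Python.  The four-level scale, the `Statement` fields and the `usable` filter are all hypothetical; they are loosely inspired by the idea of graded evidence and do not represent any established standard.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical reliability scale; higher numbers mean better-supported data.
GRADES = {"unverified": 0, "low": 1, "moderate": 2, "high": 3}


@dataclass(frozen=True)
class Statement:
    text: str          # the data or claim itself
    source: str        # where it came from
    reliability: str   # one of the keys in GRADES
    assessed_on: date  # when the reliability was last reviewed


def usable(statements, minimum="moderate"):
    """Keep only statements graded at or above the minimum reliability."""
    threshold = GRADES[minimum]
    return [s for s in statements if GRADES[s.reliability] >= threshold]


STATEMENTS = [
    Statement("Claim from a peer-reviewed synthesis", "journal",
              "high", date(2017, 1, 10)),
    Statement("Claim from an unsourced post", "social media",
              "unverified", date(2017, 2, 2)),
]

trusted = usable(STATEMENTS)
print([s.text for s in trusted])
```

Recording the assessment date alongside the grade reflects Arbesman's point that a 'fact' may need to be re-evaluated as understanding changes.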

Metadata is political

Metadata has become a political issue because of its use by security agencies and because of wider privacy issues in the commercial world.  Anyone who asked the question 'What does metadata matter?' prior to 2013 will now realise just how important a bearing it has on current political issues.  The Fourth Amendment to the U.S. Constitution protects 'The right of the people to be secure in their persons, houses, papers, and effects, against unreasonable searches and seizures' (United States 1791).  A lot hangs on the interpretation of privacy, as Solove (2011) has so eloquently discussed in his book 'Nothing to Hide'.  'Fake news' is not new, but the phenomenon has reared its head in recent elections and is unlikely to go away any time soon.  Good governance also depends on a good understanding of metadata and accountability for past actions.

Metadata for information management and retrieval

In the new edition of Metadata for Information Management and Retrieval, published in January 2018, I consider the origins of metadata and look at the ways in which it is used for managing information resources.  The ethical dimensions of metadata are explored and issues such as governance, privacy, security and human rights are considered.  The book also discusses the digital divide and the potential that metadata has for making information accessible to wider audiences.

Metadata has an important role in politics and ethics.  How then do we manage it to best effect?

Haynes, D (2018) Metadata for Information Management and Retrieval: Understanding Metadata and its Use, Second Edition ISBN 9781856048248. Facet Publishing. London, 2018, 267pp.

You can follow David on Twitter @JDavidHaynes


Arbesman, S., 2012. The half-life of facts: why everything we know has an expiration date,

Democracy Now, 2013. Court: Gov't Can Secretly Obtain Email, Twitter Info from Ex-WikiLeaks Volunteer Jacob Appelbaum. Available at: [Accessed March 21, 2017].

Greenwald, G., 2013. NSA Collecting Phone Records of Millions of Verizon Customers Daily. The Guardian. Available at: [Accessed July 7, 2014].

Narayanan, A. & Shmatikov, V., 2009. De-anonymizing Social Networks. In 2009 30th IEEE Symposium on Security and Privacy. IEEE, pp. 173-187.

Privacy International, 2017. Privacy 101. Metadata. Available at: [Accessed March 23, 2017].

Rosenberg, D., 2013. Data before the Fact. In L. Gitelman, ed. "Raw Data" is an Oxymoron. Cambridge, MA: MIT Press, pp. 15-40.

Schneier, B., 2015. Data and Goliath: the hidden battles to collect your data and control your world, New York, NY: W.W.Norton.

Solove, D.J., 2011. Nothing to Hide: the false tradeoff between privacy and security, New Haven, CT: Yale University Press.

United States, 1791. U.S. Constitution Amendment IV, United States.

This post originally appeared in a somewhat different form on the Facet Publishing blog