Edward Snowden, in leaking certain intelligence documents, brought the amount of data collected by the United States for intelligence purposes into focus within the public discourse. It was a feat that no individual had accomplished before. Well, actually, a feat that no individual had accomplished since the Carter Administration. For those of us who are too young to remember, or who just plain forgot, Jimmy Carter issued Executive Order 12036 (following up on actions initiated by Gerald Ford), which imposed several restrictions upon the US intelligence agencies in response to concerns that they were operating beyond the law, especially with regard to American citizens.
Today’s new-found public awareness of the sheer volume and array of data collected, so-called “big data,” has led to a degree of outrage amongst the general public. Nonetheless, the public concern is decidedly tardy, perhaps born out of a sense of complacency vis-à-vis the actions of US intelligence agencies that developed after Carter’s executive order appeased critics of the time who warned about the abuse of information by intelligence agencies.
The concerns of the 1970s (and before) should have been revisited much more frequently. The capacity to collect information has grown together with advancements in technology at an exponential rate, and an era of “big data,” along with the privacy concerns resulting from improved and expanded data collection, was predictable and understood well before Snowden’s disclosures. Moreover, there was, and to a large extent still is, inadequate acknowledgement within the political and public spheres of these concerns, which were clearly expressed in the academic literature, perhaps most recently following the implementation of the USA PATRIOT Act of 2001. What that means is that there is no indication of how any measures to increase transparency will be implemented and sustained over time.
So, in short, the collection of intelligence is nothing new, and there should not have been such a big “surprise” about “big data” and its threat to civil liberties, purely from the standpoint of how data collection and storage has progressed over time. To give you a visual, the rate of data collection can be described by a partial logistic curve.
Beginning in the late eighteenth century, there were advances in various technologies that, over time, would greatly aid intelligence collection and dissemination, such as the hot air balloon (1783), the daguerreotype (1839) and the calotype (1840) (both forerunners of the modern camera), and the telegraph (1844), to name a few important developments. Refinements and expansions of these technologies, along with the invention of the airplane, further increased the possibilities for information gathering and transfer, thus causing the slope of the curve to steepen as time progressed. Nonetheless, the amount of information collected and analyzed, even to this point, was relatively small.
The information revolution changed that. The computer age allowed difficult tasks, such as complex calculations, to be automated, which in turn allowed for the faster development of advanced technologies. Since the dawn of the information revolution, computing power has increased exponentially in a phenomenon described by Gordon Moore, who predicted that the density of components per integrated circuit would double at a regular interval.
While Moore predicted that interval to be two years, the doubling period has in practice proven to be approximately 18 months, in part due to a coordinated industry effort over the past 40 years to meet the benchmarks, thus defining an innovation curve of circuit development. Given that improvements in integrated circuits aid in the automated collection, storage, and retrieval of information, it stands to reason that the amount of data that can be effectively collected and stored runs roughly parallel to Moore’s law. The trend may lag somewhat, owing to the time required to adopt new technology and maximize its efficiency.
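To make the doubling intervals concrete, here is a minimal sketch of what Moore’s law implies in relative terms. The function and its parameter names are illustrative, not drawn from the essay; it simply compounds doublings over a given span of years.

```python
def relative_density(years_elapsed, doubling_period_years=1.5):
    """Relative component density after `years_elapsed` years, assuming
    a fixed Moore's-law doubling period (default: 18 months)."""
    return 2 ** (years_elapsed / doubling_period_years)

# Over a 15-year span, an 18-month doubling period yields 2^10 = 1024x growth,
# versus 2^7.5 (about 181x) for Moore's original two-year interval.
print(relative_density(15))       # 1024.0
print(relative_density(15, 2.0))  # ~181.0
```

The gap between the two figures shows why the exact doubling period matters so much: over a few decades, a six-month difference in the interval compounds into orders of magnitude.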
There is no doubt that the computer age (point B in figure 1) completely changed several industries, many of which correlate to the increased capacity to gather information, such as land surveying, information technology (e.g., telecommunications and data transfer), and market research. Information can now be gathered more efficiently and accurately than ever before and sent around the world in fractions of a second. Many tasks that were previously manual can now be automated, reducing the need for human actors, and the information collected can be more accurate because it is independent, outside of its programming of course, of human error.
A corollary to this trend is that information is also produced at an exponentially increasing rate. As technology has improved, so has data generation; in absolute terms, the volume of information generated has increased rapidly over this period. Google CEO Eric Schmidt claimed in 2010 that as much information is now produced and stored in about two days as was produced and stored from the dawn of civilization until 2003 (point C in figure 1).
Though the facts of the claim have been disputed, its gist remains true: we produce an exponentially greater volume of data than at any time prior to 2003. With more data generated, there is more data to collect, purely in absolute terms. As a result, the slope of segment BC has steepened over time. Though it may not be apparent in the graph, the slope of segment CD keeps growing exponentially steeper for as long as Moore’s law holds. Once Moore’s law fails, the curve will plateau, taking on the shape of a logistic curve (basically an S, for the mathematically uninitiated).
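The S-shape described above can be sketched with the standard logistic function. The parameter values here are illustrative placeholders, not quantities from the essay; the point is only the qualitative shape: near-exponential growth early on, an inflection point, and a plateau once the ceiling is reached.

```python
import math

def logistic(t, ceiling=1.0, steepness=1.0, inflection=0.0):
    """Standard logistic curve: near-exponential growth well before the
    inflection point, then flattening toward `ceiling` well after it."""
    return ceiling / (1 + math.exp(-steepness * (t - inflection)))

# Early segment looks exponential; at the inflection the curve is at
# half the ceiling; far past it, growth plateaus near the ceiling.
print(round(logistic(-6), 4))  # close to 0
print(round(logistic(0), 4))   # exactly half the ceiling: 0.5
print(round(logistic(6), 4))   # close to the ceiling: ~1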
It is important to note that this graph says nothing about the capacity to analyze the data, nor what that analysis might yield. That is a question to consider another time, and it forces us to ask whether we trust the state enough to collect data on us in exchange for its pledged furnishing of security. To put it another way, understanding the volume of data collected, and the fact that the state is proactively collecting it, raises the question: how does the act of collecting and safekeeping data (to be clear, not analyzing or disseminating it) affect civil liberty?
What is important to remember is that the Snowden disclosures should be of concern not just to the United States and its citizens but to every government and person worldwide.