Metadata Collection Spike: Is There a Reason?

May 6, 2018

I read “NSA Triples Metadata Collection Numbers Sucking Up over 500 Million Call Records in 2017.” Interesting report, but it raised several questions here in Harrod’s Creek. But first, let’s look at the “angle” of the story.

I noted this statement:

The National Security Agency revealed a huge increase in the amount of call metadata collected, from about 151 million call records in 2016 to more than 530 million last year — despite having fewer targets.

The write up pointed out that pen register and trap and trace orders declined. That’s interesting as well.

The write up focused on what’s called “call detail records.” These, the write up explained, are:

things like which numbers were called and when, the duration of the call, and so on…

The write up then reminds the reader that “one target can yield hundreds or thousands of sub-targets.”

The article ends without any information about why. My impression of the write up is that the government agency is doing something that’s not quite square.

My initial reaction to the data in the write up was, “That does not seem like such a big number.” A crawl of the Dark Web, which is a pretty tiny digital space, often generates quite a bit of metadata. Stuffing the tiny bit of Dark Web data into a robust system operated by companies from Australia to the United States can produce terabytes of data. In fact, one Israeli company uploads new data in zipped blocks to its customers multiple times a day. The firm of which I am thinking performs this work for outfits engaged in marketing consumer products. In comparison, the NSA effort strikes me as modest.

My first question is, “Why so little data?” Message, call, image, and video data are going up. The corresponding volume of metadata is going up. Toss in link analysis pointers, and that’s a lot of data. In short, the increase reported seems modest.

The second question is, “What factors contributed to the increase?” Based on our research, we think that some of the analytic systems are bogged down due to the wider use of message encryption technology. I will be describing one of these systems in my June 2018 Telestrategies ISS lecture related to encrypted chat. I wonder if the change in the volume reported in the write up is related to encryption.

My third question is, “Is government analysis of message content new or different?” Based on the information I have stumbled upon here in rural Kentucky, my thought is that message traffic analysis has been chugging along for decades. I heard an anecdote when I worked at a blue chip consulting firm. It went something like this:

In the days of telegrams, the telegraph companies put paper records in a bag, took them to the train station in Manhattan, and sent them to Washington, DC.

Is the anecdote true or false? My hunch is that it is mostly true.

My final question triggered by this article is, “Why does the government collect data?” I suppose one reason is nosiness, but my perception is that the data are analyzed in order to get a sense of who is doing what that might harm the US financial system or the country itself.

My point is that numbers without context are often not helpful. In this case, Pew data from 2010 reported that the average adult with a mobile makes five calls per day. Text message volume is higher. With 300 million people in the US in 2010 and assuming 30 percent mobile phone penetration, the number of calls eight years ago works out to about 450 million calls per day; if every resident had a mobile, the figure would be about 1.5 billion calls per day. Flash forward to the present. The “number” cited in the article seems low.
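For readers who want to check the back-of-the-envelope arithmetic, here is a minimal sketch in Python. The inputs are the ones in the paragraph above (300 million people, 30 percent mobile penetration, five calls per day); the function name and the 100 percent case are my own illustrative additions, and the comparison with the 530 million records in the write up is rough because a call and a call detail record are not necessarily the same unit.

```python
# Back-of-the-envelope estimate of US mobile calls per day in 2010.
# Inputs come from the paragraph above; everything else is illustrative.

US_POPULATION_2010 = 300_000_000   # people, rounded
CALLS_PER_USER_PER_DAY = 5         # Pew 2010 figure cited above
NSA_RECORDS_2017 = 530_000_000     # call records reported for all of 2017


def daily_calls(population: int, penetration_percent: int, calls_per_user: int) -> int:
    """Estimated calls per day, using whole-number percent to avoid float rounding."""
    mobile_users = population * penetration_percent // 100
    return mobile_users * calls_per_user


# Assumption stated in the text: 30 percent of residents have a mobile.
at_30_percent = daily_calls(US_POPULATION_2010, 30, CALLS_PER_USER_PER_DAY)

# Hypothetical upper bound: every resident has a mobile.
at_100_percent = daily_calls(US_POPULATION_2010, 100, CALLS_PER_USER_PER_DAY)

# How many days of 2010 calling, at 30 percent penetration, would equal
# the full-year 2017 record count? (Rough, units are not identical.)
days_equivalent = NSA_RECORDS_2017 / at_30_percent

print(f"30% penetration:  {at_30_percent:,} calls per day")
print(f"100% penetration: {at_100_percent:,} calls per day")
print(f"NSA 2017 total is roughly {days_equivalent:.1f} days of 2010 calling")
```

Under these assumptions, 30 percent penetration gives 450 million calls per day, and 1.5 billion per day corresponds to every resident making five calls. Either way, a full year of reported records is on the order of a day or two of 2010 call volume.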

Perhaps the author of the article could provide more context, do a bit of digging to figure out why the number is what it is, and explain why these data are needed in the first place.

One can criticize the US government. But I want to know a bit more.

Net net: It seems that the NSA is showing quite a bit of focus or restraint in its collection activities. In the May 16 DarkCyber, I report the names of some of the companies manufacturing cell site simulators. These gizmos are an interesting approach to data collection. Some of the devices seem robust. To me, capturing 500 million calls seems well within the specifications of these devices.

But what do I know? I can see the vapor from a mine drainage ditch from my back window. Ah, Kentucky.

Stephen E Arnold, May 6, 2018
