Department of Defense: Learning from Social Media Posts

May 25, 2019

A solicitation request dated May 13, 2019, “A–Global Social Media Archive, 350 billion digital data records,” is an interesting public message. Analysis of social media has allegedly been a task within other projects handled by firms specializing in content analytics. These data mining efforts, based on DarkCyber’s understanding of open source information from specialist vendors, are nothing new. The solicitation offers some interesting insights that may warrant consideration.

First, the scope of the task is 350 billion digital records. It is not clear what constitutes a “digital record,” but the 350 billion figure represents about two or three months of Facebook posts. Nor is it clear whether the content comes from one service like Twitter or is drawn from a range of messaging and content sources.

Second, the content pool must include 60 languages. The most used languages on the public Internet are English, Chinese, and Spanish. The other 57 languages contribute a small volume of content, and this fact may create a challenge for the vendors responding to the solicitation. The document states:

Data includes messages from at least 200 million unique users in at least 100 countries, with no single country accounting for more than 30% of users.

Third, the text content and the metadata must be included in the content bundle.
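
The requirements above are concrete enough to be expressed as a simple check. The following is a minimal sketch, not anything taken from the solicitation itself: the function name, the summary inputs, and the reading of 350 billion records and 60 languages as minimums are all assumptions made for illustration.

```python
# Thresholds taken from the figures quoted in the post.
# Treating the record and language counts as minimums is an assumption.
MIN_RECORDS = 350_000_000_000
MIN_LANGUAGES = 60
MIN_USERS = 200_000_000
MIN_COUNTRIES = 100
MAX_COUNTRY_SHARE = 0.30


def check_archive(record_count, languages, users_by_country):
    """Return the names of the requirements a dataset summary fails.

    Hypothetical helper: `users_by_country` maps a country code to a
    count of unique users, assuming each user is counted in one country.
    """
    failures = []
    total_users = sum(users_by_country.values())
    countries_with_users = sum(1 for n in users_by_country.values() if n > 0)

    if record_count < MIN_RECORDS:
        failures.append("records")
    if len(set(languages)) < MIN_LANGUAGES:
        failures.append("languages")
    if total_users < MIN_USERS:
        failures.append("users")
    if countries_with_users < MIN_COUNTRIES:
        failures.append("countries")
    # No single country may account for more than 30% of users.
    if total_users and max(users_by_country.values()) / total_users > MAX_COUNTRY_SHARE:
        failures.append("country_share")
    return failures
```

A vendor whose user base is concentrated in one country would fail the last check even with enough users overall, which is one way the 30% cap could narrow the field of bidders.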

The exclusion of photographs and videos is interesting. These are important content mechanisms. Are commercial enterprises without connections to nation states that operate large-scale content aggregation systems likely to be able to comply? Worth watching to find out who lands this project.

Stephen E Arnold, May 25, 2019

Written by Stephen E. Arnold · Filed Under Government, News, Social Media

Stephen E. Arnold monitors search, content processing, text mining and related topics from his high-tech nerve center in rural Kentucky. He tries to winnow the goose feathers from the giblets. He works with colleagues worldwide to make this Web log useful to those who want to go "beyond search". Contact him at sa [at] arnoldit.com. His Web site with additional information about search is arnoldit.com.