Twitter and Mining Tweets

February 21, 2010

I must admit, I get confused. There is Twitter, TWIT (a podcast network), TWIST (a podcast from another me-too outfit), and “tweets”. If I am confused, imagine the challenge of processing and then analyzing short text messages.

Without context, a brief text message can be opaque to someone my age; for example, “r u thr”. Other messages say one thing, such as “at the place, 5”, and mean something else to an insider: “Mary’s parents are out of town. The party is at Mary’s house at 5 pm.”

When I read “Twitter’s Plan to Analyze 100 Billion Tweets”, several thoughts struck me:

  1. What took so long?
  2. Twitter is venturing into some tricky computational thickets. Analyzing tweets (the word given to 140-character messages sent via Twitter, not to be confused with “twits”, members of the TWIT podcast network) is not easy.
  3. Non-US law enforcement and intelligence professionals will be paying a bit more attention to the Twitter analyses because Twitter’s own outputs may be better, faster, and cheaper than setting up exotic tweet subsystems.
  4. Twitter makes clear that it has not analyzed its own data stream, which surprises me. I thought these young wizards were on top of data flows, not sitting back and just reacting to whatever happens.

According to the article, “Twitter is the nervous system of the Web.” This is a hypothesis, and I am not sure I buy that assertion. My view is that Google’s more diverse data flows are more useful. In fact, the metadata generated by observing flows within Buzz and Wave are potentially a leapfrog. Twitter is a bit like one of those Faith Popcorn-type projects. Sniffing is different from getting the rare sirloin in a three-star eatery in Lyon.

The write up points out that Twitter will use open source tools for the job. There are some juicy details of how Twitter will process the traffic.

A useful write up.

Stephen E Arnold, February 22, 2010

No one paid me to write this article. I will report non payment to the Department of Labor, where many are paid for every lick of work.
