Pingar Analyzes Tweets: PolySpot Integrates Them
May 29, 2012
Pingar set itself a challenge, described in its blog post, “Making Sense of Conversations on Twitter: Lessons Learned.” The business intelligence company wanted to test itself against some of the most nebulous unstructured data out there: data from social sites like Facebook and Twitter and, as an added test, data that organizations stash behind a firewall. The post shares challenges and lessons from the project. The write-up describes part of the process:
“First, we cleaned the tweets by removing all the duplicates, as thousands of re-tweets and spam tweets can negatively affect the results. From each tweet we removed URLs, hashtags, user names and stopwords such as RT, via, lol, lmao, while keeping the original copy for display later. Once all the tweets are cleaned and categorized into dates and sentiments, we applied the Pingar API Entity Extraction method to determine the keywords for the two sets of positive and negative tweets. The API returned two lists of keywords along with the keyword scores. Sometimes the same keyword appeared in both positive and negative list. In this case, we removed the keyword with the lower score from one of the lists.”
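The steps in that passage can be sketched in a few lines of Python. This is only an illustration of the description, not Pingar's code: the function names, the regular expression, and the tie-breaking rule are assumptions, and plain dictionaries of keyword scores stand in for whatever the Pingar API actually returns. Duplicates are detected here by comparing cleaned text, which is one plausible way to catch re-tweets.

```python
import re

# Noise tokens named as examples in the quoted post; a real list would be longer.
STOPWORDS = {"rt", "via", "lol", "lmao"}

# URLs, hashtags, and @user names, as listed in the quoted description.
NOISE = re.compile(r"https?://\S+|#\w+|@\w+")


def clean_tweet(text):
    """Strip URLs, hashtags, user names, and stopwords from one tweet."""
    stripped = NOISE.sub(" ", text)
    words = []
    for word in stripped.split():
        core = word.strip(".,:;!?")
        # Drop stopwords and punctuation-only leftovers such as a stray ":".
        if core and core.lower() not in STOPWORDS:
            words.append(word)
    return " ".join(words)


def dedupe_tweets(tweets):
    """Drop duplicates by cleaned text; keep (original, cleaned) pairs for display."""
    seen = set()
    kept = []
    for original in tweets:
        cleaned = clean_tweet(original)
        key = cleaned.lower()
        if key and key not in seen:
            seen.add(key)
            kept.append((original, cleaned))
    return kept


def resolve_overlap(positive, negative):
    """For a keyword in both score dicts, keep only the higher-scoring entry.

    Assumption: on a tie the keyword stays in the positive list; the post
    does not say how ties were handled.
    """
    pos, neg = dict(positive), dict(negative)
    for keyword in positive.keys() & negative.keys():
        if positive[keyword] < negative[keyword]:
            del pos[keyword]
        else:
            del neg[keyword]
    return pos, neg
```

For example, `dedupe_tweets` treats "Great phone http://t.co/a" and "RT @bob: great phone http://t.co/b" as the same tweet and keeps only the first, along with its original text for display.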
Naturally, though, context-free content remains a challenge. A demo of the Twitter results is available here.
Pingar is headquartered in New Zealand with offices in the US, Hong Kong, India, and the UK. Their roots are in research, and the company maintains ties with key universities, including the University of Waikato and the University of Swansea. Pingar API launched in 2011; the software is platform-agnostic and currently supports English and Chinese, with more languages on the way.
Our question: “When you have tweets, then what?” The answer is to use PolySpot’s insight-enabled infrastructure to make the data immediately accessible to users worldwide.
Cynthia Murrell, May 29, 2012
Sponsored by PolySpot