Data Companies Poised to Leverage Open Data
July 27, 2015
Support for open data, government datasets freely available to the public, has taken off in recent years; the federal government’s launch of Data.gov in 2009 is a prominent example. Naturally, some companies have sprung up to monetize this valuable resource. The New York Times reports, “Data Mining Start-Up Enigma to Expand Commercial Business.”
The article leads with a pro bono example of Enigma’s work: a project in New Orleans that uses that city’s open data to identify households most at risk for fire, so the city can give those folks free smoke detectors. The project illustrates the potential for good lurking in sets of open data. But make no mistake, the potential for profits is big, too. Reporter Steve Lohr explains:
“This new breed of open data companies represents the next step, pushing the applications into the commercial mainstream. Already, Enigma is working on projects with a handful of large corporations for analyzing business risks and fine-tuning supply chains — business that Enigma says generates millions of dollars in revenue.
“The four-year-old company has built up gradually, gathering and preparing thousands of government data sets to be searched, sifted and deployed in software applications. But Enigma is embarking on a sizable expansion, planning to nearly double its staff to 60 people by the end of the year. The growth will be fueled by a $28.2 million round of venture funding….
“The expansion will be mainly to pursue corporate business. Drew Conway, co-founder of DataKind, an organization that puts together volunteer teams of data scientists for humanitarian purposes, called Enigma ‘a first version of the potential commercialization of public data.’”
Other companies are getting into the game, too, leveraging open data in different ways. There’s Reonomy, which supplies research to the commercial real estate market. Seattle-based Socrata makes data-driven applications for government agencies. Information discovery company Dataminr uses open data in addition to Twitter’s stream to inform its clients’ decisions. Not surprisingly, Google is a contender with its Sidewalk Labs, which plumbs open data to improve city living through technology. Lohr insists, though, that Enigma is unique in the comprehensiveness of its data services. See the article for more on this innovative company.
Cynthia Murrell, July 27, 2015
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
Quality Peer Reviews Are More Subjective Than Real Science
July 16, 2015
Peer-reviewed journals are supposed to have an extra degree of authority, because a team of experts has read and critiqued each academic work. Science 2.0 points out in the article “Peer Review Is Subjective And The Quality Is Highly Variable” that peer-reviewed journals might not be worth their weight in opinions.
Peer reviews are supposed to be objective criticisms of work, but personal beliefs and political views are working their way into the process and have been for some time. That should not come as a surprise, since academia has been plagued by this problem for decades. It has been discussed before, but peer review problems tend to be brushed under the rug. In true academic fashion, someone is conducting a test to determine how reliable peer review comments are:
“A new paper on peer review discusses the weaknesses we all see – it is easy to hijack peer review when it is a volunteer effort that can drive out anyone who does not meet the political or cultural litmus test. Wikipedia is dominated by angry white men and climate science is dominated by different angry white men, but in both cases they were caught conspiring to block out anyone who dissented from their beliefs. Then there is the fluctuating nature of guidelines. Some peer review is lax if you are a member, like at the National Academy of Sciences, while the most prominent open access journal is really editorial review, where they check off four boxes and it may never go to peer review or require any data, especially if it matches the aesthetic self-identification of the editor or they don’t want to be yelled at on Twitter.”
The peer review problem is getting worse in the digital landscape. There are suggested solutions, such as banning all fees associated with academic journals and databases and homogenizing review criteria across fields, but the problems would still be far from corrected. Reviewers are paid to review works, which likely involves kickbacks of some kind. Also, getting different academic journals, much less different fields, to standardize anything will take a huge amount of effort, if they can come to any sort of agreement at all.
Fixing the review system will not be done quickly, and any time money is involved, the process is slowed even further. In short, academic journals are far from being objective, which is why it pays to do your own research and take everything with a grain of salt.
Whitney Grace, July 16, 2015
Twitter Gets a Search Facelift
June 25, 2015
Twitter has been experimenting with improving its search results, and according to TechCrunch, the upgrade comes via a new search results interface: “Twitter’s New Search Results Interface Expands To All Users.” The new search results interface is one of the largest updates Twitter has made in 2015. It is supposed to increase ease of use with a cleaner look and better filtering options. Users will now be able to filter search results by live tweets, photos, videos, news, accounts, and more.
Twitter made the update to help people better understand how to use the message service and to take a more active approach to using it, rather than passively reading other people’s tweets. The update is specifically targeted at new Twitter users.
The tweaked search interface will return tweets related to the search phrase or keyword, but that does not mean that the most popular tweets are returned:
“In some cases, the top search result isn’t necessarily the one with the higher metrics associated with it – but one that better matches what Twitter believes to be the searcher’s “intent.” For example, a search for “Steve Jobs” first displays a heavily-retweeted article about the movie’s trailer, but a search for “Mad Men” instead first displays a more relevant tweet ahead of the heavily-favorited “Mad Men” mention by singer Lorde.”
The new interface proves to be simpler and better lists trends, related users, and news. It does take a little while to finesse Twitter, which is a daunting task for new users. Twitter is not the most popular social network these days, and it is using these updates to increase its appeal.
Whitney Grace, June 25, 2015
Search Improvements at Twitter
June 18, 2015
Search hasn’t exactly been Twitter’s strong point in the past. Now we learn that the site is rolling out its new and improved search functionality to all (logged-in) users in TechCrunch’s article, “Twitter’s New Search Results Interface Expands to All Web Users.” Reporter Sarah Perez tells us:
“Twitter is now rolling out a new search results interface to all logged-in users on the web, introducing a cleaner look-and-feel and more filtering options that let you sort results by top tweets, ‘live’ tweets, accounts, photos, videos, news and more. The rollout follows tests that began in April which then made the new interface available to a ‘small group’ of Twitter users the company had said at the time. The updated interface is one of the larger updates Twitter’s search engine has seen in recent months, and it’s meant to make the search interface itself easier to use in terms of switching between tweets, accounts, photos and videos.”
Twitter has been working on other features meant to make the site easier to use. For example, the revamped landing page will track news stories in specified categories. Users can also access the latest updates through the “instant timeline” or “while you were away” features. The article supplies a few search-interface before-and-after screenshots. Naturally, Twitter promises to continue improving the feature.
Cynthia Murrell, June 18, 2015
Emojis Spur Ancient Language Practices
May 12, 2015
Emojis, different from their cousin emoticons, are a standard in Internet jargon and are still resisted by most who grew up in a world sans instant connection. Mike Isaac, who writes the New York Times Bits blog, tried his best to resist the urge to use a colon and parentheses to express his mood. Isaac’s post “The Rise Of Emoji On Instagram Is Causing Language Repercussions” discusses the rise of the emoji language.
Emojis are quickly replacing English abbreviations, such as LOL and TTYL. People are finding it easier to select a smiley face picture than to type text. Isaac points to how users of social media platforms like Facebook, Twitter, Instagram, and Snapchat are relying more on these pictograms for communication. Instagram’s Thomas Dimson mentioned we are watching the rise of a new language.
People string emojis together to form complete sentences and sentiments. Snapchat and Instagram rely on pictures as their main content, which in turn serves as communication.
“Instagram itself is a means of expression that does not require the use of words. The app’s meteoric rise has largely been attributed to the power of images, the ease that comes, for instance, in looking at a photo of a sunset rather than reading a description of one. Other companies, like Snapchat, have also risen to fame and popularity through the expressive power of images.”
Facebook and Twitter are pushing more images and videos on their own platforms. It is a rudimentary form of communication, but it harkens back to the days of cave paintings. People are drawn to images because their basic meaning is easy to interpret and they do not have a language barrier. A picture of a dog is still the same in Spanish or English. The only problem with using emojis is actually understanding the meaning behind them. A smiley face is easy to interpret, but a dolphin, baseball glove, and maple leaf might need some words for clarification.
Isaac concludes that one of the reasons he resisted emojis so much was that they made him feel childish, so he reserved them for his close friends and family. The term “childish” is subjective, just like the meaning of emojis, so as they become more widely adopted they will become more accepted.
Whitney Grace, May 12, 2015
Social Network Demographics by the Numbers
April 23, 2015
The number of social networking Web sites and their purposes are as diverse as the human population. Arguably, if you were to use each of the most popular networks and try to keep on top of every piece of information that filters through the feed, one twenty-four hour day would not be enough.
With social media becoming more ingrained in daily life, it makes one wonder who is using what network and for what purpose. Business Insider discusses a recent BI Intelligence report about social media demographics in the article: “Revealed: A Breakdown Of The Demographics For Each Of The Social Networks.” Here are some of the facts: Facebook is still mostly female and remains the top network. Twitter leans more heavily toward the male demographic, while YouTube reaches more adults in the 18-34 demographic than cable TV. Instagram is considered the most important of teenage social networks, but Snapchat has the widest appeal amongst the younger crowd. This is the most important finding for professionals:
“LinkedIn is actually more popular than Twitter among U.S. adults. LinkedIn’s core demographic are those aged between 30 and 49, i.e. those in the prime of their career-rising years. Not surprisingly, LinkedIn also has a pronounced skew toward well-educated users.”
Facebook still reigns supreme and pictures are popular with the younger set, while professionals all tend to commingle in their LinkedIn area. Not surprising and not especially revealing information, but still interesting for the data junkie. We wonder how social media will change in the coming year.
Whitney Grace, April 23, 2015
Twitter Plays Hard Ball or DataSift Knows the End Is in Sight
April 11, 2015
I read “Twitter Ends its Partnership with DataSift – Firehose Access Expires on August 13, 2015.” DataSift supports a number of competitive and other intelligence services with its authorized Twitter stream. The write up says:
DataSift’s customers will be able to access Twitter’s firehose of data as normal until August 13th, 2015. After that date all the customers will need to transition to other providers to receive Twitter data. This is an extremely disappointing result to us and the ecosystem of companies we have helped to build solutions around Twitter data.
I found this interesting. Plan now or lose that authorized firehose. Perhaps Twitter wants more money? On the other hand, maybe DataSift realizes that for some intelligence tasks, Facebook is where the money is. Twitter is a noise machine. Facebook, despite its flaws, is anchored in humans, but the noise is increasing. Some content processes become more tricky with each business twist and turn.
Stephen E Arnold, April 11, 2015
Twitter Search: Well, Sort Of
April 9, 2015
I read “Updating Trends on Mobile.” I am more interested in more detailed information about Twitter content, users, and tags. General purpose or massified outputs are of little utility in my little world.
I noted this passage:
We’ve been working to make content easier to find over the last several months in places like your home timeline – with recaps and Tweets from within your network – and through efforts like MagicRecs. We’ll continue to make improvements like these in the future.
If you navigate to the Twitter search page and enter a string like “enterprise search”, you will see variants of the term or phrase expressed as Twitter hash tags. The trends displayed were reflective of what Twitter’s logs suggest is hot. Here’s an example:
How many of these trends do you recognize? I knew about iOS 8.3, Apple Watch, and not much else.
Queries for tweets remain a bit problematic for me.
Stephen E Arnold, April 9, 2015
Tweets Reveal Patterns of Support or Opposition for ISIL
March 31, 2015
Once again, data analysis is being put to good use. MIT Technology Review describes how “Twitter Data Mining Reveals the Origins of Support for the Islamic State.” A research team led by one Walid Magdy at the Qatar Computing Research Institute studied tweets regarding the “Islamic State” (also known as ISIS, ISIL, or just IS) to discern any patterns that tell us which people choose to join such an organization and why.
See the article for a detailed description of the researchers’ methodology. Interesting observations involve use of the group’s name and tweet timing. Supporters tended to use the whole, official name (the “Islamic State in Iraq and the Levant” is perhaps the most accurate translation), while most opposing tweets didn’t bother, using the abbreviation. They also found that tweets criticizing ISIS surge right after the group has done something terrible, while supporters tended to tweet after a propaganda video was released or the group achieved a major military victory. Other indicators of sentiment were identified, and an algorithm created. The article reveals:
“Magdy and co trained a machine learning algorithm to spot users of both types and said it was able to classify other users as likely to become pro- or anti-ISIS with high accuracy. ‘We train a classifier that can predict future support or opposition of ISIS with 87 percent accuracy,’ they say….
“That is interesting research that reveals the complexity of the forces at work in determining support or opposition to movements like ISIS—why people like [Egypt’s] Ahmed Al-Darawy end up dying on the battlefield. A better understanding of these forces is surely a step forward in finding solutions to the tangled web that exists in this part of the world.
“However, it is worth ending on a note of caution. The ability to classify people as potential supporters of ISIS raises the dangerous prospect of a kind of thought police, like that depicted in films like Minority Report. Clearly, much thought must be given to the way this kind of information should be used.”
Clearly. (Though the writer seems unaware that the term “thought police” originated with Orwell’s Nineteen Eighty-Four, the reference to Minority Report shows he or she understands the concept. But I digress.) Still, trying to understand why people turn to violence and helping to mitigate their circumstances before they get there seems worth a try. Better than bombs, in my humble opinion, and perhaps longer-lasting.
Cynthia Murrell, March 31, 2015
Stephen E Arnold, Publisher of CyberOSINT at www.xenky.com
Partnership Between Twitter and IBM Showing Results
March 27, 2015
The article on TechWorld titled “IBM Boosts BlueMix and Watson Analytics with Twitter Integration” investigates the fruits of the partnership between IBM and Twitter, which began in 2014. IBM Bluemix now has Twitter available as one of the services in the cloud-based developer environment. Watson Analytics will also be integrated with Twitter for the creation of visualizations. Developers will be able to grab data from Twitter for better insights into patterns and relationships.
“The Twitter data is available as part of that service so if I wanted to, for example, understand the relationship between a hashtag on pizza, burgers or tofu, I can go into the service, enter the hashtag and specify a date range,” said Rennie. “We [IBM] go out, gather information and essentially calculate what is the sentiment against those tags, what is the split by location, by gender, by retweets, and put it into a format whereby you can immediately do visualisation.”
From the beginning of the partnership, Twitter gave IBM access to its data and the go-ahead to use Twitter with the cloud-based developer tools. Watson looks like a catch-all for data, and the CMO of Brandwatch, Will McInnes, suggests that Twitter is only the beginning. The potential of data from social media is a vast and constantly rearranging field.
Chelsea Kerwin, March 27, 2015