Protected: Ikanow: Creating Pathways through Information
March 1, 2012
Hadoop Technology: Calling All Mathematicians!
February 26, 2012
Scalability and big data solutions are not simply buzzwords thrown around the search industry. Both are key factors in assessing the value of platforms, and both are key reasons users are drawn to Hadoop technology.
However, the fact that Hadoop is picking up steam poses a major problem for those trying to find talent to work with the technology. People experienced in Hadoop are hard to come by. Cloudera, IBM, Hortonworks, and MapR are all investing in Hadoop training programs, choosing to develop internal candidates rather than trying to hire new talent. A related article, “Hadoop Wins Over Enterprise IT, Spurs Talent Crunch,” asserts:
‘We originally thought we needed to find a hardcore Java developer,’ Return Path’s Sautins says. But in reality, the talent that’s best suited for working with Hadoop isn’t necessarily a Java engineer. ‘It’s somebody who can understand what’s going on in the cluster, is interested in picking up some of these tools and figuring out how they work together, and can deal with the fact that pretty much everything in the Hadoop ecosystem is not even a 1.0 release yet,’ Sautins says. ‘That’s a real skill set.’
The problem of finding talent could eventually limit the continued adoption of Hadoop technology. Search analytics is now opening doors for those with deep math skills and backgrounds in statistics and science. People with these basic skills can be taught how to use these tools, and will be very valuable to a great number of companies adopting this technology.
Andrea Hayden, February 26, 2012
Sponsored by
Wolfram Alpha Pro Now Available
February 23, 2012
Wolfram|Alpha continues to make changes to build its user base and traffic. The computational engine’s blog post is titled “Announcing Wolfram|Alpha Pro.” As with so many companies that offer something for nothing, you do have to pay for the full Pro version, just like the good old days of Dialog and SDC Orbit. There is, though, a trial subscription available. Stephen Wolfram writes:
We’ve been able to go a remarkably long way with the basic paradigm of ordinary Wolfram|Alpha. But now Wolfram|Alpha Pro dramatically extends this paradigm—and it’s going to be exciting to see all the new things that become conceivable. But for now, I hope that as many people as possible will use Wolfram|Alpha Pro, and will take advantage of the largest single step in the development of Wolfram|Alpha since it was first launched.
The expanded capabilities include a number of different features. For one, you can set preferences to make the engine more efficient for your needs. You can also now download the raw data behind any query. My favorite, though, is the ability to upload, or point to a URL for, an image for analysis. The same can be done with blocks of data in any format. See the write up for more details on Wolfram|Alpha’s new abilities. It is well worth checking out. A beanie with a propeller may be required for some query formulations, however.
We wonder, will Google embrace this approach, offering its products free with ads but charging a fee for “value added” service?
Cynthia Murrell, February 23, 2012
Semantics Fuel Need for Analytics
February 22, 2012
Here’s a different approach to the “next big thing.” Network Computing insists, “Semantic Technology Key to Mastering Data Growth, Analysis.” The article examines the recent InformationWeek report titled Database Discontent.
It used to be that data analysis parameters were defined manually. However, says the report’s co-author David Read, that is becoming less and less feasible. Writer Chris Talbot explains:
With the significant depth and breadth of data contained inside and outside the enterprise, in addition to the high volume of transactions that are continually generating more data, there is no reasonable way for people to know where to look when seeking out actionable knowledge, Read said. Predictive analytics will likely outpace reporting and traditional business intelligence efforts in the future, and they will be used to inform SMEs [Subject Matter Experts] about where to invest their business intelligence efforts, he added.
SQL systems are fine for analyzing uniform data, he adds, but not the growing mounds of unstructured data. The report sees semantic technology as the answer to the problem. Talbot notes that these tools have both improved and come down in price over the last few years. The way things are going, that’s a very good thing.
Cynthia Murrell, February 22, 2012
Palantir Applies Lipstick, Much Lipstick
February 16, 2012
I had three people send me a link to the Washingtonian article “Killer App.” On the surface, the write up is about search and content processing, predictive analytics, and the value of these next generation solutions. Underneath the surface, I see more of a public relations piece, but that’s just my opinion.
Let me point out that the article was more of a political write up than a technology article. Palantir, in my opinion, has been pounding the pavement, taking journalists to Starbucks, and working overtime. The effort is understandable. In 2010 and 2011, Palantir was involved in a dispute with i2 Group, now a unit of IBM, about intellectual property. The case was resolved and the terms of the settlement were not revealed. I know zero about the legal hassles but I did pick up some information that suggested the i2 Group was not pleased with Palantir’s ability to parse Analyst’s Notebook file types.
I steered clear of the hassle because in the past I have done work for i2 Ltd., the predecessor to the i2 Group. I know that the file structure was a closely held and highly prized chunk of information. At any rate, the dust is now settling, and any company with some common sense would be telling its story to anyone who will listen. Palantir has a large number of smart people and significant funding. Therefore, getting publicity to support marketing is a standard business practice.
Now what’s with the Washingtonian article? First, the Washingtonian is a consumer publication aimed at the affluent, socially aware folks who live in the District, Maryland, and Virginia. The story kicks off with a description of Palantir’s system which can parse disparate information and make sense of items which would be otherwise lost in the flood of data rushing through most organizations today. The article said:
To conduct what became known as Operation Fallen Hero, investigators turned to a little-known Silicon Valley software company called Palantir Technologies. Palantir’s expertise is in finding connections among people, places, and events in large repositories of electronic data. Federal agents had amassed a trove of reporting on the drug cartels, their members, their funding mechanisms and smuggling routes.
Then the leap:
Officials were so impressed with Palantir’s software that seven months later they bought licenses for 1,150 investigators and analysts across the country. The total price, including training, was $7.5 million a year. The government chose not to seek a bid from some of Palantir’s competitors because, officials said, analysts had already tried three products and each “failed to provide the necessary comprehensive solution on missions where our agents risk life and limb.” As far as Washington was concerned, only Palantir would do. Such an endorsement would be remarkable if it were unique. But over the past three years, Palantir, whose Washington office in Tysons Corner is just six miles from the CIA’s headquarters, has become a darling of the US law-enforcement and national-security establishment. Other agencies now use Palantir for some variation on the challenge that bedeviled analysts in Operation Fallen Hero—how to organize and catalog intimidating amounts of data and then find meaningful insights that humans alone usually can’t.
Sounds good. The only issue is that there are a number of companies delivering this type of solution. The competitors range from vendors of SharePoint add ins to In-Q-Tel funded Digital Reasoning to JackBe, a mash up and fusion outfit in Silver Spring, Maryland. Even Google is in the game via its backing of Recorded Future, a company which asserts that it can predict what will happen. There are quite sophisticated services provided by low profile SAIC and SRA International. I would toss in my former employers Halliburton and Booz, Allen & Hamilton, but these firms are not limited to one particular government solution. Bottom line: There are quite a few heavy hitters in this market space. Many of them outpace Palantir’s technology and Palantir’s business methods, in my opinion.
In short, Palantir is a relative newcomer in a field of superstar technology companies. In my opinion, the companies providing predictive solutions and data fusion systems are like the NFL Pro Bowl selections. Palantir is a player, and, in my opinion, a firm which operates at a competitive level. However, Palantir is not the quarterback of the winning team.
From my viewpoint in Harrod’s Creek, the Washingtonian writes about Palantir without providing substantive context. In-Q-Tel funds many organizations and has taken heat because many of these firms’ solutions are stand alone systems. Integration without legal blowback is important. Firms which end up in messy litigation increase security risks; they do not reduce security risks. Short cuts are not unknown in Washington political circles. It is important to work with companies which demonstrate high value behaviors, avoid political and legal mud fights, and deliver value over time.
The Washingtonian article tells an interesting story, but it is a bit like a short story. Reality has been shaped, I believe. Palantir is presented out of context, and I think that the article is interesting for three reasons:
- What it asserts about a company which is one of a number of firms providing next generation intelligence solutions
- The magazine itself which presented a story which reminded me of a television late night advertorial
- The political agenda which reveals something about Washington journalism.
In short, a quite good example of 21st century “real” journalism. That lipstick looks good. Does it contain lead?
Stephen E Arnold, February 16, 2012
Lexalytics and Document Summarization
February 15, 2012
No humans required, or that’s the premise.
Lexalytics, which is best known for its text analysis engine, highlights its text summarization tool. According to Lexalytics:
Summarization is an algorithmic shortening of the input content so as to best represent the whole content in a limited amount of words.
It all starts at the sentence level. The application picks out the most important or representative sentences within the content and uses them for the summary. Lexical chaining does the actual choosing of the representative sentences. The company asserts that
“Lexical Chaining relates sentences via thesaurally-related noun” and, regardless of where the sentences appear in the text, if their nouns are related to each other, the sentences can be lexically related. In other words, the longest chain represents the best content, and the first sentence of this chain will be the first sentence of the summary. The same procedure is repeated for the second-longest chain, and so on. This is definitely a “chain reaction.”
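The chain-and-pick procedure described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions, not Lexalytics' implementation: the `TOY_THESAURUS` lookup, the naive sentence splitter, and the function names are all hypothetical stand-ins for a real thesaurus and noun tagger.

```python
import re

# Hypothetical toy thesaurus (an assumption for this sketch): each noun
# maps to a shared concept label. A real system would use a full thesaurus.
TOY_THESAURUS = {
    "car": "vehicle", "truck": "vehicle", "vehicle": "vehicle",
    "rain": "weather", "storm": "weather", "weather": "weather",
}


def split_sentences(text):
    """Naively split text after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def build_chains(sentences, thesaurus):
    """Map each concept to the ordered indexes of sentences that mention it."""
    chains = {}
    for index, sentence in enumerate(sentences):
        words = re.findall(r"[a-z]+", sentence.lower())
        concepts = {thesaurus[w] for w in words if w in thesaurus}
        for concept in concepts:
            chains.setdefault(concept, []).append(index)
    return chains


def summarize(text, thesaurus=TOY_THESAURUS, max_sentences=2):
    """Take the first sentence of the longest chain, then the next longest."""
    sentences = split_sentences(text)
    chains = build_chains(sentences, thesaurus)
    # Longest chain first; ties go to the chain that starts earlier in the text.
    ranked = sorted(chains.values(), key=lambda chain: (-len(chain), chain[0]))
    picked = []
    for chain in ranked:
        if len(picked) == max_sentences:
            break
        if chain[0] not in picked:
            picked.append(chain[0])
    return [sentences[i] for i in picked]


sample = (
    "The storm arrived at noon. A car stalled on the bridge. "
    "The truck behind it stopped. Another vehicle slid in the rain."
)
summary = summarize(sample)
```

In the sample, the "vehicle" chain spans three sentences and the "weather" chain two, so the summary opens with the first vehicle sentence and follows with the first weather sentence.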
April Holmes, February 15, 2012
Hadoop Vendors On the Rise
February 13, 2012
InformationWeek offers the interesting article “12 Hadoop Vendors to Watch In 2012.” Hadoop is a favorite in the business intelligence world “thanks to its combination of low cost, scalability, and flexibility to handle any data without building predefined schemas.”
Business intelligence vendors are counting on Hadoop to help not only with data processing but also with data analysis. The article mentions several notable companies. Cloudera is no surprise; it is “the oldest and largest Hadoop software and services provider.”
EMC and Microsoft are two surprising vendors with Hadoop connections noted in the article. Datameer is another notable vendor building steam and you can read more about them here. It is an interesting list; however, it comes as a big surprise that Digital Reasoning was left off, which is a huge oversight for many reasons, in my opinion. The vendors on the list couldn’t be more different, but data analytics bridges the gap. It’s definitely “the next big thing.”
Stephen E Arnold, February 13, 2012
Politicians Try to Surf on Social Media
February 12, 2012
Is this a new type of polling or is it social trolling? Attensity’s blog reports, “Politico Uses Attensity to Analyze SOPA Sentiment.” Attensity took on Politico’s challenge to mine social media for attitudes on the Stop Online Piracy Act. It turns out that people who spend a lot of time online skew heavily against the law. Go figure.
Author James Purchase writes:
If I had to directly summarize this analysis, I would say that the SOPA-opposition is significantly more organized and vocal in using Social Media to make their point. Whether or not the social media outcry affects the outcome of the legislation remains to be seen.
Perhaps, though I hope the uproar against the law has reached the ears of even the most tech-averse legislators. They have interns, right? Some are awkward too. Wipe out!
Cynthia Murrell, February 12, 2012
Linguamatics Embraces Informatics
February 9, 2012
Fierce Biotech IT announces, “EU Program Backs Linguamatics and ChemAxon’s Informatics Work.” The European Union’s Eurostars Program grants research and development funding to small and medium companies.
The project being funded is, according to the companies, the first interactive text-mining system specifically for chemistry research. Writer Ryan McBride elaborates:
The companies say that pharma and biotech outfits are expected to be the main customers for the technology. With this tool, ChemAxon and Linguamatics want drug companies or other users to be able to do chemical evaluations, hunt for new chemicals, get structure visualizations in searches and ‘explore image to structure conversion,’ according to the companies’ press release.
More personalized medical research is expected to be one application of the system. That sounds promising.
ChemAxon serves the biotechnology and pharmaceutical fields worldwide, providing chemical software development platforms as well as desktop applications.
Linguamatics bases its data management solutions on natural language processing technology. I2E is the company’s flagship text mining software, also available in the cloud as I2E OnDemand.
Cynthia Murrell, February 9, 2012
Inteltrax: Top Stories, January 30 to February 3, 2012
February 6, 2012
Inteltrax, the data fusion and business intelligence information service, captured three key stories germane to search this week, specifically, how governments are embracing and utilizing big data analytics, especially during this early stage in the 2012 political cycle.
We got a good overall look at the issue from the story “Government Healthcare and Analytics Make a Good Team,” which showed how, as the title implies, this pairing is making some impressive waves in the world.
Another story, “Social Media and Politics Share Big Data Love” showed us how Ron Paul and others have utilized social media to get a better take on the issues.
Finally, the most promising of our stories, “Government Grows Into Big Data Workhorse” shows how governments around the globe could kick start a big data revolution.
Analytics and big data are growing by leaps and bounds. It seems that government can be the field’s best friend, and often tries to be. We’re going to keep chronicling this partnership, because we sense big things on the horizon.
Follow the Inteltrax news stream by visiting
Patrick Roland, Editor, Inteltrax, February 6, 2012