Semantics Fuel Need for Analytics

February 22, 2012

Here’s a different approach to the “next big thing.” Network Computing insists, “Semantic Technology Key to Mastering Data Growth, Analysis.”  The article examines the recent InformationWeek report titled Database Discontent.

It used to be that data analysis parameters were defined manually. However, says the report’s co-author David Read, that is becoming less and less feasible. Writer Chris Talbot explains:

With the significant depth and breadth of data contained inside and outside the enterprise, in addition to the high volume of transactions that are continually generating more data, there is no reasonable way for people to know where to look when seeking out actionable knowledge, Read said. Predictive analytics will likely outpace reporting and traditional business intelligence efforts in the future, and they will be used to inform SMEs [Subject Matter Experts] about where to invest their business intelligence efforts, he added.

SQL systems are fine for analyzing uniform data, he adds, but not the growing mounds of unstructured data. The report sees semantic technology as the answer to the problem. Talbot notes that these tools have both improved and come down in price over the last few years. The way things are going, that’s a very good thing.

Cynthia Murrell, February 22, 2012

Sponsored by Pandia.com

Palantir Applies Lipstick, Much Lipstick

February 16, 2012

I had three people send me a link to the Washingtonian article “Killer App.” On the surface, the write up is about search and content processing, predictive analytics, and the value of these next generation solutions. Underneath the surface, I see more of a public relations piece, but that’s just my opinion.

Let me point out that the article was more of a political write up than a technology article. Palantir, in my opinion, has been pounding the pavement, taking journalists to Starbucks, and working overtime. The effort is understandable. In 2010 and 2011, Palantir was involved in a dispute with i2 Group, now a unit of IBM, about intellectual property. The case was resolved and the terms of the settlement were not revealed. I know zero about the legal hassles, but I did pick up some information that suggested the i2 Group was not pleased with Palantir’s ability to parse Analyst’s Notebook file types.

I steered clear of the hassle because in the past I have done work for i2 Ltd., the predecessor to the i2 Group. I know that the file structure was a closely held and highly prized chunk of information. At any rate, the dust is now settling, and any company with some common sense would be telling its story to anyone who will listen. Palantir has a large number of smart people and significant funding. Therefore, getting publicity to support marketing is a standard business practice.

Now what’s with the Washingtonian article? First, the Washingtonian is a consumer publication aimed at the affluent, socially aware folks who live in the District, Maryland, and Virginia. The story kicks off with a description of Palantir’s system, which can parse disparate information and make sense of items which would otherwise be lost in the flood of data rushing through most organizations today. The article said:

To conduct what became known as Operation Fallen Hero, investigators turned to a little-known Silicon Valley software company called Palantir Technologies. Palantir’s expertise is in finding connections among people, places, and events in large repositories of electronic data. Federal agents had amassed a trove of reporting on the drug cartels, their members, their funding mechanisms and smuggling routes.

Then the leap:

Officials were so impressed with Palantir’s software that seven months later they bought licenses for 1,150 investigators and analysts across the country. The total price, including training, was $7.5 million a year. The government chose not to seek a bid from some of Palantir’s competitors because, officials said, analysts had already tried three products and each “failed to provide the necessary comprehensive solution on missions where our agents risk life and limb.” As far as Washington was concerned, only Palantir would do. Such an endorsement would be remarkable if it were unique. But over the past three years, Palantir, whose Washington office in Tysons Corner is just six miles from the CIA’s headquarters, has become a darling of the US law-enforcement and national-security establishment. Other agencies now use Palantir for some variation on the challenge that bedeviled analysts in Operation Fallen Hero—how to organize and catalog intimidating amounts of data and then find meaningful insights that humans alone usually can’t.

Sounds good. The only issue is that there are a number of companies delivering this type of solution. The competitors range from vendors of SharePoint add ins to In-Q-Tel funded Digital Reasoning to JackBe, a mash up and fusion outfit in Silver Spring, Maryland. Even Google is in the game via its backing of Recorded Future, a company which asserts that it can predict what will happen. There are quite sophisticated services provided by low profile SAIC and SRA International. I would toss in my former employers Halliburton and Booz, Allen & Hamilton, but these firms are not limited to one particular government solution. Bottom line: There are quite a few heavy hitters in this market space. Many of them outpace Palantir’s technology and Palantir’s business methods, in my opinion.

In short, Palantir is a relative newcomer in a field of superstar technology companies. In my opinion, the companies providing predictive solutions and data fusion systems are like the NFL Pro Bowl selections. Palantir is a player, and, in my opinion, a firm which operates at a competitive level. However, Palantir is not the quarterback of the winning team.

From my viewpoint in Harrod’s Creek, the Washingtonian writes about Palantir without providing substantive context. In-Q-Tel funds many organizations and has taken heat because many of these firms’ solutions are stand alone systems. Integration without legal blowback is important. Firms which end up in messy litigation increase security risks; they do not reduce security risks. Short cuts are not unknown in Washington political circles. It is important to work with companies which demonstrate high value behaviors, avoid political and legal mud fights, and deliver value over time.

The Washingtonian article tells an interesting story, but it is a bit like a short story. Reality has been shaped, I believe. Palantir is presented out of context, and I think that the article is interesting for three reasons:

  1. What it asserts about a company which is one of a number of firms providing next generation intelligence solutions
  2. The magazine itself which presented a story which reminded me of a television late night advertorial
  3. The political agenda which reveals something about Washington journalism.

In short, a quite good example of 21st century “real” journalism. That lipstick looks good. Does it contain lead?

Stephen E Arnold, February 16, 2012

Lexalytics and Document Summarization

February 15, 2012

No humans required, or that’s the premise.

Lexalytics, which is best known for its text analysis engine, highlights its text summarization tool. According to Lexalytics:

Summarization is an algorithmic shortening of the input content so as to best represent the whole content in a limited amount of words.

It all starts at the sentence level. The application is able to pick out the most important or representative sentences within the content and use them for the summary. Lexical chaining is involved in the actual choosing of the representative sentences. The company asserts that

“Lexical Chaining relates sentences via thesaurally-related noun” and, regardless of where the sentences appear in the text, if their nouns are related to each other the sentences can be lexically related. In other words, the longest chain represents the best content, and the first sentence of this chain will be the first sentence of the summary. The same procedure is done for the second-longest chain and so on. This is definitely a “chain reaction.”
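For readers who want to see the idea in motion, here is a minimal sketch of chain-based sentence selection. It is not Lexalytics’ implementation. The tiny `THESAURUS` table, the function names, and the choice to measure chain length by the number of sentences in a chain are all assumptions made for illustration. A real system would use a full thesaurus and a part-of-speech tagger to find the nouns.

```python
import re
from collections import defaultdict

# Hypothetical mini-thesaurus (an assumption for this sketch): each noun
# maps to a concept label, so "car" and "truck" count as related nouns.
THESAURUS = {
    "car": "vehicle", "truck": "vehicle", "automobile": "vehicle",
    "bank": "finance", "loan": "finance", "money": "finance",
}


def split_sentences(text):
    """Naive splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def build_chains(sentences):
    """Map each concept to the ordered indices of sentences mentioning it."""
    chains = defaultdict(list)
    for idx, sentence in enumerate(sentences):
        words = re.findall(r"[a-z]+", sentence.lower())
        concepts = {THESAURUS[w] for w in words if w in THESAURUS}
        for concept in concepts:
            chains[concept].append(idx)
    return chains


def summarize(text, max_sentences=2):
    """Take the first sentence of the longest chain, then the next longest."""
    sentences = split_sentences(text)
    chains = build_chains(sentences)
    # Longest chain first; ties go to the chain that starts earliest.
    ranked = sorted(chains.values(), key=lambda c: (-len(c), c[0]))
    picked = []
    for chain in ranked:
        first = chain[0]
        if first not in picked:
            picked.append(first)
        if len(picked) == max_sentences:
            break
    return [sentences[i] for i in picked]


SAMPLE = (
    "The bank approved a loan. A car was parked outside. "
    "The money arrived on Monday. The weather was mild. "
    "The loan paid for the truck."
)

if __name__ == "__main__":
    for line in summarize(SAMPLE):
        print(line)
```

In the sample, the finance chain runs through three sentences and the vehicle chain through two, so the summary opens with the first finance sentence and follows with the first vehicle sentence. The sentence about the weather belongs to no chain and never makes the cut.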

April Holmes, February 15, 2012

Hadoop Vendors On the Rise

February 13, 2012

InformationWeek offers the interesting article “12 Hadoop Vendors to Watch In 2012.” Hadoop is a favorite in the business intelligence world “thanks to its combination of low cost, scalability, and flexibility to handle any data without building predefined schemas.”

Business intelligence vendors are counting on Hadoop to help not only with data processing but also with data analysis. The article mentions several notable companies. Cloudera is not surprising; it is “the oldest and largest Hadoop software and services provider.”

EMC and Microsoft are two surprising vendors with Hadoop connections noted in the article. Datameer is another notable vendor building steam and you can read more about them here. It is an interesting list; however, it comes as a big surprise that Digital Reasoning was left off, which is a huge oversight for so many reasons, in my opinion. The vendors on the list couldn’t be more different, but data analytics bridges the gap. It’s definitely “the next big thing.”

Stephen E Arnold, February 13, 2012

Politicians Try to Surf on Social Media

February 12, 2012

Is this a new type of polling or is it social trolling? Attensity’s blog reports, “Politico Uses Attensity to Analyze SOPA Sentiment.” Attensity took on Politico’s challenge to mine social media for attitudes on the Stop Online Piracy Act. It turns out that people who spend a lot of time online skew heavily against the law. Go figure.

Author James Purchase writes:

If I had to directly summarize this analysis, I would say that the SOPA-opposition is significantly more organized and vocal in using Social Media to make their point. Whether or not the social media outcry affects the outcome of the legislation remains to be seen.

Perhaps, though I hope the uproar against the law has reached the ears of even the most tech-averse legislators. They have interns, right? Some are awkward too. Wipe out!

Cynthia Murrell, February 12, 2012

Linguamatics Embraces Informatics

February 9, 2012

Fierce Biotech IT announces, “EU Program Backs Linguamatics and ChemAxon’s Informatics Work.” The European Union’s Eurostars Program grants research and development funding to small and medium companies.

The project being funded is, according to the companies, the first interactive text-mining system specifically for chemistry research. Writer Ryan McBride elaborates:

The companies say that pharma and biotech outfits are expected to be the main customers for the technology. With this tool, ChemAxon and Linguamatics want drug companies or other users to be able to do chemical evaluations, hunt for new chemicals, get structure visualizations in searches and ‘explore image to structure conversion,’ according to the companies’ press release.

More personalized medical research is expected to be one application of the system. That sounds promising.

ChemAxon serves the biotechnology and pharmaceutical fields worldwide, providing chemical software development platforms as well as desktop applications.

Linguamatics bases its data management solutions on natural language processing technology. I2E is the company’s flagship text mining software, also available in the cloud as I2E OnDemand.

Cynthia Murrell, February 9, 2012

Inteltrax: Top Stories, January 30 to February 3, 2012

February 6, 2012

Inteltrax, the data fusion and business intelligence information service, captured three key stories germane to search this week, specifically, how governments are embracing and utilizing big data analytics, especially during this early stage in the 2012 political cycle.

We got a good overall look at the issue from the story “Government Healthcare and Analytics Make a Good Team,” which showed how, as the title implies, this pairing is making some impressive waves in the world.

Another story, “Social Media and Politics Share Big Data Love,” showed us how Ron Paul and others have utilized social media to get a better take on the issues.

Finally, the most promising of our stories, “Government Grows Into Big Data Workhorse,” shows how governments around the globe could kick start a big data revolution.

Analytics and big data are growing by leaps and bounds. It seems as if government can be their best friend and often tries to be so. We’re going to keep chronicling this partnership, because we sense big things on the horizon.

Follow the Inteltrax news stream by visiting www.inteltrax.com.

Patrick Roland, Editor, Inteltrax, February 6, 2012

Semantic Wranglers to Tame Media Content

February 6, 2012

When the prolificacy of the media scape overwhelms, it is semantic technology to the rescue. So declares ReadWriteWeb in “Semantic Tech the Key to Finding Meaning in the Media.” Writer Chris Lamb maintains that today’s deluges of information have made attention span the prize, and delivering relevancy the key. Strategies have included tapping readers’ social graphs, profiles, and preferences to filter news content. Lamb writes:

These current approaches are doomed. With respect to social graph curation, people have different roles at during different times. On the weekend, a reader might be interested in arts, entertainment and sports news based on a friends and family. During the week, this same person may be interested in business news based on recommendations from trading partners in the capital markets. How do readers seamlessly reconcile this?

Lamb doesn’t have the answer, but says he does know what technologies will underlie the eventual solutions: tagging, semantic extraction, disambiguation, and linked data structures (including cloud data). See the write up for more on the reasoning behind each.

Semantic technology can perform useful functions. Rich media pose some special challenges. Among them are the issues of data volume and available processing power, latency, and variability in indexable content. What about a silent movie? What about a program which features interviews with individuals with a substance abuse problem who speak colloquially with a mumble?

Cynthia Murrell, February 6, 2012

Craig Norris Leaves Attensity

February 2, 2012

Chiliad has issued the press release, “New CEO Begins Duties at CHILIAD in Herndon, VA.” Craig Norris is leaving Attensity to head that company. Attensity, owned by Aeris Capital, is positioned as a global natural language analytics company. Chiliad seems to be its direct competitor. Interesting.

Chiliad Chairman Patrick Gross noted a couple of challenges his company’s new CEO has already tackled:

The first is the ability to rapidly search data collections at greater scale than any other offering in the market. The second is to allow search formulation and analysis in natural language. This means that no longer is an elite class of analysts required in order to generate meaningful results, thus reducing the personnel training and skills shortages that plague alternative solutions and put timely discovery at risk. The explosion of ‘Big Data’ is real and valuable findings are buried in vast collections for both enterprises and governments. Chiliad has the opportunity to integrate its innovative, massively scalable solutions with emerging open source software to build customized solutions for the largest-scale clients.

It will be interesting to see how the market reacts to this shift.

Cynthia Murrell, February 2, 2012

Search Only Goes So Far

January 30, 2012

Infocentric Research surveyor Stephan Schillerwein, who presented his findings at the Online Information Conference, released some alarming statistics about enterprise search in his report “The Digital Workplace.” Among the points which jumped out at me were 40 percent of employees use the wrong information when conducting enterprise searches and 63 percent “make critical decisions without being informed,” which results in a 25 percent work information productivity loss.

According to the Pandia Search Engine News article “Huge Problems for Search In the Enterprise,” Schillerwein believes there are a few reasons why enterprise search is problematic: users don’t account for the fact that enterprise search is different from Web search, they have unrealistic expectations, and there is a clear problem of lack of content. The Pandia article asserted: Schillerwein suggests a solution based on several elements, such as consistent coverage of information flows for processes, bringing together the worlds of structured and unstructured information, and adding context. I would agree, as this ability to combine structured and unstructured data while maintaining context is key in our approach. However, when you add the jumble of tweets, social media, and other data that crowd employees’ smart devices, the problems with enterprise search could continue to spiral downward, and “finding a needle in a haystack” could be easier than doing an enterprise search.

These observations triggered several questions and observations.

First, there are a number of companies offering enterprise information solutions. Many are focused on the older approach of key word queries. There are business intelligence systems which provide “find-ability” tools along with a range of useful analytic features. Although search is not the focal point of these solutions, they do provide useful visualizations and statistics on content. The problem is that most organizations are confused about what is needed and what must be done to maximize the value of systems which go beyond key word retrieval. This confusion is likely to play a far larger role in enterprise search challenges than many market analysts want to acknowledge. Instead, many solutions today seem to be making information access more confusing and problematic, not clearer and more trouble free.

Second, the challenge may be more directly related to figuring out what specific business process needs which information. Without a clear understanding of the user’s requirements, it may be difficult to deploy a system that delivers higher user satisfaction. If this hypothesis is correct, perhaps more vendors should adopt the approach we have taken at Digital Reasoning. We make an extra effort to understand what the user requires and then invest time and resources in hooking appropriate information and data into the system. No solution can deliver the right fact-based answers if the required information is not within the data store and available to the algorithms which make sense of what is otherwise noise. We think that many problems with user acceptance originate with a misunderstanding or sidestepping of user requirements and the fundamental task of getting the necessary information for the system.

Third, the terminology used to describe information retrieval and access is becoming devalued. At Digital Reasoning, we work to explain succinctly and without jargon how our next-generation system can facilitate better decision making for financial, health, intelligence, and other professional markets. We have complex numerical recipes and sophisticated systems and methods. Our focus, however, is on what the system does for a user. We have been fortunate to receive support from a range of clients from government and industry as well as the investment community for our next-generation approach. We think our strength is our focus on the customer’s need and not only our unique predictive algorithms and cloud-based solution.

To learn more about Digital Reasoning and our products, navigate to www.digitalreasoning.com .

Dave Danielson, Digital Reasoning, January 30, 2012
