Inteltrax: Top Stories, October 24 to October 28

October 31, 2011

Inteltrax, the data fusion and business intelligence information service, captured three key stories germane to search this week, specifically on the economic challenges that are being recognized and overcome thanks to big data and analytics.

The best example we found came from our story, “BI’s a Part of Germany’s Strong Economy,” which showcased how the success of one of the few thriving European economies is directly tied to business intelligence and data analytics.

The story, “Analytic Jobs a Possible Economic Solution,” discussed how analytic work has been steady while other industries dry up. Could data analysis be the fix to sluggish economies?

Another economic staple, the FICO credit score, was examined in the story, “Pushing 60, FICO Adjusts to Analytics.” Here, we discovered how the credit giant uses massive amounts of personal data to streamline its analytic system.

No matter how you slice it, economics is a hot topic these days. We were pleased to discover a positive side to this talk when paired with analytics. We are optimistic about this union in the future and will continue giving it our attention at Inteltrax.

Follow the Inteltrax news stream by visiting

Patrick Roland, Editor, Inteltrax.

October 31, 2011

Datameer Creates Analytics Platform for Hadoop

October 31, 2011

Software development company Datameer has come up with another Hadoop business intelligence play to keep pace with the compounded 40 percent per year growth rate in corporate data volume, with the lion’s share of that growth in the unstructured data being produced and consumed.

There are current technical challenges that need to be addressed. Hadoop is pushing out costly analytic databases and warehouses, and its push forward has given us yet another crazy acronym: ADBMS. Now Hadoop vendors are keeping the Big Data market in a state of churn.

In the Datameer blog write up “Why I Am at Datameer,” Brian Smith discusses a potential solution to this issue. He asserted:

Datameer is the first BI/Analytics platform built natively on Hadoop. On the surface it sounds interesting, but in practice the solution is game-changing. The Datameer Analytic Solution (DAS) connects business users directly with the entire volume and variety of their raw Hadoop data and makes it available for comprehensive analysis.

While Smith’s assertions are certainly interesting, we are not sure who is “first” in many of the assertions about the Big Data world. IBM is chugging away. Digital Reasoning is a player. There are, in fact, dozens of companies making claims and counter-claims. Perhaps in a dicey economy, marketing takes precedence over cold, hard facts?

Jasmine Ashton, October 31, 2011

How Will Oracle Play the Endeca Card?

October 31, 2011

The latest move in the big data arms race has prompted plenty of speculation. Forrester’s blog for e-business and channel strategy professionals tries its hand at shedding light on the subject with the article, “Oracle Buys Endeca: What It Means.”

One reason this article proposes for the Endeca purchase is that it fits into Oracle’s grand scheme to be on top of the “customer experience management” or CXM trend. Endeca is the piece of the puzzle that supplements ATG, Siebel, and FatWire as an experience management solution to offer individualized solutions to clients.

One theme that runs through many of the articles published on this subject is Oracle’s comprehensive enterprise solutions marketing angle that becomes a selling point with Endeca on board.

The article points out that this move gives Endeca stability, but for Oracle this acquisition presents a sea of options. The article states the following:

For Oracle this signals commerce and [customer experience management]  are strategic focus and that they are aiming to drive a differentiated offering. The devil is in the details though, and how Oracle executes to drive value through the sum of the parts will be a huge challenge and leaves room.

Oracle is in a position now where it needs to invest in order to make its content technologies competitive. What this write up sparked for me was the realization that buzzwords like CXM really don’t mean too much. Oracle bought Endeca after Hewlett Packard acquired Autonomy and made a big deal about building an enterprise services business. My view is that Endeca and its aging technology are not likely to be changed too much going forward. Oracle now has duplicative systems for search, customer support, business intelligence, and online commerce. Oracle wants customers and a way to build its revenue for its core software and for services. Endeca, like InQuira, is less about technology and more about having companies to upsell. As for enterprise search, my view is that the market is of little interest to Oracle at this time.

Megan Feil, October 31, 2011

Big Data for Big Thinkers

October 31, 2011

“Big data analytics” is an emerging term in the storage industry. It originated within the open source community’s efforts to develop analytics processes that were faster and more scalable than traditional data warehousing.

Open source advocates hope to extract value from the vast amounts of unstructured data produced daily by web users. I recently read an interesting Karmasphere write up called “Big Data IS Different— I Knew It!” in which Rich Guth mused about his past year spent at Karmasphere. In that period, he came to the opinion that Big Data requires different analytic techniques than traditional business intelligence products provide. Guth asserted:

Today we announced version 1.5 of our Karmasphere Analyst product, a workspace for performing Big Data Analytics. It implements a new workflow for data analysts to mine and analyze Big Data.  We also released a whitepaper “Deriving Intelligence from Big Data in Hadoop – A Big Data Analytics Primer” that describes this workflow, discusses why this workflow is necessary and compares it to traditional BI and data warehousing approaches.

The challenge is to make clear exactly which “old methods” will not work and which “new methods” will work. Just as important, how does a person using a system with new Big Data methods determine if the outputs are accurate? Who wants to make a decision only to find out that the underlying set up of the new methods was off the mark? Most business intelligence professionals don’t know when an old and well worn method is delivering accurate outputs. Toss in a snappy graphic and the disconnect may become significant.

Jasmine Ashton, October 31, 2011

OpenSearchServer Revealed

October 31, 2011

In an exclusive interview, Raphael Perez, chief executive officer of OpenSearchServer, explains how his firm will further disrupt the traditional enterprise search market. (The full text of the interview is available on the enterprise search subsite.)

The OpenSearchServer project started in a French B2B media group in 2007. The company was looking for a search solution. The project emerged when a group of engineers became frustrated with the commercial search solutions.

Raphael Perez said:

Because no available solution was available at a decent price or offering all wished features, decision were made to create an in-house solution as an open source project based on Lucene. Emmanuel Keller, then the chief information officer, led the projects, and after two years of work more than 12 applications were installed and providing high value results. In December 2009, Emmanuel purchased the rights of the solution and formed a company to develop the community and offer them high level professional services, support, community management. It was the start of the story.

Today OpenSearchServer is one of a number of firms using Lucene as a component of a commercial enterprise search solution. One of the value adds his engineering team has crafted is a Classifier. He told me:

One a major module our Enterprise customers appreciate is called Classifier. It brings a very innovative set of features for applications with automatic classifications, matching and that are very appreciated in many businesses. Offering this module helps us to bring a nice differentiation for customers. Also we offer log reporting tools and a SOAP Web service.

The firm has a number of clients, including a rich media firm, an investment firm, and the vehicle information vendor ETAI, an Infopro group company.

You can read the full text of the exclusive interview in the enterprise search service, Search Wizards Speak. Search Wizards Speak is the largest collection of first person narratives about search, content processing, and analytics available without charge. There are more than 50 interviews available in the series.

For more information about OpenSearchServer, navigate to the company’s Web site.

Stephen E Arnold, October 31, 2011

Books Evolve. Publishers? Maybe Not.

October 30, 2011

The publishing industry is one of the main fields that have changed drastically in recent years. With mobility leading the way in technology sales, it will only change more. Publishing Perspectives notes this trend in its article, “What Publishers Look For When They Buy a Company.”

This article chronicles the history of the industry. Acquisitions have been a profitable component of publishing businesses since the 1960s because they offered the ability to diversify and combine functions.

Since 2007, when the first iPhone was released, followed by the Kindle later that same year, the characteristics that publishers seek in a company have changed with the times.

The article shares the following inside information:

First and foremost, they are looking for another seat at the table. Every one of the top 20 companies has a strong technology component and are active buyers of independent companies with creative technology programs. Hundreds of smaller publishers use Constellation, a service offered by Perseus, to make use of electronic readers, digital book search, print on demand, and other digital formats.

Apparently some smaller companies looking to merge are holding out for the publisher who sees their long histories of profit in their specific niche. The current trajectory does not bode well for them in our opinion.

What it comes down to, historical context aside, is that publishers need companies that stand a chance of rivaling big dogs like Google and Amazon, which may essentially monopolize the current market. Disintermediation, anyone?

Megan Feil, October 30, 2011

2012: Enterprise Search Yields to Metadata?

October 30, 2011

Oh, my. The search dragon has been killed by metadata.

You might find yourself on an elevator ready to get off on a specific floor. The rest of your trip will start from that point and that point only. The same is true for learning, conversing, actually just about anything. We all have a particular place we want to enter the conversation. MSDN’s Microsoft Enterprise Content Management (ECM) Team Blog’s recent posting on “Taxonomy: Starting from Scratch” was a breath of fresh air in the way it addressed anyone–no matter what floor they needed.

For novices to the Managed Metadata Service, a service providing tools to foster a rich corporate taxonomy, the article recommends a starting point: “Introducing Enterprise Metadata Management.”

According to the article, more seasoned users should point their browsers toward the import capabilities. Of course, more specific needs are addressed too, with links to go with them.

The article recommends the following for the clients who need a comprehensive understanding of both common and specific corporate terms. The author Ryan Duguid states:

“The General Business Taxonomy consists of around 500 terms describing common functional areas that exist in most businesses.  The General Business Taxonomy can be imported in to the SharePoint 2010 term store within minutes and provides a great starting point for customers looking to build a corporate vocabulary and take advantage of the Managed Metadata Service.”

Overall, this article is worth keeping tucked away for the day you need information on WAND, SharePoint, or metadata and taxonomy in general. It is direct, and its variety of links offers accessible next steps.

Megan Feil, October 30, 2011

Software and Smart Content

October 30, 2011

I was moving data from Point A to Point B yesterday, filtering junk that has marginal value. I scanned a news story from a Web site which covers information technology with a Canadian perspective. The story was “IBM, Yahoo turn to Montreal’s NStein to Test Search Tool.” In 2006, IBM was a pace-setter in search development cost control. The company was relying on the open source community’s Lucene technology, not the wild and crazy innovations from Almaden and other IBM research facilities. Web Fountain and jazzy XML methods were promising ways to make dumb content smart, but IBM needed a way to deliver the bread-and-butter findability at a sustainable, acceptable cost. The result was OmniFind. I had made a note to myself that we tested the Yahoo OmniFind edition when it became available and noted:

Installation was fine on the IBM server. Indexing seemed sluggish. Basic search functions generated a laundry list of documents. Ho hum.

Maybe this comment was unfair, but five years ago, there were arguably better search and retrieval systems. I was in the midst of the third edition of the Enterprise Search Report, long since bastardized by the azure chip crowd and the “real” experts. But we had a test corpus, lots of hardware, and an interest in seeing for ourselves how tough it was to get an enterprise search system up and running. Our impression was that most people would slam in the system, skip the fancy stuff, and move on to more interesting things such as playing Foosball.

Thanks to Adobe for making software that creates a need for Photoshop training. Source:

Smart, Intelligent… Information?

In this blast from the past article, NStein’s product in 2006 was “an intelligent content management product used by media companies such as Time Magazine and the BBC, and a text mining tool called NServer.” The idea was to use search plus a value adding system to improve the enterprise user’s search experience.

Now the use of the word “intelligent” to describe a content processing system is nothing new, reaching back through the decades to computer aided logistics and forward to the Extensible Markup Language methods.

The idea of “intelligent” is a pregnant one, with a gestation period measured in decades.

Flash forward to the present. IBM markets OmniFind and a range of products which provide basic search as a utility function. NStein is a unit of OpenText, and it has been absorbed into a conglomerate with a number of search systems. The investment needed to update, enhance, and extend BASIS, BRS Search, NStein, and the other systems OpenText “sells” is a big number. “Intelligent content” has not been an OpenText buzzword for a couple of years.

The torch has been passed to conference organizers and a company called Thoora, which “combines aggregation, curation, and search for personalized news streams.” You can get some basic information in the TechCrunch article “Thoora Releases Intelligent Content Discovery Engine to the Public.”

In two separate teleconference calls last week (October 24 to 28, 2011), “intelligent content” came up. In one call, the firm was explaining that traditional indexing systems missed important nuances. By processing a wide range of content and querying a proprietary index of the content, the information derived from the content would be more findable. When a document was accessed, the content was “intelligent”; that is, the document contained value added indexing.

The second call focused on the importance of analytics. The content processing system would ingest a wide range of unstructured data, identify items of interest such as the name of a company, and use advanced analytics to make relationships and other important facets of the content visible. The documents were decomposed into components, and each of the components was “smart”. Again the idea is that the fact or component of information was related to the original document and to the processed corpus of information.

No problem.

Shift in Search

We are witnessing another one of those abrupt shifts in enterprise search. Here’s my working hypothesis. (If you harbor a life long love of marketing baloney, quit reading because I am gunning for this pressure point.)

First, let’s face it. Enterprise search is just not revving the engines of the people in information technology or the chief financial officer’s office. Money pumped into search typically generates a large number of user complaints, security issues, and cost spikes. As content volume goes up, so do costs. The enterprise is not Google-land, and money is limited. The content is quite complex, and who wants to try and crack 1990s technology against the nut of 21st century data flows? Not I. So something hotter is needed.

Second, the hottest trends in “search” have nothing to do with search whatsoever. One example is conflating the interface with precision and recall. Sorry. Does not compute for me. The other angle is “mobile.” Sure, search will work when everything is monitored and “smart” software provides what a statistically appropriate method suggests will work “most” of the time. There is also the baloney about apps, which is little more than the gamification of what in many cases might better be served with a system that makes the user confront actual data, not an abstraction of data. What this means is that people are looking for a way to provide information access without having to grunt around in the messy innards of editorial policies, precision, recall, and other tasks that are intellectually rigorous in a way that Angry Birds interfaces for business intelligence are not.

Third, companies engaged in content access are struggling for revenue. Sure, the best of the search vendors have been purchased by larger technology companies. These acquisitions guarantee three things.

  1. The Wild West spirit of the innovative content processing vendors is essentially going to be stamped out. Creativity will be herded into the corporate killing pens, and the “team” will be rendered as meat products for a technology McDonald’s.
  2. The cash sink holes that search vendors’ research programs were will be filled with procedure manuals and forms. There is no money for blue sky problem solving to crack the tough problems in information retrieval at a Fortune 1000 company. Cash can be better spent on things that may actually generate a return. After all, if the search vendors were so smart, why did most companies hit revenue ceilings and have to turn to acquisitions to generate growth? For firms unable to grow revenues, some just fiddled the books. Others had to get injections of cash like a senior citizen in the last six months of life in a care facility. So acquired companies are not likely to be hot beds of innovation.
  3. The pricing mechanisms which search vendors have so cleverly hidden, obfuscated, and complexified will be tossed out the window. When a technology is a utility, then giant corporations will incorporate some of the technology in other products to make a sale.

What we have, therefore, is a search marketplace where the most visible and arguably successful companies have been acquired. The companies still in the marketplace now have to market like the Dickens and figure out how to cope with free open source solutions and giant acquirers who will just give away search technology.

Enterprise Search: Floundering Fish Update

October 29, 2011

I saw a comment from one expert who thought I misspelled flounder. Nope, I know a flounder is a fish; namely, any of various marine flatfishes of the families Bothidae and Pleuronectidae, which include important food fishes, according to the Free Dictionary. But who can trust free, anyway?

Flounder, however, has another meaning; namely, to make clumsy attempts to move or regain one’s balance. Again, that free source.

What I found interesting is that one of the readers of my free Web log wanted me to know that I really, honestly, truly must have meant founder as in “sinking below the surface” or “to stumble,” not the word I used, flounder.

Well, bless my local sushi joint. I was thinking about fish. Even a goose like me knows that most fish out of water have some challenges. I did briefly think of the word “founder”, but I preferred the fish word for three reasons:

First, it has a metaphorical ring. There is the suffocation thing. I also like the faint echo of “fish like relatives stink after three days,” but I would update to “real consultants like fish stink after three days.”

Second, I could indulge in a bit of graphic whimsy; for example:

Third, I like the entire notion of search vendors gasping for revenue the way a fish moves its gills in an attempt to survive when the fishing boat docks and the catch is up for auction.

I will stick to my metaphor the way sticky rice adheres to tuna at Tea Station Restaurant, which is down the road a “fur piece”. (Please, don’t correct my Kentucky-ism.) I am delighted someone stands ready to correct my word choice in my free blog. Did I mention my and free? I am even more impressed when a variant of flounder without the fishy bit is presented as what I really meant.

Nope, “real” consultant, I meant what I wrote. By the way, I did okay on the vocabulary section of my SAT, in 1961, thank you.

Here’s a useful fish reminder: “When you fish for love, bait with your heart, not your brain.” Maybe I should extend the “flounder” metaphor to the self appointed search experts, failed Webmasters who know about knowledge management, and “real” journalists who pray that the iPad will pull newspapers and magazines out of hot water? Think lobster. Dead lobster.

Stephen E Arnold, October 29, 2011

Sponsored by, a publisher in Norway which understands the way of the herring.
