Google and Findability without the Complexity

July 28, 2014

Shortly after writing the first draft of Google: The Digital Gutenberg, “Enterprise Findability without the Complexity” became available on the Google Web site. You can find this eight page polemic at http://bit.ly/1rKwyhd or you can search for the title on—what else?—Google.com.

Six years after the document became available, the points Google’s anonymous marketer/writer raised about enterprise search are still interesting. The document appeared just as the enterprise search sector was undergoing another major transformation. Fast Search & Transfer struggled to deliver robust revenues, and a few months before the Google document became available, Microsoft paid $1.2 billion for what was another enterprise search flameout. As you may recall, in 2008, Convera was essentially non-operational as an enterprise search vendor. In 2005, Autonomy bought the once high flying Verity and was exerting its considerable management talent to become the first enterprise search vendor to top $500 million in revenues. Endeca was flush with Intel and SAP cash, passing on other types of financial instruments due to the economic downturn. Endeca lagged behind Autonomy in revenues, and there was little hope that Endeca could close the gap.

Secondary enterprise search companies were struggling to generate robust top line revenues. Enterprise search was not a popular term. Companies from Coveo to Sphinx sought to describe their information retrieval systems in terms of functions like customer support or database access to content stored in MySQL. Vivisimo donned a variety of descriptions, culminating in its “reinvention” as a Big Data tool, not a metasearch system with a nifty on the fly clustering algorithm. IBM was becoming more infatuated with open source search as a way to shift development and bug fixes to a “community” working for the benefit of other like minded developers.

Google’s depiction of the complexity of traditional enterprise search solutions. The GSA is, of course, less complex—at least on the surface exposed to an administrator.

Google’s Findability document identified a number of important problems associated with traditional enterprise search solutions. To Google’s credit, the company did not point out that the majority of enterprise search vendors (regardless of the verbal plumage used to describe information retrieval) were either losing money or engaged in a somewhat frantic quest for financing and sales.

Here are the issues Google highlighted:

  • Users of search systems are frustrated
  • Enterprise search is complex. Google used the word “daunting”, which was and still is accurate
  • Few systems handle file shares, Intranets, databases, content management systems, and real time business applications with aplomb. Of course, the Google enterprise search solution does deliver on these points, asserted Google.

Furthermore, Google provides integrated search results. The idea is that structured and unstructured information from different sources is presented in a form that Google called “integrated search results.”

Google also emphasized a personalized experience. Due to the marketing nature of the Findability document, Google did not point out that personalization was a feature of information retrieval systems lashed to an alert and work flow component. Fulcrum Technologies offered a clumsy option for personalization. iPhrase improved on the approach. Even Endeca supported roles, important for the company’s work at Fidelity Investments in the UK. But for Google, most enterprise search systems were not personalizing with Google aplomb.

Google then trotted out the old chestnuts gleaned from a lunch discussion with other Googlers and sifting competitors’ assertions, consultants’ pronouncements, and beliefs about search that seemed to be self-evident truths; for example:

  • Improved customer service
  • Speeding innovation
  • Reducing information technology costs
  • Accelerating adoption of search by employees who don’t get with the program.

Google concluded the Findability document with what has become a touchstone for the value of the Google Search Appliance. Kimberly-Clark, “a global health and hygiene company,” reduced administrative costs for indexing 22 million documents. The costs of the Google Search Appliance, the consultant fees, and the extras like GSA failover provisions were not mentioned. Hard numbers, even for Google, are not part of the important stuff about enterprise search.

One interesting semantic feature caught my attention. Google does not use the word knowledge in this 2008 document.

Several questions:

  1. Was Google unaware of the fusion of information retrieval and knowledge?
  2. Does the Google Search Appliance deliver a laundry list of results, not knowledge? (A GSA user has to scan the results, click on links, and figure out what’s important to the matter at hand, so the word “knowledge” is inappropriate.)
  3. Why did Google sidestep providing concrete information about costs, productivity, and the value of indexing more content that is allegedly germane to a “personalized” search experience? Are there data to support the implicit assertion that “more is better”? Returning more results may mean that the poor user has to do more digging to find useful information. What about a few, on point results? Well, that’s not what today’s technology delivers. It is a fiction about which vendors and customers seem to suspend disbelief.

With a few minor edits (for example, a genuflection to “knowledge”), this 2008 Findability essay is as fresh today as it was when Google output its PDF version.

Several observations:

First, the freshness of the Findability paper underscores the staleness and stasis of enterprise search in the past six years. If you scan the free search vendor profiles at www.xenky.com/vendor-profiles, explanations of the benefits and functions of search from the 1980s are also applicable today. Search, the enterprise variety, seems to be like a Grecian urn which “time cannot wither.”

Second, the assertions about the strengths and weaknesses of search were and still are presented without supporting facts. Everyone in the enterprise search business recycles the same cant. The approach reminds me of my experience questioning a member of a sect. The answer “It just is…” is simply not good enough.

Third, the Google Search Appliance has become a solution that costs as much as, if not more than, other big dollar systems. Just run a query for the Google Search Appliance on www.gsaadvantage.gov and check out the options and pricing. Little wonder that low cost solutions, whether they are better or worse than expensive systems, are in vogue. Elasticsearch and Searchdaimon can be downloaded without charge. A hosted version of Elasticsearch is available from Qbox.com and is relatively free of headaches and seven figure charges.

Net net: Enterprise search is going to have to come up with some compelling arguments to gain momentum in a world of Big Data, open source, and once burned twice shy buyers. I wonder why venture / investment firms continue to pump money into what is the same old search packaged with decades old lingo.

I suppose the idea that a venture funded operation like Attivio, BA Insight, Coveo, or any other company pitching information access will become the next Google is powerful. The problem is that Google does not seem capable of making its own enterprise search solution into another Google.

This is indeed interesting.

Stephen E Arnold, July 28, 2014

Big Data for Enterprise Logistics

June 26, 2014

The complex field of logistics and transport management is one that can surely benefit from data analysis. Inbound Logistics brings the benefits to the attention of its readers in, “Big Data Tools Enable Predictive and Prescriptive Analytics.” Writer Shannon Vaillancourt advises that, since the cost of implementing data systems has decreased, now is the time for companies to leverage these tools to understand and adjust their transportation patterns. He writes:

“By leveraging the big data tools that are becoming more prevalent, companies can quickly spot trends that would otherwise have gone unnoticed. Many people are under the impression that big data only refers to a large amount of data. The second definition of big data is that the dataset is too difficult to process using traditional data processing applications. When it comes to supply chain operations, many large companies are still dependent on using a spreadsheet to manage a very complex global part of the business.

“With big data tools, shippers can move past the business intelligence side of measuring and diagnosing, and move into the predictive and prescriptive side. A big data tool will allow transportation teams to have fewer experienced supply chain staff members, because the data will be more actionable.”

Vaillancourt seems to acknowledge the shortfalls of current prescriptive algorithms; he reassures readers that the prescriptive side will be more useful as the technology evolves. Right now, we know it as the algorithm that tells us to buy more stuff at Amazon. Someday soon, though, it might accurately tell a manager which means of transport will most efficiently get a certain shipment to its destination.

It is interesting to watch as the big data trend spreads into different industries. As the hype fades, more of the truly useful applications will become clear.

Cynthia Murrell, June 26, 2014

Sponsored by ArnoldIT.com, developer of Augmentext

Free Trial of X1 Enterprise Client

June 3, 2014

X1 is offering a free fourteen-day trial of their desktop search engine, X1 Enterprise Client. Read more in the sneak preview:

“X1 Enterprise Client is a desktop search engine that automatically indexes files, email messages and contacts on your computer and returns instant results for your keyword searches. The results are organized in a tabbed interface, sorted by file type and provide a quick preview for most common file types including images, PDF files, Office files, ZIP files and many other formats. You can directly interact with the results by replying to emails, sending messages to contacts, opening files, playing music and also send any file as email attachment with the click of a button.”

This product could be a good investment for those who are not exactly careful as they label, name, and store files. Effective keyword search is the most useful tool in light of bad or nonexistent indexing. If you need a little more search in your workflow, and you do not want to be the one to impose the order, a solution like X1 Enterprise Client might be worth considering.

Emily Rae Aldridge, June 03, 2014

Interview with Jeff Catlin on the Future of Enterprise Data

May 22, 2014

The interview titled Text Analytics 2014: Jeff Catlin, Lexalytics on Breakthrough Analysis may be overstating its case when it is billed as a breakthrough analysis. Most of the questions cover state-of-the-industry topics and Lexalytics promotion. Catlin offers insight into the world of enterprise data and the future of the industry. For example, when asked about new features for 2014 and the near future, Catlin responded,

“As a company, Lexalytics is tackling both the basic improvements and the new features with a major new release, Salience 6.0 which will be landing sometime in the second half of the year. The core text processing and grammatic parsing of the content will improve significantly, which will in turn enhance all of our core features of the engine. Additionally, this improved grammatic understanding will allow us to be the key to detecting intention, which is the big new feature in Salience 6.0.”

Catlin repeats in several of his answers that the industry is in flux, and that vendors can only scramble to keep up, even going so far as to compare 2013 and 2014 enterprise data to the Berlin Wall. He describes two “fronts”, one involving improving core technology, and the other focused on vertical market prospects.

Chelsea Kerwin, May 22, 2014

MarkLogic Platform Now Supports JavaScript, JSON and Node.js

May 6, 2014

We learn that MarkLogic has become more supportive from the announcement, “MarkLogic Enhances Enterprise NoSQL Database Platform with Support for JavaScript and JSON” at Yahoo Finance. The press release tells us:

“With support for JavaScript and JSON at every tier, developers can create applications using a language and data format they are familiar with while being assured that the database platform will pass the scrutiny of IT operations and risk management. JavaScript expands the current list of languages that MarkLogic supports, including Java, PHP, C#, Ruby, Python, C++, C, Perl, SQL, and XQuery.

“The MarkLogic Node.js client will give developers a simple and agile way to use MarkLogic, while server-side JavaScript support will deliver the ultimate in performance and flexibility. With JavaScript and JSON at the core of the schema-agnostic, horizontally scalable and elastic architecture, developers can build, revise and deploy applications across multiple systems consistently, while being able to combine JSON, XML, binary, text and Semantic triples in the same database.”

The write-up emphasizes that MarkLogic’s Enterprise NoSQL database platform already offers high-quality integrated search, disaster recovery, and “government-grade” security, among other advantages. See the post for details, or navigate straight to MarkLogic’s site for more info. The company is headquartered in San Carlos, California, and maintains offices around the world. Some of its high-profile clients include Citigroup, Boeing, and Warner Brothers.

Cynthia Murrell, May 06, 2014

ArnoldIT Video: Search Brands Video

April 15, 2014

Whatever happened to Convera and the other four companies comprising the Top Five in enterprise search: Autonomy, Endeca, Fast Search & Transfer, and Verity? The video also mentions Exalead and ISYS Search Software. The wrap up to the video points to three open source enterprise search options. For those who want to be reminded of the Golden Age of enterprise search, check out the free, six minute video from Stephen E Arnold, publisher of Beyond Search. Mr. Arnold is converting some of his research into brief, hopefully entertaining and useful free videos. You can access this short search history lesson at http://bit.ly/1etGExr. The next video in the series tackles the subject of buzzwords, argot, jargon, lingo, and verbal baloney. What vendor is the leader in the linguistic linguini competition? The video will be available before the end of April. In the meantime, take a walk down memory lane and learn how Cornelius Vanderbilt obtained needed information in the early 19th century.

Kenneth Toth, April 15, 2014

Up and Down for IBM

March 26, 2014

There is good news and there is bad news for IBM. First up, a real journalist asks difficult questions about their high-profile project in “What’s Up with Watson?” at Gigaom. Writer Barb Darrow begins by noting that IBM likes to hold up its Jeopardy winner as an example of the company’s prowess. However, she explains, the much-lauded project seems to be floundering. There is the fact that its former leader left for newer pastures just four months after touting Watson’s business potential. Darrow also reports:

“One problem I’ve been hearing about for a while is that while Watson is impressive technology, it is not really a product that’s easy to sell. IBM’s decision to open up APIs, to offer Watson’s smarts as a service, is one response to that. You make Watson available in more affordable portions, maybe it’ll gain traction….

“Sources close to IBM have said privately for some time that Watson has not hit internal targets for new business — no doubt one impetus for the new business unit. One said IBM wanted 100 new ‘logos’— big-name corporate wins in IBM parlance for Watson last year and was only able to close a handful.”

That’s the bad news. Meanwhile, though, the company has launched a contest that both spreads their name and inspires students worldwide (that’s the good news). We learn that IBM is expanding their Master the Mainframe Contest, begun in 2006, in eWeek’s “IBM Launches Master the Mainframe World Championship.” Reporter Darryl K. Taft tells us:

“The world championship competition is designed to assemble the best university students from around the globe, who have demonstrated superior technical skills through participation in their regional IBM Master the Mainframe contests….

“Of the 20,000 students who have engaged in country-level Master the Mainframe Contests over the last three years, the top 44 students from 22 countries have been invited to participate in the inaugural IBM Master the Mainframe World Championship.”

The contest is viewed as a way to get millennials excited about enterprise computing. For my part, I hope these young people will breathe some fresh air into the enterprise mainframe. See the article for details, or head over to the official Championship website.

Cynthia Murrell, March 26, 2014

Ami Enterprise Intelligence Software

March 25, 2014

In a routine update process, one of the goslings came across Ami, a company that offered Ami Enterprise Intelligence 6.0. A quick review of the company’s Web site at www.amisw.com suggested that the company’s last update took place in 2013.

The flagship product is at Version 6.0. The company says:

Enterprise intelligence 6 is a platform for economic intelligence. Designed by AMI, [the system] includes separate modules for the acquisition, analysis and dissemination of information from external sources or internal company content. AMI Enterprise Intelligence is recognized by the community of business intelligence professionals as one of the platforms that ensures the most comprehensive and most innovative business intelligence.

In April 2013, just about one year ago, the company suggests that it participated in the International II SDV Conference. However, the link to the news item returns a 404 error.

Links to the company’s technology on its Web site are working as of March 25, 2014. The company lists four US patents for its core technology. The AMI patent portfolio consists of:

  • GMIL (Grammatical Markers Independent of Language) (# B-3851)
  • Enhancing online support (# B-3561)
  • Language Interface Natural (# B-3563)
  • Language interface for E-Commerce (# B-3562)

The list on the Ami Web site does not contain hyperlinks, however. The Crunchbase profile for the company and products has not been completed. See, for example, http://bit.ly/1jB16dC.

The company appears to be participating in Documation, March 26 in Paris at CNIT Paris la Défense. See http://bit.ly/1lj1eCV. The company asserts that it has more than 150 customers.

The company, like Lextek, maintains a low profile, although it reports that it has offices in the United Kingdom and Paris.

Stephen E Arnold, March 25, 2014

Another Week, Another Enterprise Search System

March 21, 2014

Cloud? Check.

Azure chip consultant reference? Check.

Social angle? Check.

Support for distributed information? Check.

Consumerized interface? Check.

Reference to value? Check.

Automatic alerts? Check.

Customer reference? Check.

Big company pedigree? Check.

Open sourciness? Check.

Exotic technology? Check.

There you have the recipe for a new enterprise search system, at least according to eWeek’s “Highspot Brings Machine Learning to Enterprise Search.” Highspot’s Web site describes the company this way:

Built for the cloud era, Highspot uses advanced machine learning to help organizations capture, share, and cultivate their most valuable working knowledge.

The pricing information, omitted from the eWeek story just as azure chip consultants omit enterprise search fees, begins at free and comes out of the gate at $20 per user per month or $240 per user per year. For an organization with 400 users, the annual fee works out to about $96,000 for an open source, machine-learning system, a bargain compared to the Google Search Appliance but more expensive than downloading Solr, Searchdaimon, or Elasticsearch and having one staff member get search up and running. A less expensive option that works reasonably well is dtSearch, but you need to love the color blue for this search system. If you want an appliance, check out Maxxcat’s systems. These are far less expensive than other appliances, and the new systems are easy to set up and deploy. For cloud action, take a look at Blossom Software’s solution. Chances are your state, county, or municipal government is using the Blossom system built by a former Bell Labs whiz kid.
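
For readers who want to check the arithmetic, here is a minimal Python sketch of the per-seat calculation. The $20 per user per month rate and the 400 user headcount come from the figures above. The function name and the assumption of flat pricing, with no volume discount and no surcharge for document counts, are mine and not Highspot’s.

```python
def annual_license_cost(users: int, monthly_rate_per_user: float) -> float:
    """Return the yearly fee for a flat per-user, per-month subscription.

    Assumption: no volume discounts, setup fees, or document-count
    surcharges. Real vendor contracts often carry such fine print.
    """
    months_per_year = 12
    return users * monthly_rate_per_user * months_per_year


if __name__ == "__main__":
    # Figures taken from the post: $20 per user per month, 400 users.
    per_user_per_year = annual_license_cost(1, 20.0)
    total = annual_license_cost(400, 20.0)
    print(f"Per user per year: ${per_user_per_year:,.0f}")
    print(f"400 users per year: ${total:,.0f}")
```

Swap in a different headcount or rate to see how quickly a per-seat fee approaches the cost of an appliance.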

Net net: The enterprise search market is flooded with options. With big, waddling outfits like HP and IBM getting increasingly desperate to make their billion dollar bets pay off, you have high end options as well as free downloadable systems from organizations in Denmark, Norway, Russia, and elsewhere.

Will the pricing hold if a business licensee points the system at 50 million documents? My hunch is that there will be some fine print. Google charges about $900,000 for its appliance capable of processing tens of millions of documents with three years of support. You can check the latest US government discount prices at www.gsaadvantage.gov. Just search for “Google Search Appliance” and peruse the government’s price. A commercial price may vary.

The key is that the engines of many systems are open source. The “solution” is software wrappers and checklists that hit the marketing hot buttons. Keep up with Highspot via the company’s blog at http://blog.highspot.com/.

Stephen E Arnold, March 21, 2014

Lasting Truths about Enterprise Solutions

January 7, 2014

Since its inception, the world of enterprise software has seen many changes. Yet, there are consistent truths that can guide users in the selection of enterprise solutions, depending on the individual context. Tony Byrne attempts a list of these truths in his article for Information Week, “6 More Enduring Truths About Selecting Enterprise Software.”

After discussions involving open source as well as large versus small vendors, Byrne turns his attention to the biggest option on the market, SharePoint:

“Long-suffering platforms like Lotus have continued to endure because of the strong community around them. For the same reason, SharePoint will probably endure long past the time people think fondly of it. In other words, your technology can become undead but remain viable due to external support and enhancements. Surely, that’s better than having a vendor or technology kick the bucket on you before you’re ready to migrate.”

Stephen E. Arnold of ArnoldIT.com is a longtime leader in search, including enterprise. He has found this same truth in his SharePoint coverage: SharePoint is staying on top of the market, but often because it is enhanced and propped up by a great variety of external supports and enhancements.

Emily Rae Aldridge, January 7, 2014
