Elasticsearch: 70:30 Odds as the Next Big Thing in Search

March 28, 2014

We learned on March 26, 2014, that the German search vendor Intrafind has been looking for the next big thing. The company may have found it, and we expect that this low profile vendor will be plugging into the Elasticsearch power cable. Wikipedia already has, joining hundreds of other firms looking for a solution to doggy indexing in some other open source centric solutions.

Elasticsearch repackager SearchBlox has rolled out Version 8 of its hosted Elasticsearch system, according to Timo Selvaraj, Co-Founder/VP Product Management of SearchBlox.

As if these two recent developments were not enough, GovWizely, a Washington, DC engineering services firm, has added Elasticsearch to its arsenal. GovWizely, operated by Erik S. Arnold (yep, that’s my boy), has moved adroitly to capitalize on the surging interest in Elasticsearch’s high performance system.

Contrast Elasticsearch’s rise as the go to open source enterprise search system with the struggles of other open source search vendors and some commercial outfits. LucidWorks has ingested $2 million in venture funding, according to Crunchbase. Elasticsearch has received $34 million in funding. Parity, right?

Not so “fast”. (A gentle nod to the fascinating proprietary system shoe horned by Microsoft into SharePoint.) Elasticsearch seems to be catching up to LucidWorks or winning the critical struggle for developers. Here’s the Elasticsearch pitch:


Understated and quiet, according to my engineering team. If the developments at Intrafind, SearchBlox, and Adhere Solutions, among others, are an early warning system, Elasticsearch certainly could be the “next big thing” in search, enterprise and otherwise.

What’s this mean for the proprietary and non open sourcey vendors like Coveo, Funnelback, Lexmark ISYS, and Hewlett Packard? I would suggest that these firms’ management have to adapt to what appears to be an emergent and disruptive force in information processing. If Elasticsearch does emulate the growth of the pre HP Autonomy, there is a real likelihood that the millions in venture funding pumped into search funding and search acquiring will never be repaid. Chilling thought for some stakeholders who may have jumped on the wrong horse and seem compelled to continue to feed the nag fresh, expensive, non recoverable “clover.” (Think millions in hard cash funding with little to show that a payback is imminent or even possible.)

Google and Pricing: High Stakes WalMarting

March 26, 2014

I read a number of write ups about the new Google cloud pricing. The main idea that unifies the different reports, in my opinion, is, “Everybody loves a bargain.” Consider “Google Slashes Cloud Prices: Google vs AWS Price Comparison.”

The essay-editorial begins with the invocation of the Google-Amazon joust:

Google threw down the gauntlet to challenge AWS public cloud supremacy by announcing significant price reductions across its Google Cloud Platform. The eye-opening price cuts covered compute (32-percent reduction), storage (68-percent reduction), and BigQuery (85-percent reduction). Google also signaled that future reductions could follow Moore’s Law — citing that historically public cloud prices have dropped only 6 to 8 percent annually as compared to 20- to 30-percent reductions in hardware prices.

The fact that neither Amazon nor Google provides much detail about its actual costs, profits, number of customers, and goals for its cloud services is not of much interest. Explanations of how pricing thresholds operate and migrate excite little curiosity.

Google, playing the Google Search Appliance card, seems to suggest that Amazon’s pricing is complicated. Yep, it is and it is very difficult to pin down with confidence what something will cost until the bits have been chomped and the Amazon accounting system processes its inputs and bills the customer. There is chatter about “sustained use” pricing, on demand pricing, and heavy reserved instance pricing, and in the article I have used as a pivot point for my comments, a cheer for RightScale’s services. These will help the cloud customer figure out what cloud computing costs.


See http://www.ftc.gov/tips-advice/competition-guidance/guide-antitrust-laws/single-firm-conduct/predatory-or-below-cost

Several observations:

First, the pricing is an example of the WalMarting of technical services. Doesn’t the entire world want lower prices? Once a market has been “won,” what happens? Creative destruction? I refer you, gentle reader, to WalMart’s challenges to rekindle (pun intended) that Sam Walton fire. The profit flat line is not good news to some WalMart stakeholders. But the Google pricing is little more than an old-fashioned price war in a Walton-like march for market share.

Second, Amazon has a bit of a cost problem. The murky Amazon financials, the hard to figure out side companies, and the blurring of revenues from product and services lines are tough to parse. Amazon is working overtime to generate no friction revenue (Prime pricing) and constrain costs. The results are a robust top line and growing pressure on expenses at “everyone’s favorite” online store. Google is cutting prices at a time when Amazon is maybe less than prepared for a price war.

Inflation or Desperation: Pricing Free Online Services

March 14, 2014

Yep, it’s illogical. How can a free online service get a price tag? Easy as Amazon’s boosting the fee for Prime and Facebook’s cooking up whizzy new types of advertising. But the big news is tucked between the lines of “Desktop Search to Decline $1.4 Billion as Google Users Shift to Mobile.”

Here’s a tasty factoid:

In the scope of Google’s overall ad revenues, mobile search is gaining significant share. Up from 19.4% in 2013, mobile search will comprise an estimated 26.7% of the company’s total ad revenues this year. Desktop search declined to 63.0% of Google’s ad revenues in 2013, having already fallen from 72.7% in 2012.

You may have noticed how lousy the search results are from Bing, Google, and Yahoo. Even the metasearch engines are struggling. Just run some queries on Ixquick.com or DuckDuckGo.com and do some results comparisons.

Because most of the world’s Internet users rely on Google to deliver comprehensive and accurate results, users are unaware of the information that is not easily findable. Investigators and professional researchers are increasingly aware that finding information is getting harder, a lot harder if our research is on the beam.

As users shift from desktops to mobile the GoTo/Overture advertising model loses efficiency. There are a number of reasons, ranging from the difficulty of entering queries while riding a crowded bus to the small screens to the dorky big type interfaces that are gaining popularity to the need to provide a brain dead single / limited function app to help a person locate pizza.

For Google and other desktop centric companies, the shift has implications for advertising revenue. Smaller screens and changing behavior mean the old GoTo / Overture model won’t work. The impact on traditional Web sites is not good. Here’s a report for a company that did the search engine optimization thing, the redesign thing, and the new marketing “experts” thing. Looks grim, doesn’t it?


I won’t name the owner of this set of red arrows, but you can check out your own Web site and blog usage stats and compare your “performance” to this outfit’s.

Fast Redefined: The 2008 Search Acquisition Does a 365

March 9, 2014

Figure skating, anyone? You can do a Salchow jump. The skater has some options. Falling is not one of them. The idea is to leap from one foot to another. The Axel jump tosses in some spinning; for example, a triple Axel is 3.5 revolutions. Want creativity? The skater can flip, bunny hop, and Mazurka.

But the ice has to be right. Skating requires a Zamboni. Search requires information retrieval that works.


One should not confuse a Zamboni with an ageing ice skater.

Fast Search & Transfer has just come back from an extended training period and is ready to perform. The founder may be retired after an unfavorable court decision. The Fast Search Linux and Unix customers have been blown off. But, according to Fortune CNN, Microsoft has made enterprise search better. Give the skater a three for that jump called Office 365.

Navigate to “Can Microsoft Make Enterprise Search Better?” The subtitle is ripe with promise: “Updates to its Office 365 suite show benefits from a 2008 acquisition.” There you go. Technology from the late 1990s, a withdrawal from Web search, a run at unseating Autonomy as the leading provider of enterprise content processing, and allegations of financial wrongdoing and you have a heck of a base from which to “make enterprise search better.”

At one time, Fast Search offered an alternative to Google’s Web search system. The senior management of Fast Search decided to cede Web search to Google and pursue dominance in the enterprise search market. Well, how did that work out? The shift from the Web to the enterprise worked for a while, but the costs of customer support, sales, and implementation put the company in a bind. The result was a crash to the ice.

Microsoft bought the sliding Fast Search operation and embarked on a journey to make content in SharePoint findable. The effort was a boon to second tier search vendors who offered SharePoint licensees a search and retrieval system. Most of these vendors are all but unknown outside of the 150 million SharePoint license base. Others have added new jumps to their search routines and have skated to customer support and business intelligence.

News, Optimism, and Content Marketing

February 26, 2014

I read “How Covert Agents Infiltrate the Internet to Manipulate, Deceive, and Destroy Reputations.” Public relations may need to do some PR and damage control. The allegedly accurate information provided one more factoid to support our contention that locating and verifying “news” is a tough job.

I will be addressing some of the methods a researcher can use to unwrap the ballistic padding that online services use to keep some information away from the grubby fingers of researchers. Consumers who gobble pay-to-play content are what most online services want. And, if you had not noticed, putting video content front and center is the new trend for those who are looking for facts, data, and high-value analyses.


As Kim Kardashian allegedly said, “I’m an entrepreneur. Ambitious is my middle name.”

The blog post “The Future of the News Business: A Monumental Twitter Stream All in One Place” was more interesting to me. The write up presses some familiar controls on the baloney making machine; for example:

  1. Consolidation is much better than individual services. I wonder if “consolidation” is a euphemism for monopoly, a concept with which some executives are more familiar. An older-school thinker used the word “convergence” but that buzzword makes an appearance in the source article.
  2. The time horizon is not three years (a long time in today’s uncertain world). The time horizon is 20 years in the future. I wonder how far in the future Viktor Yanukovych’s chief of staff planned yesterday. I think the plans are on hold for a while.
  3. The old way of news was monopolistic. The new way is to generate money from many streams; for example, advertising (good), Bitcoin (possibly problematic), and slicing and dicing (a possible copyright quagmire).
  4. The beacons range from Buzzfeed (listicles) to SearchEngineLand (the logic straining search engine optimization service described as “a place for all the search news, all the time.”)

The opportunity, if I follow the argument, is to tackle the job of creating a “monumental Twitter Stream all in one place” with vision, scrappiness, experimentation, adaptability, focus, deferred gratification, and an entrepreneurial mindset.

I appreciate the elegant quote from Tommy Lasorda about how difficult creating a news-oriented “monumental Twitter stream” will be. My hunch is that a fusion of PR methods, content marketing, and “bits are bits” thinking will triumph.

Elasticsearch Disrupts Open Source Search

February 17, 2014

I did a series of reports about open source search. Some of these were published under mysterious circumstances by that leader of the azure chip consultants, IDC. You can see the $3,500 per report offers on the IDC site. Hey, I am not getting the money, but that’s what some of today’s go go executives do. The list of titles appears below my signature.

Elasticsearch, a system that is based on Lucene, evolved after the still-in-use Compass system. What seems to have happened in the last six months is one of those singularities that Googlers seek.

In January 2014, GigaOM, a “real news” outfit, reported that Elasticsearch had moved from free and open source to a commercial model. You can find that report in “6 million Downloads Later, Elasticsearch Launches a Commercial Product.” The write up equates lots of downloads with commercial success. Well, I am not sure that I accept that. I do know that Elasticsearch landed an additional $24 million in series B funding if Silicon Angle’s information is correct. Elasticsearch is armed with more money than the now aging and repositioning Lucid Works (originally Lucid Imagination) has. (An interview with one of the founders of Lucid Imagination, the precursor of Lucid Works, is at http://bit.ly/1gvddt5. Mr. Krellenstein left Lucid Imagination abruptly shortly after this interview appeared.)


I noted that in February 2014, InfoWorld, owned by the publisher of the $3,500 report about Elasticsearch, called the company “ultra hip” in “Ultra Hip Elasticsearch Hits Commercial Release.” I don’t see many search companies, proprietary or open source, called “hip.” The write up asserts (although I wonder who provided the content):

Elasticsearch was originally spun off from the Compass project, an open source Java search engine framework, back in 2004, in an effort to create a highly scalable search solution. Built on top of the well-known and popular Lucene library from the Apache Software Foundation, Elasticsearch adds such features as multitenancy, sharding, faceted search, and a JSON-based REST API. This feature set puts it in competition with the Solr project as a complete search solution built on top of Lucene.

The statement does not hit what I thought are the main points of the Elasticsearch initiative. Let me fill in the blanks. Perhaps an azure chip consultant can use these to whip up another $3,500 report?

Google Puts Some Effort into the Google Search Appliance

February 12, 2014

Last I knew, the Google Search Appliance (GSA) had trimmed its product line, eliminated the impulse buy option for the Mini, and kept the price at the higher end of the appliance market.

I learned over the last two years that Google has placed more than 60,000 GSAs in organizations. I have no idea if the number is valid, but if it is, the GSA is one of the top dogs in enterprise search. I also heard that there was a small team working on the GSA and an even smaller team handling customer support. Google pushes functions to resellers who deal with the customers. Google outsources manufacturing of the GSA. Most important, Google seems to have an off-again, on-again interest in on premises search. The future, as I understand it, is the cloud. The GSA is, in my opinion, an anachronism in the Nest, X Labs, and Android-Chrome world. But, hey, I have been wrong before. I once asserted that basic search should not be a challenge for most organizations. Wow, did I get that wrong! Jail time, lawsuits, and DARPA’s almost admission that search is not working notwithstanding.


The GSA has been around almost a decade. Version 7.2 is “a leader in the Gartner Enterprise Search MQ.” I certainly don’t doubt the word of an estimable azure chip consulting firm. No, no, no.

The new version, according to Google, delivers:

  • Metadata sorting. A function available in the 1983 version of Fulcrum Technologies’ system
  • Language translation. A function available from Delphes in the 1990s
  • A document preview function. iPhrase in 1999 delivered this feature
  • Entity recognition. Verity implemented this function in the 1980s
  • Dynamic navigation. Endeca rolled out this feature in 1998

In my opinion, the GSA is catching up to innovations available for many years from other vendors. Comparing the EPI Thunderstone and Maxxcat appliances to the GSA emphasizes that the GSA is not quite at parity with other products in the channel.

According to “Google Updates Enterprise Search Appliance Tool,”

The GSA 7.2 update comes more than a year after the firm upgraded the GSA to version 7.0, and builds on the features included in that update. The most notable includes the ability to improve the way data can be indexed with key attributes, such as author name, or the date it was created.

How much does a GSA cost? According to the US government’s GSAadvantage.gov, a 36 month license for a GB 7007 is $69,296 for 500,000 documents. Have more documents? Pay for an upgrade. However, I can use a hosted service like Blossom Software to index my content for about $2,400 per month. I can use the low cost dtSearch solution for $160 per seat. I can download an open source solution and do it myself.

For an organization with 20 million documents to index, the cost of the GSA solution noses into HP Autonomy territory. Too rich for my blood, and I think that lower cost appliance vendors will see the Google Search Appliance as a lead generator.
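For readers who want to check the arithmetic, here is a back-of-the-envelope sketch in Python. The license price and the 500,000 document limit come from the GSAadvantage.gov figures quoted above. The straight-line scaling to 20 million documents is an assumption for illustration only, not Google's price list, and the function names are made up.

```python
# Figures quoted in the post (GSAadvantage.gov listing for a GB 7007).
GSA_LICENSE_USD = 69_296   # 36-month license
GSA_DOC_LIMIT = 500_000    # documents covered by that license


def gsa_cost_per_document() -> float:
    """License dollars per document at the quoted tier."""
    return GSA_LICENSE_USD / GSA_DOC_LIMIT


def gsa_linear_estimate(documents: int) -> float:
    """Rough cost for a larger collection.

    ASSUMPTION: cost scales linearly with document count. Real upgrade
    pricing is not given in the post and may be quite different.
    """
    return GSA_LICENSE_USD * documents / GSA_DOC_LIMIT


if __name__ == "__main__":
    print(f"Per document: ${gsa_cost_per_document():.3f}")
    print(f"20 million documents (linear guess): ${gsa_linear_estimate(20_000_000):,.0f}")
```

Under that linear guess, 20 million documents is 40 times the quoted tier, which puts the 36 month figure in the millions of dollars. Treat it as an order-of-magnitude check, nothing more.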

I wonder if those azure chip consultants have licensed the GSA to handle their Intranet information retrieval tasks?

Stephen E Arnold, February 12, 2014

A Formula for Selling Content Processing Licenses

January 23, 2014

Do equations sell? Some color:

I know that I received negative feedback when I described the mathematical procedures used for Google’s semantic search inventions. I receive presentations and links to presentations frequently. Few of these contain mathematical expressions. In my forthcoming no-cost discussion of Autonomy from 1996 to 2007, I include one equation. I learned my lesson. Today’s search and content processing truth seekers want training wheels, not higher level math. I find this interesting because as systems become easier to use, the fancy math becomes more important.

Anyway, imagine my surprise when I received a link to a company founded 14 years ago. The outfit does business as Digital Reasoning, and it competes with Palantir (a segment superstar), IBM i2 (the industry leader for relationship analysis), and Recorded Future (backed, in part, by the Google). Dozens of other companies chase revenues in this content processing sector. Today’s New York Times includes a content marketing home run by an outfit called YarcData. You can find this op ed piece by Tim White on page A 23 of the dead tree version of the paper I received this morning (January 23, 2014). Now that’s a search engine optimization Pandas and the Times’s demographic can love.

To the presentation. My link points to Paragon Science at http://slidesha.re/1jpXAGd. I was logged in automatically, so you may have to register to flip through the slide deck.

Navigate to slides 33 and following. Slides 1 to 32 review how text has been parsed for decades. The snappy stuff kicks in on slide 33. There are some incomprehensible graphics. These Hollywood style data visualizations are colorful. I, unlike the 20 somethings who devour this approach to information, have a tough time figuring out what I am supposed to glean.

At slide 42, I am introduced to “dynamic cluster analysis.” The approach echoes the methods developed by Dr. Ron Sacks-Davis in the late 1970s and embedded in some of the routines of the 1980 system that a decade later became better known as InQuirion and then TeraText.

At slide 44, the fun begins. Here’s an example which I am sure you will recall from your class in chaos mathematics. If you can’t locate your class notes, you can get a refresher at http://bit.ly/1mKR3G9 courtesy of Cal Tech, home of the easy math classes as I learned during my stint at Halliburton Nuclear Utility Services. The tough math classes were taught at MIT, the outfit that broke new ground in industry sponsored educational methods.


Yale on Free Expression: A Quote to Note

January 18, 2014

Years ago I gave a lecture at Yale. My subject was Google. I ran through the basic points in The Google Legacy and Google Version 2.0. The audience reacted as if I had dissected a dead frog. I received a smattering of polite applause and headed out for a talk in New York City. So much for Yale and the idea that Google was more than a Web search company.

I just read “Yale Students Made a Better Version of Their Course Catalogue. Then Yale Shut It Down.” A couple of students put up a Web page that allowed students to pinpoint classes and compare student ratings of professors. Sounds like an app to me.

Information? Who said it was supposed to be free? Image source: http://1.usa.gov/1dFIhW9

But Yale perceived the Web page differently. Here’s the quote:

‘Yale’s policy on free expression and free speech entitles no one to appropriate a Yale resource and use it as their [sic] own,’ the statement read. It further stated its main priority at this time was supporting its own resources, ‘not others created independently and without the university’s cooperation or permission,’ and that ‘all the information on the website remains available to students on the Yale site.’

I assume the Washington Post is semi-accurate, just like an Amazon recommendation.

What did the future bonesmen learn? A nuance of academic freedom in Yale Land has been broadcast in an analogue transmission.

Will these two free thinkers demonstrate digital initiative in the future? Is Yale turning out well-trained online researchers for the next-generation information highway?

Stephen E Arnold, January 18, 2014

Distraction Addiction: Welcoming Predictive Search Systems

January 9, 2014

The article on Business Insider titled “Here’s How Many Times People Switch Devices In a Single Hour” provides insight into the studies being undertaken by both Google and Facebook into following users from device to device. They need to demonstrate to advertisers that the ad one user saw on his laptop at work later caused him to make a purchase from his smartphone. The article states:

“A new study from the British unit of advertising buyer OMD shows just how massively important this cross-device tracking has become to monitoring a given consumer’s behavior.

In looking at the behavior of 200 Brits during one evening, OMD found that the average person shifted his attention between his smartphone, tablet, and laptop a staggering 21 times in one hour.”

This study’s findings may not come as a huge surprise. An article on Salon titled “How Baby Boomers Screwed Their Kids and Created Millennial Impatience” argues that Generation Y is the most distracted and impatient batch of people yet. The article contends,

“According to a study at Northwestern University, the number of children and young people diagnosed with attention deficit hyperactivity disorder (ADHD) shot up 66 percent between 2000 and 2010. Why the sudden and huge spike in a frontal lobe dysfunction over the course of a decade… What I believe is likely happening, however, is that more young people are developing an addiction to distraction. An entire generation has become addicted to the dopamine-producing effects of text messages, e-mails and other online activities.”

This “addiction to distraction” is often held up by Gen Y’ers as an ability to “multi-task”. But what does it mean to be someone unable to focus? In Buddhism there is the belief that if you are doing more than one focused task, you are not truly alive.

With telework, the workplace is now the world.

We have all succumbed at one time or another to the call of checking our e-mail, Facebook, or Twitter account, but when we are doing it so often that it takes over our concentration, what have our lives become? There is a wide gap between flitting from these exciting distractions and actually gaining some foothold of understanding. And the more we do jump back and forth between tasks, the less likely it becomes that any knowledge is created or stored. The Salon article paints a bleak picture, starting off with the dark Philip Larkin poem “This Be the Verse” (it is hardly “High Windows”) and including this dreary image of the future,
