Embracing Open Source Only To Make Money

February 3, 2013

Open source offers companies many advantages: software tailored specifically to their needs, no licensing fees, and the support of an entire community. IBM was one of the big companies that adopted an open source policy, and others have been following suit. Expert System is another business adding open source, according to the Marketwire article “Expert System Announces Integration With Apache Solr For Enterprise Search.”

Expert System is a semantic software company that provides insights into its clients’ information. For its Cogito semantic platform, Expert System installed Apache Solr, an open source enterprise search platform. The goal is that Apache Solr will give clients more precise search results and access to big data and enterprise content.

“’As more organizations recognize the opportunity presented by their information streams, it is important they understand that there are advanced tools that can improve the performance of their existing enterprise content and search investment,’ said Luca Scagliarini, Vice President of Strategy & Business Development, Expert System. ‘Semantic technology not only excels in making search and information management more accurate, but it also allows organizations to improve the quality of their information for use in the decision making process.’”

Great! Take advantage of open source and use it to deliver a better quality product to customers. That is how open source should be used (as long as Expert System gives something back in return), but there is a problem here. As more companies follow IBM’s open source approach, they seem to be forgetting that IBM is a consulting company and not a software/hardware company. Adopting open source may not build revenue; instead, without the right plan, it will simply create bigger IT headaches.

Whitney Grace, February 3, 2013

Sponsored by ArnoldIT.com, developer of Beyond Search

Search and Innovation

February 1, 2013

I don’t want to rain on the innovation parade. However, another search lawsuit is upon us. “Microsoft Sued over Search-Related Patents” reports that the alleged infringement relates to advertising. In my March 2013 Cebit Promise talk, I comment about the loss of innovation in search. This new legal dust up makes it clear that the focus in search is on the dance among the search system, the user, and the advertiser. In short, innovation is not precision and recall in the manner of dusty equations. Perhaps innovation is the dutiful servant of revenue and legal eagles?

Stephen E Arnold, February 1, 2013

Sponsored by HighGainBlog

Facebook Graph Search No Threat to Google Search

February 1, 2013

Contrary to some early predictions, it looks like Google has nothing to worry about from Facebook’s just-released “graph search” function. The Manila Times reports, “Facebook’s New Search Product Not Threat to Google – Analysts.” The brief write-up reports:

“After Facebook rolled out the friends-based search product on Tuesday, people began thinking about the question of how this new feature could affect Google, the king of search. Facebook CEO Mark Zuckerberg said that ‘graph search’ is different from an all-purpose search engine. His view was agreed by experts, who said that compared with Facebook’s focus on the network of friends, the search function of Google takes a much more holistic approach. Analysts agreed that Facebook’s search tool is unlikely to challenge Google’s leading position in web search at least in the near future.”

The new feature allows users to tap into opinions and recommendations expressed by their “friends” when searching for information. Our own leader, Stephen E. Arnold, has observed that it functions better for some folks than for others, and that the less superficial the search, the less useful it is. Thanks, but no thanks.

If you’re getting a sense of déjà vu, it may be because of similar social-linked moves last year by Microsoft and, yes, Google itself. Microsoft tied recommendations from Foursquare into their Bing results, while Google connected Google+ data with its search (opting out is possible). All three implementations seem like either-love-it-or-hate-it propositions. But, hey, all is well as long as the advertisers are happy.

Cynthia Murrell, February 01, 2013

Sponsored by ArnoldIT.com, developer of Augmentext

Exclusive Interview: Miles Kehoe, LucidWorks

January 30, 2013

Miles Kehoe, formerly a senior manager at Verity and then the founder of New Idea Engineering, joined LucidWorks in late 2012. I worked with Miles on a project and found him a top notch resource for search and the tough technical area which was our concern.

I was able to interview Miles Kehoe on January 25, 2013. He was forthcoming and offered me insights which I found fresh and practical. For example, he told me:

You know I come from a ‘platform neutral’ background, and I know many of the folks involved with ElasticSearch. Their product addresses many of the shortcomings in Solr 3.x, and a year or two ago that would have been a coup. But now, Solr 4 completely addresses those shortcomings, and then some, with SolrCloud and Zoo Keeper. ES says it doesn’t require a pesky ‘schema’ to define fields; and when you’re playing with a product for the first time, that is kind of nice. On the other hand, folks I know who have attempted production projects with ES tell me there’s no way you want to go into production without a schema. Apache Lucene and Solr enjoy a much larger community of developers. If you check the Wikipedia page, you’ll see that Lucene and Solr both list the Apache Software Foundation as the developer; Elastic Search lists a single developer, who it turns out, has made the vast majority of updates to date. While it is based on Apache Lucene, Elastic Search is not an Apache project. Both products support RESTful API usage, but Elastic requires all transactions to use JSON. Solr supports JSON as well, but goes beyond to support transactions in many formats including XML, Java, PHP, CSV and Python. This lets you write applications to interact with Solr in any language and with any protocol you want to use. But the most noticeable difference is that Solr has an awesome Web Based Admin UI, ES doesn’t. If you’re only writing code, you might not care, but the second a project is handed over to an Admin group they’re bound to notice! It makes me smile every time somebody says ES and “ease of use” in the same sentence – you remember the MS DOS prompt back in 1990? Although early adopters enjoyed that “simplicity”, business people preferred mouse-based systems like the Mac and Windows. 
We’re seeing this play out all over again – busy IT people want an admin UI – they don’t want to spend all day at what amounts to a “web command line”, stitching together URLs and JSON commands.
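To make the “stitching together URLs” remark concrete, here is a minimal Python sketch of building Solr select URLs by hand. The host, port, and core name (`localhost:8983`, `collection1`) and the helper `build_select_url` are assumptions for illustration, not details from the interview. `q`, `rows`, and `wt` are standard Solr request parameters, with `wt` choosing the response writer (JSON, XML, CSV, and so on).

```python
from urllib.parse import urlencode

# Hypothetical host and core name; adjust to a real deployment.
SOLR_BASE = "http://localhost:8983/solr/collection1"


def build_select_url(query, response_format="json", rows=10, base=SOLR_BASE):
    """Return a Solr /select URL that asks for the given response writer (wt)."""
    params = {"q": query, "wt": response_format, "rows": rows}
    # urlencode percent-escapes characters such as ':' in the query string.
    return f"{base}/select?{urlencode(params)}"


if __name__ == "__main__":
    # The same query, requested in several of the formats mentioned above.
    for fmt in ("json", "xml", "csv", "python"):
        print(build_select_url("title:lucene", fmt))
```

The sketch only assembles the URLs and does not contact a server. Fetching one of them from a running Solr instance would return the same result set in the requested format.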

I found this comment prescient. I learned about a possible issue triggered by ElasticSearch in “Github Search Exposes Passwords Then Crashes.”

I pressed Mr. Kehoe for key points of differentiation in open source search. I pointed out that every vendor is rushing to embrace open source search. Some do it with lights flashing like IBM and others operate in a lower profile manner like Attivio. He told me:

Just as we have different products and services for our customers, we can customize our engagements to meet our customers’ needs. Some of our customers want to have deep product expertise in-house, and with training, best practice and advisory consulting, and operations/production consulting, we help them come up to speed. We also provide ongoing technical and production support for mission critical applications – just last month an eCommerce site ran into production problems on the Friday afternoon before Christmas. We were able to help them out and have them at full capacity before dinner. Not to dwell on it, but what sets LucidWorks apart is the people. We employ a large number of the team that created and enhances Lucene and Solr including Grant Ingersoll, Steve Rowe and Yonik Seeley. We also have significant expertise on the business side as well. At the top, Paul Doscher grew Exalead from an unknown firm into a major enterprise search player over just a few years; my former business partner Mark Bennett and I have built up deep understanding of search since our Verity days in the early 1990s.

Important information for those analyzing search systems, I believe.

You can read the full text of the interview on the ArnoldIT Search Wizards Speak series at http://goo.gl/31682. Search Wizards Speak is the largest, no cost, freely available collection of interviews with experts in search and content processing. There are more than 60 interviews available. You can find the full series listing at http://www.arnoldit.com/search-wizards-speak/ and http://arnoldit.com/wordpress/wizards-index/.

Stephen E Arnold, January 30, 2013

Sponsored by Dumante.com

Apache Lucene and Solr New Codec

January 30, 2013

Apache Lucene and Solr have announced the new release of version 4.1. Improvements to Solr’s request parsing and support of Internet Explorer are just a few of the new features available. Read about all of the new features and upgrades in The H Open article, “Apache Lucene and Solr Update with New Default Codec.”

The article begins:

“The Apache Lucene project has announced Lucene and Solr 4.1, the latest updates to the Java-based text search library and search platform built around it. Lucene 4.1 has a new default codec “Lucene41Codec” which is based on a previously experimental “Block” indexing format. The new codec includes optimisations around pulsing (where a term only appears in one document) and efficient compressed stored fields to help keep data within the bounds of I/O cache.”

Lucene and Solr serve as the basis for many strong enterprise products. LucidWorks is one company that builds its solutions atop Lucene and Solr, ensuring that they are harnessing the best and most current open source advancements. Check out LucidWorks Big Data and/or LucidWorks Search – both are sure to get even better, benefiting from the improvements in Lucene and Solr’s new codec.

Emily Rae Aldridge, January 30, 2013

Sponsored by ArnoldIT.com, developer of Augmentext

Graph Search Makes Facebook Rival Google

January 30, 2013

Facebook’s search application has never been very strong. Yandex’s Wonder application has spurred Facebook to bump up its search development and launch the new Graph Search. Steve Cheney takes an in-depth look at the new Graph Search in his blog post, “Graph Search’s Dirty Promise And The Con Of The Facebook ‘Like.’” Graph Search is supposed to compete with Google and allow users to search all of the content on their social networks. Cheney says that Graph Search is much weaker than Facebook wants to admit and that most of the data it searches is outdated.

Cheney explains that Facebook has convinced companies that they need to buy fans, meaning “likes” on Facebook. Facebook’s users are not its customers. These companies are, and they have spent 50% of their advertising budget on Facebook campaigns. All of this produces a lot of data and connections, but Cheney argues that it will not meet users’ real needs.

“The truth is Graph Search deserves the exact disclaimer FB gave it… it’s a beta product. Through time, iteration, and effort it can and will be a useful tool for FB power users who are well connected, to find people and to sift through memories. But the fact is we’re living in a web where services are unbundling, and social is unbundling too. You simply can’t roll up recommendations for people, places, and interests into a service that’s one size fits all. “

Of course Graph Search is a beta. It will not decide what you do, only try to influence your decision. Facebook, have you failed in search?

Whitney Grace, January 30, 2013

Sponsored by ArnoldIT.com, developer of Beyond Search

Quote to Note: Craziness about Facebook Search

January 29, 2013

Here’s a quote to note. I don’t want to lose this puppy. I spotted it in the dead tree edition of the New York Times. The location of this notable phrase is the business section, page B 7. The story containing the quote is “Facebook’s Search Had to Go Beyond Robospeak.” The story explains the wonderfulness of Facebook’s beta search system. We love Facebook search. How could the company possibly improve on a graph surfing system which blocks outfits like Yandex from indexing content? No way. Anyway, here’s the quote:

Letting users talk with a computer on their own terms.

Oh, baby. Do I love this type of insightful comment about search and retrieval. I was not aware that I was able to talk with Facebook, but what do I know? Even better, I love the idea of doing the talking on my own terms.

How interesting is this statement about letting users talk with a computer? Beyond interesting. The statement ventures into the fantasyland of every person who watched and confused Star Trek, Star Wars, and Mary Had a Little Lamb.

A keeper.

Stephen E Arnold, January 29, 2013

Check out our sponsor Dumante.com

Oracle Endeca, More Oracle Than Endeca

January 27, 2013

In December 2012, Boston.com reported on the “Endeca exodus.” You can get the scoop by reading “As Endeca Exodus Continues, Trio of Former Employees Start Salsify to Help Manufacturers Distribute Better Product Info.” The content marketing play is not the part of the article which I found interesting. Here’s what I wanted to capture:

Co-founders Steve Papa and Pete Bell have both left Oracle as of this month. Former Endeca SVP Chris Comparato is now at Acquia, the Burlington company that peddles web content management software. Others have left for PayPal Boston, Silver Lining Systems, Sqrrl, Hopper, Internet advertising company DataXu, and Lookout Gaming, a new startup.

Why not cash in and check out? It is tough work making a search system generate revenue and even more difficult to achieve revenues and make a deal to sell the company to a larger firm.

The big outfits who buy search engines have the opportunity to learn first hand exactly how difficult it is to:

  1. Build revenues and turn a profit
  2. Find the resources to keep the software working
  3. Figure out how to market in a way that does not end in a flame out.

With these changes, Endeca is now Oracle. Can Oracle become more like the pre-acquisition Endeca which caught the attention of Oracle in the first place? Worth watching.

Stephen E Arnold, January 27, 2013

If you are interested in gourmet food and spirits, read Gourmet De Ville.

Computer Automation Is Making Researchers Obsolete

January 26, 2013

In archives and libraries around the world, piles of historic documents are sitting gathering dust. One of the problems librarians and archivists have with these documents is that they do not have a way to date them. New algorithms may solve that problem, according to the MIT Technology Review article “The Algorithms That Automatically Date Medieval Manuscripts.” Gelila Tilahun and colleagues from the University of Toronto have created algorithms that use language and common phrases to date the documents. Certain words and expressions can date a document to a specific time period. It sounds easy, but according to the article it is a bit more complex:

“However, the statistical approach is much more rigorous than simply looking for common phrases. Tilahun and co’s computer search looks for patterns in the distribution of words occurring once, twice, three times and so on. “Our goal is to develop algorithms to help automate the process of estimating the dates of undated charters through purely computational means,” they say. This approach reveals various patterns that they then test by attempting to date individual documents in this set. They say the best approach is one known as the maximum prevalence technique. This is a statistical technique that gives a most probable date by comparing the set of words in the document with the distribution in the training set.”
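The general idea of comparing a document’s words with a dated training set can be sketched in a few lines of Python. This is not the maximum prevalence technique from the article. It is a much simpler stand-in that scores each training date by a smoothed word likelihood, and the function names and the `smoothing` parameter are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train(dated_docs):
    """Build per-date word counts from an iterable of (date, text) pairs."""
    counts = defaultdict(Counter)
    for date, text in dated_docs:
        counts[date].update(text.lower().split())
    return counts


def estimate_date(text, counts, smoothing=1.0):
    """Return the training date whose word distribution best fits the text.

    Simplified illustration only: add-one style smoothing over a shared
    vocabulary, then a sum of log probabilities for the document's words.
    """
    vocab = set().union(*counts.values())
    words = text.lower().split()
    best_date, best_score = None, -math.inf
    for date, word_counts in counts.items():
        # Reserve one extra slot so unseen words get a small nonzero probability.
        total = sum(word_counts.values()) + smoothing * (len(vocab) + 1)
        score = sum(math.log((word_counts[w] + smoothing) / total) for w in words)
        if score > best_score:
            best_date, best_score = date, score
    return best_date


if __name__ == "__main__":
    # Toy training set with invented dates, for demonstration only.
    model = train([
        (1150, "sciant presentes et futuri quod ego"),
        (1250, "noverint universi per presentes quod ego"),
    ])
    print(estimate_date("sciant presentes et futuri", model))
```

A real system would need far more training text per period and a way to estimate dates that fall between the training dates, which this sketch does not attempt.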

Tilahun and the team want their algorithms used for more than dating old documents. They can also be used to find forgeries and verify authorship. The dating tool opens many more opportunities to explore history, but the downside is that research is getting more automated. Librarians and scholars may be kicked out and sent to work at Wal-Mart.

Whitney Grace, January 26, 2013

Sponsored by ArnoldIT.com, developer of Beyond Search

Forrester Fills the Gap in Search Market Size Estimates

January 25, 2013

I used to enjoy the search market size estimates of IDC (the time it takes to find info group), Gartner (the magic quad folks), Forrester (yep, the “wave” people), and Ovum (we do it all experts), among others.

I read “Growth of Big Data in Businesses Intensifies Global Demand for Enterprise Search Solutions, Finds Frost & Sullivan” and found several items of interest in the brief news story which arrived via Germany. Is Germany a leader in enterprise search? I heard that 99 percent of Germany’s search means Google. The numerous open source players are not setting the non-German world on fire, but I could be wrong. Check out GoPubMed, for example, an interesting system which has a modest profile.

Now to the size of the search market.

The first thing I noticed was the nod to Big Data, which is certainly the hook on which many dreams for Big Money hang. With enterprise search vendors looking for a way to gain traction in a market which has been caught in awkward positions when licensing and deploying “search,” new words and new Velcro patches are needed. I won’t mention the Hewlett Packard Autonomy matter nor the Fast Search & Transfer matter nor the millions pumped into traditional search vendors with little chance of paying back the investments. No. No. No.

I want to quote this statement from the news story:

The growth of Big Data across verticals presents the enterprise search solutions market with further opportunities. Since newer data types are not confined to a relational database within an organization, solutions that can search information outside the scope of these relational frameworks are widely accepted. Demand for personalized search tools that operate in a pool of unlimited data from internal servers, the Internet, or third-party sources is also growing.

Ah, but how does one crawfish away from exaggeration? Easy. I noted:

However, the disparity between customer expectations and actual search outcomes could dissuade future investments. Customers expect a single query to retrieve the right results immediately. Therefore, search providers must offer timely and relevant results, taking into account the continuous addition of new data to repositories.

But “How big is the market?” my inner child yelps. The answer:

Read more
