Exclusive Interview: Hadley Reynolds, IDC

August 18, 2010

One of the big guns in search and content processing consulting is IDC, a firm to which many senior managers turn when trying to figure out the fast-changing world of digital information. IDC describes itself as:

the premier global provider of market intelligence, advisory services, and events for the information technology, telecommunications, and consumer technology markets. IDC helps IT professionals, business executives, and the investment community make fact-based decisions on technology purchases and business strategy. More than 1000 IDC analysts provide global, regional, and local expertise on technology and industry opportunities and trends in over 110 countries worldwide. For more than 46 years, IDC has provided strategic insights to help our clients achieve their key business objectives. IDC is a subsidiary of IDG, the world’s leading technology media, research, and events company.

I go along with the description because every once in a while I do a project for IDC, and I do like to get paid for my research.


Hadley Reynolds is a featured speaker at the Lucene Revolution Conference, Boston, Mass., October 7 and 8, 2010. Information about the conference is available at http://www.lucenerevolution.com.

Mr. Reynolds focuses on understanding business transformation opportunities through the application of search technologies to traditional business models and the role search innovation is playing in creating new business opportunities. Prior to joining IDC, he guided the Center for Search Innovation at Microsoft Fast. A former Delphi Group consultant, he brings business and technical expertise to bear on his work in search.

I tracked down Hadley Reynolds, one of IDC’s senior consultants in the search practice. We talked at the Starbucks in Framingham, Massachusetts. I have transcribed our talk in the dialogue below:

Hadley, thanks for taking the time to talk with me.

Glad to. I read your blog and I must say that I don’t agree with some of your points.

No problem. The blog is designed to spark discussion, not compete with the work you and your colleagues do at IDC.

We do not consider your comments on the world of search competitive in any way. Go for it.

Okay, let me get right to the point. I noticed that you are on the program for the Lucene Revolution Conference. Why the interest in open source software?

Anyone who has been watching the enterprise search market in recent years has to be impressed with the growth record of Lucene: in technical enhancements, community participation, and acceptance in the commercial market.

Enterprise search is usually a proprietary solution. Open source relies on a community. Is the community angle real or a myth?

When the Lucene project was in its early days, I was highly skeptical about the likelihood of a community of sufficient horsepower coalescing around it and carrying forward Doug Cutting’s foundational work. However, with IBM and smaller commercial players like Attivio, Palantir, i2, Lucid Imagination, and others both using and contributing to the codebase, plus continuing contributions from search “independents”, I think the skepticism about critical mass and future support can be laid to rest.

I see some companies playing what I call the “open source card”. A couple of examples are IBM and Oracle. IBM uses Lucene/Solr in OmniFind 9.1, and Oracle bought Sun Microsystems and, with it, MySQL. Won’t that confuse enterprise customers?

The trend in IT is toward more open source software, not less. First it was accepted that common resources like the Linux operating system and the Apache Web Server and the Mozilla browser were fine alternatives for the enterprise. Then we saw open source databases like MySQL and others, and multiple content management systems like Alfresco, Drupal and many more. So it’s not surprising that we now have successful open source search in Lucene/Solr and that commercial companies who utilize search as a component, and even those that sell search-based applications, would want to take advantage of the strength of these products both for themselves and for their customers.

I am interested in Lucene/Solr but I also track Drupal, Hadoop, and other open source projects. What benefits do you see in open source software?

The benefits of using Lucene/Solr are not hard to enumerate: rapid time-to-market, high-level functionality, flexibility and customization, low entry level cost, positive future outlook for technology evolving with the state of the art.

I agree. But the reason there is proprietary software is to deliver a solution with “one throat to choke.” What are the drawbacks of open source software where no one may be responsible for a code fix, a new function, or a code widget?

The drawbacks include: enhancements on a community timetable only, potentially expensive customization, requirement for advanced skills in-house or near-at-hand, delivered functionality will trail the truly cutting edge search software specialist firms, and system life costs can become significant.

When I think of IDC I think of commercial software. With this open source “revolution” on display in the promotion for the Lucene Revolution Conference and the stepped up marketing of the conference sponsor, Lucid Imagination, something is happening and it may not be the traditional enterprise play. When someone asks you why you don’t use a commercial search solution, what do you tell them?

I believe that for many search problems in the enterprise, the smart approach is to select the tool that gets the job done with the least delay and at the most reasonable investment level. More and more frequently, open source search software is becoming the tool of choice.

Wow, it looks like your phone is going to jump off the table.

Right, I need to get back.

How do people contact you?

Your readers can get me by sending an email to hreynolds at idc dot com.

Stephen E Arnold, August 18, 2010

Sponsored post. Okay?

Exclusive Interview: Erik Arnold, Adhere Solutions

August 9, 2010

How does one consultant interview another? Cautiously. How does a father interview a son? Buy the Diet Coke and provide the questions before flipping on the digital recorder. I spoke with Erik Arnold, managing director of Adhere Solutions in a charming eatery in Chicago with a buzzing neon sign advertising “Free Refills” on Sunday. Today is Monday. Now some readers wonder if I write about my son and get paid for that work. Anyone who has a successful son knows that fathers get to pay. What’s my compensation? When you have a gosling flying circles around your goose pond, you will figure it out.


Erik Arnold, managing director of Adhere Solutions, will be giving a talk about the use of open source search technology for the White House’s USA.gov Web site.

Erik Arnold has over 15 years of experience in the search industry, divided uniquely between both Web and enterprise search. Adhere Solutions is a consulting firm that advises companies on improving their search systems. Prior to Adhere, Erik served as a subject matter expert for a government consulting company where he primarily worked with the House of Representatives and the USA.gov web portal. He started his career at Lycos, one of the first Internet search engines, where he was a product marketing manager. Erik then moved to NBCi search engine (Snap.com) where he served as business development manager.

He will be giving a talk about the impact of open source search on certain US government initiatives at the October 2010 Lucene Revolution Conference.

The full text of the interview appears below:

Here we are again talking about search technology.

That’s right.

For readers who may not know about your company, what’s an Adhere Solutions?

Adhere Solutions offers products and services that help organizations with their search systems. We focus on Google and open source technologies. Adhere Solutions has been a trusted Google Enterprise Partner since 2007, with a client roster of Wal-Mart, Lexis-Nexis, and the Federal Trade Commission among others.

You have worked on the USA.gov and related Federal projects. When did you get into this type of work?

A decade ago. I think I did my first Federal consulting job in 2000 for the Clinton Administration.


Exclusive Interview: Bill McQuaide, Black Duck Software

August 3, 2010

The Lucene Revolution is 10 weeks away. October 7th and 8th, 2010, to be exact. One of the companies participating is Black Duck Software. The company’s Open Source Resource Center http://www.blackducksoftware.com/oss has been of considerable utility to me in my work on the Lucene/Solr centric conference, as has its code search Web site koders.com. Black Duck also offers a free version of its enterprise code search product, Code Sight. The company has a wide range of for-fee software products. If you work with open source, you will find that the firm’s Black Duck Suite may be indispensable to you and your development team for managing and controlling the myriad open source components in your code base.

I wanted to know more about this company, so I contacted Bill McQuaide, Black Duck’s Executive Vice President of Products and Strategy. Mr. McQuaide has 20 years of technology experience and executive leadership spanning engineering, product management, marketing and business development. He comes to Black Duck after spending 10 years with RSA Security during which time the company experienced rapid growth. RSA Security was acquired by EMC Corporation in 2006. He most recently served as a Senior Vice President of the Enterprise Solutions Group and Corporate Development. His experience includes four years at Hewlett-Packard, where he led the Product Management and Channel Development teams for the company’s Technical Systems Division. The full text of our interview appears below:


Bill McQuaide, Black Duck Software’s Executive Vice President of Products and Strategy.

My logo is a goose. Yours is a black duck. What’s Black Duck Software?

There is a story behind the name of the company. It goes back to our founder, Doug Levin, who found and nursed a black duck back to health when he was seven. He raised the duck as a pet. Doug decided to name the company Black Duck Software. Of course black ducks are reputed to be very intelligent animals. They are hard to decoy and very aware of their surroundings.

A goose is definitely less intelligent than a duck.

Maybe, but we see the black duck as a metaphor for how we run the company. We pay attention to details. We are aware of the larger enterprise IT and open source communities in which we operate. And we work smart. We have smart people. We capture information about open source software projects being worked on by other smart people. And we design our products and services for smart companies.

You have an extensive background in commercial software: RSA, HP, and Data General, don’t you?

That’s right.

So why open source? That’s a radical departure, isn’t it?

It was a natural move. We needed to collect data about open source projects for our business. We help development organizations manage and control open source and other externally sourced software, and, of course, we use open source in our products.

Lucene/Solr?

Yes. I think it is pretty clear that open source software is a huge source of innovation. We track more than 250,000 open source projects from thousands of Internet sites, and we maintain a free code search engine, Koders.com, which tens of thousands of developers use every day to search for open source software and other Web-downloadable code.

Koders contains billions of lines of code written in over 30 languages, code that’s available under at least 28 different OSS software licenses. Koders is a powerful tool for developers looking for reusable open source code, methods, examples, algorithms and more. Code re-use makes sense – why re-invent the wheel? And it makes developers more efficient, and lets them focus on the more interesting aspects of software development, the innovation their companies need to succeed.

What have been the payoffs from this business decision?

Our involvement with open source has many benefits. We build better products, our time to market and innovation are improved, and we are more in tune with where the industry is moving – what types of software are leading innovation (mobile, health care information technology, and cloud solutions, for example).

Any special challenges?

There are those who think Black Duck sells fear chiefly because our products help companies identify OSS in their code bases. What we really do is help our customers do the right thing. We believe we do more to help corporate users of OSS comply with the license obligations than any other organization. Most people want to do the right thing, and that’s not about fear, uncertainty, and doubt. It is being pragmatic. Our products ensure customers know what’s in their code, and automate the management of OSS use, so companies can ensure they are complying with the terms of the various open source software licenses in their code. We “design in” compliance making it easier for companies to innovate using open source.

You have a number of high profile clients. Are you able to characterize a typical use case?

Absolutely. How about search?

That would be great.

Search is really about delivering on intent, as one expert has observed. Koders.com is a perfect example of open source search in action. Say a developer is working on an application. He or she needs a compiler, but why write yet another compiler? It’s potentially a waste of time and a distraction away from more critical work. The developer’s intent is to find a compiler that meets a set of criteria. Koders delivers by searching for a compiler and returning results that meet the developer’s needs.

Search also has to be fast, return relevant results, be scalable, easy to use, and have access to many sources of information, in many formats. Lucene was originally developed to search text; the Apache Solr project extends Lucene with faceted search; other OSS search engines search for images, crawl, parse, or search within the firewall (like Black Duck Code Sight).
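For readers who have not seen faceted search in practice, here is a minimal sketch of what a Solr facet request can look like. The parameters `q`, `rows`, `wt`, `facet`, and `facet.field` are standard Solr query parameters; the host, the core name `code`, and the field names `language` and `license` are assumptions made up for illustration and do not describe how Koders or Code Sight is configured.

```python
from urllib.parse import urlencode


def build_facet_query(base_url, text, facet_fields, rows=10):
    """Build a Solr select URL that requests facet counts for each given field."""
    params = [
        ("q", text),          # the user's search terms
        ("rows", str(rows)),  # number of documents to return
        ("facet", "true"),    # switch faceting on
        ("wt", "json"),       # ask for a JSON response
    ]
    # Repeating facet.field asks Solr for one set of counts per field.
    params.extend(("facet.field", field) for field in facet_fields)
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical core and field names, chosen only for this example.
url = build_facet_query(
    "http://localhost:8983/solr/code",
    "compiler",
    ["language", "license"],
)
print(url)
```

Sending this URL to a running Solr instance with those fields defined would return the matching documents plus a count of matches per language and per license, which is what lets a search page offer "narrow by" links.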

I want you to look in your crystal ball for a moment. What do you envision as future uses of open source search at your company going forward?

My crystal ball is usually blurry.

Mine doesn’t work at all. What’s ahead in your opinion?

We have just released Black Duck Code Sight, a search engine that indexes code across an organization’s source code repositories to enable fast search and navigation at enterprise scale. Code Sight (which is available as a free download at http://blackducksoftware.com/code-sight/download or as an enterprise-scale offering) improves developer productivity and software quality, supports standardization, and enhances compliance. Developers can find code quickly and leverage existing and approved components within their code bases whether they are internally written, open source, or come from other external sources.

We think the Black Duck Suite provides the most comprehensive search capability possible for developers. Inside the company they can search all internal repositories and catalogs. Across the Internet, they can search all known OSS projects and code search sites, searching source code and project information (metadata).

Search improves the development process, because developers find what they need fast – they can search the Internet or do an internal search to quickly locate a project or file that contains the desired code; they can debug faster, because search makes it easy to find the same code fragments; they can find security problems like “password=”; and search helps break down information silos. Version proliferation can be controlled, because you can find duplicate bugs in branched code.
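The “password=” example is easy to try at small scale. The sketch below is not how Code Sight works; it is a plain regular-expression scan, assuming that a hard-coded credential looks like `password = "something"`. The function names, the pattern, and the list of file suffixes are all hypothetical choices for illustration.

```python
import re
from pathlib import Path

# Illustrative pattern: the word "password", an equals sign, then a quoted value.
PASSWORD_PATTERN = re.compile(r"password\s*=\s*['\"][^'\"]+['\"]", re.IGNORECASE)


def matching_lines(text):
    """Return (line_number, stripped_line) for each line that matches the pattern."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if PASSWORD_PATTERN.search(line)
    ]


def scan_tree(root, suffixes=(".py", ".java", ".properties")):
    """Yield (path, line_number, line) for suspicious lines in files under root."""
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(encoding="utf-8", errors="ignore")
            for number, line in matching_lines(text):
                yield path, number, line


# Small in-memory demonstration, so nothing on disk is touched.
sample = 'user = "admin"\npassword = "hunter2"\ntimeout = 30\n'
for number, line in matching_lines(sample):
    print(f"line {number}: {line}")
```

Calling `scan_tree(".")` would walk a source directory the same way. A real code search engine indexes the code first, so it answers this kind of query without rereading every file.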

In your opinion, what makes open source a viable option for the enterprise or development world?

Open source is about freedom to use the best code to solve the problem at hand, and the power that comes from having a large community of coders supporting you. For enterprises, OSS means increased development productivity and velocity, and frees developers up to work on projects that are differentiating and valuable. As part of a multi-source development model – where OSS is combined with other externally sourced code and home-grown code – it’s hard to beat on a cost, time-to-solution or innovation level.

One final question, if I may? Where can a reader get more information about your firm?

Check out http://blackducksoftware.com. Make sure to visit the Open Source Resource Center, look into Code Sight, and don’t forget the Koders.com search site.

Stephen E Arnold, August 3, 2010

Freebie

Exclusive Interview: Eric Gries, Lucid Imagination

July 27, 2010

It’s not every day that you find a revolutionary company like Lucid Imagination, one that’s blazing a new trail in the Open Source world and whose CEO describes the firm as being at 90 degrees to the traditional search business model.

Still, that’s the way that Eric Gries refers to Lucene/Solr’s impact on the search and content processing market. “The traditional search industry has not changed much in 30 years. Lucid Imagination’s approach is new, disruptive, and able to deliver high value solutions without the old baggage. We have flipped the old ideas of paying millions and maybe getting a solution that works. We provide the industrial strength software and then provide services that the client needs. The savings are substantial. Maybe we are now taking the right angle?” he asked with a big smile.

This pivot in the market reflects the destabilizing impact of open source search, and the business that Mr. Gries is building at supersonic speeds. “Traditional search is like taking a trip on a horse drawn cart. Lucid Imagination’s approach is quick, agile, and matched to today’s business needs.”

A seasoned executive in software and information management, Mr. Gries uses the phrase to capture his firm’s meteoric rise in the Open Source world and how the success of its Open Source model is giving traditional competitors such as Autonomy, Endeca, and Microsoft Fast indigestion.

Mr. Gries’s background speaks of the right pedigree for a professional who is at the helm of a successful startup.

He got his start at Cullinet Software. “I started in the computer sciences and joined my first company as part of the development team. Cullinet Software was an early leader; databases were young, and relational databases, made famous by Larry Ellison, were just getting out of the gate,” he recently told Beyond Search. After he got his MBA, he moved more into the business side, among other things, building the Network System Management Division at Compuware. He’s brought solid credentials in software services from his experience to the new venture at Lucid Imagination, a start up with substantial venture backing.


Eric Gries, the mastermind of the Lucene Revolution. Source: Lucid Imagination, www.lucidimagination.com

He was first attracted to search and data and the relevant issues there. The lure of Open Source came later.

“The thing that attracted me to Open Source at first was the fact that search was really growing in leaps and bounds,” he said, and he’s understandably proud of what the company has been able to accomplish so far.

“Lucene/Solr is software that is as good or better than most of the other commercial offerings in terms of scalability, relevance and performance.”

He talked recently about how it was important to him to put together the right kind of advisory guidance, drawing on people with real world experience in the technology and business of Open Source.

“I was new to the space, so very early on I put together a very strong advisory board of Open Source luminaries that were very helpful.”

Lucid Imagination, of which Gries is President and CEO, was launched in 2009 and is only in its second year of operation. Lucid closed millions of dollars worth of business in its first year. The recipe for success includes a deep level of involvement and collaboration with the community outside Lucid, and ensuring the technology gets the right kind of attention in terms of vital needs like quality and flexibility that drive the appetite of organizations for search technology.

The value of the business is about search, not open source. The company is riding on the trends of search and Open Source, which Mr. Gries says is being accepted more and more as a mainstay of the enterprise.

The establishment has taken notice as well, with companies that understand the value of trailblazers like Red Hat (“opening doors” for Lucene/Solr, according to Mr. Gries) in turn helping them to establish themselves as a second generation supplier of Open Source technology solutions.

Mr. Gries’s enthusiasm for his new type of business model is infectious and he enjoys pointing out the pride and dedication that goes into the work that gets done at Lucid, located in the heart of Silicon Valley.

“We added low cost to the metrics of scalability, relevance and performance so there’s really no good reason to use any commercial software with all due respect,” he added.

One of the more interesting aspects of Lucid is the fact that the firm has received $16 million in venture funding and is already getting an impressive list of clients on their roster that includes names like LinkedIn, Cisco, and Zappos, now a unit of the giant Amazon.

It’s clear to Mr. Gries that Open Source has been able to displace some commercial search solutions, and for him the reason is simple: the software is downloaded at the blistering pace of thousands of units a day.

“Now the software is good and industry sees that there is a commercial entity committed to working with them, we want the enterprise to see they can work with Open Source,” he noted.

Still, while there is what’s been described as considerable momentum among some developers for this technology, some senior information technology managers and some purchasing professionals are less familiar with Open Source software and Lucene/Solr.

That’s where Mr. Gries understands the need to get the word out on the firm. He has learned that the education of the market is critical and hopes to build on the successes that Lucid Imagination achieved with sponsoring a developer conference in Prague earlier this year. It was so successful that another, the Lucene Revolution, is planned for Boston in October.

Mr. Gries prides himself on the fact that the products created are all about a fresh business model with no distance between the developer and user. He’s proud of the success that the Lucene/Solr technology and community, along with his company, have enjoyed so far and likes to point out in his own way that one of their biggest goals beyond added value is increasing their market exposure for what he calls this second generation Open Source.

“We are at 90 degrees to the typical search business model. We’re disruptive. We are making the competition explain a business model that is not matched to today’s financial realities. The handcuffs of traditional software licenses won’t fit companies that need agility and high value solutions,” he said. “The software is already out there and running mission critical solutions. One of our tasks is to make sure people understand what is available now, and the payoffs available right now.”

He points to the fact that success came so early for Lucene/Solr that the company has just put its 24/7 customer service in place.

Open source means leveraging a community. Lucid combines the benefits of open source software with exceptional support and service. For more information about the company, its Web site is at www.lucidimagination.com.

Stephen E Arnold, July 27, 2010

I have been promised a free admission to the Lucene Revolution in October 2010.

Exclusive Interview: Mike Horowitz, Fetch Technologies

July 20, 2010

Savvy content processing vendors have found business opportunities where others did not. One example is Fetch Technologies, based in El Segundo, California. The company was founded by professors at the University of Southern California’s Information Sciences Institute. Since the firm’s doors opened in the late 1990s, Fetch has developed a solid clientele and a reputation for cracking some of the most challenging problems in information processing. You can read an in-depth explanation of the Fetch system in the Search Wizards Speak’s interview with Mike Horowitz.

The Fetch solution uses artificial intelligence and machine learning to intelligently navigate and extract specific data from user specified Web sites. Users create “Web agents” that accurately and precisely extract specific data from Web pages. Fetch agents are unique in that they can navigate through form fields on Web sites, allowing access to data in the Deep Web, which search engines generally miss.

You can learn more about the company and its capabilities in an exclusive interview with Mike Horowitz, Fetch’s chief product officer. Mr. Horowitz joined Fetch after a stint at Google.

In the lengthy discussion with Mr. Horowitz, he told me about the firm’s product line up:

Fetch currently offers Fetch Live Access as an enterprise software solution or as a fully hosted SaaS option. All of our clients have one thing in common, and that is their awareness of data opportunities on the Web. The Internet is a growing source of business-critical information, with data embedded in millions of different Web sites – product information and prices, people data, news, blogs, events, and more – being published each minute. Fetch technology allows organizations to access this dynamic data source by connecting directly to Web sites and extracting the precise data they need, turning Web sites into data sources.

The company’s systems and methods make use of proprietary numerical recipes. Licensees, however, can program the Fetch system using the firm’s innovative drag-and-drop programming tools. One of the interesting insights Mr. Horowitz gave me is that Fetch’s technology can be configured and deployed quickly. This agility is one reason why the firm has such a strong following in the business and military intelligence markets.

He said:

Fetch allows users to access the data they need for reports, mashups, competitive insight, whatever. The exponential growth of the Internet has produced a near-limitless set of raw and constantly changing data, on almost any subject, but the lack of consistent markup and data access has limited its availability and effectiveness. The rise of data APIs and the success of Google Maps have shown that there is an insatiable appetite for the recombination and usage of this data, but we are only at the early stages of this trend.

The interview provides useful insights into Fetch and includes Mr. Horowitz’s views about the major trends in information retrieval for the last half of 2010 and early 2011.

Now, go Fetch.

Stephen E Arnold, July 20, 2010

Freebie. I wanted money, but Mr. Horowitz provided exclusive screen shots for my Special Libraries Association lecture in June and then my briefings in Madrid for the Department of State. Sigh. No dough, but I learned a lot.

Lucene Revolution Preview: Otis Gospodnetic, Sematext

July 13, 2010

The Lucene Revolution Conference is shaping up. Among the presenters are open source developers representing a wide range of organizations. One of the speakers is Otis Gospodnetic, Sematext’s founder. Mr. Gospodnetic is also the author of Lucene in Action with co-authors Erik Hatcher and Michael McCandless. His firm implements open source search, natural language processing, and text analytics technology in the enterprise. His team focuses on the design and development of scalable, high-performance search and solutions.

I spoke with Mr. Gospodnetic earlier this week. Here are the highlights of our conversation:

Why are you interested in Lucene/Solr?

I’ve always been interested in information gathering, information extraction, search, and related areas. I think that’s because I feel that information gathering, extraction, and searching are precursors for gaining knowledge, and knowledge has always been a hobby of mine. If I look back at all my professional experience, everything I ever built had a strong search component. This is why I was happy when I stumbled upon Lucene around 2000 and why I immediately joined the project, even before it was an Apache project, and why I’ve been using Lucene ever since.

What is your take on the community aspect of Lucene/Solr?

The community around Lucene and Solr is as real and as alive and active as it can be. It’s very knowledgeable and quick to help. I’ve been a part of it for around 10 years now, and have witnessed the community grow, as well as its knowledge breadth and depth increase.

When it comes to the Lucene/Solr community, the quote I like to give comes from the former Netflix search guy:

I posted, went to get a sandwich, and came back to see two answers. The change works, and I can get the fix into production today. This list is magic.

Both user and development communities are so strong and active that it’s becoming really hard for people to keep up with the volume of output these communities produce. Earlier this year we started publishing monthly Lucene and Solr Digest blog posts. These posts are for people who want to keep up with (or keep an eye on) Lucene and Solr, but don’t have the time to read some 60+ non-trivial-to-read email messages these communities produce every day. See http://blog.sematext.com/ or http://twitter.com/sematext. I hope we are not going through the trouble of getting this published every month just because of some mythical community!

Commercial companies are playing what I call the “open source card.” Won’t that confuse people?

Judging from the demand, I’d say this is not confusing to people. On the contrary, I get the feeling they like the open-source/commercial blend. Plus, there is precedent – commercial support for open-source software has been around for many years now: MySQL and Red Hat have been doing this for years. Not only is this not confusing, it is welcomed. Some people and organizations love and can rely on the community support. Others prefer paid support. At Sematext we do both – some of us participate on Lucene/Solr mailing lists helping as much as we can via that channel. We also publish the already mentioned monthly Lucene and Solr Digest, which summarizes the new and interesting developments from those two projects, and we offer paid tech support and other types of services for Lucene, Solr, Hadoop, and other related technologies.

What are the primary benefits of using Lucene/Solr?

Let me highlight the points my work has driven home as pivotal.

First, there is the notion of TCO or total cost of ownership. TCO is *much* lower. There are no license fees, no limitations on the index size, query rates, number of servers, etc.

Second, Lucene/Solr offer flexibility. If you don’t like how something works in Lucene/Solr, you can change it today and deploy it tomorrow. If your use case is good, the community will adopt it and you won’t have to maintain your customized, forked Lucene/Solr version.

Third, quality. Lucene and Solr are mature. They’ve been worked on by many smart people 24/7 around the world for more than 10 years. These people work on Lucene/Solr because that is their passion, not because they are paid to do so, except for the lucky few who also get paid to work on what they love. Lucene and Solr can do a lot – they have lots of features, they are reliable, they are still being worked on and are improved on a daily basis.

And, finally, agility: You need search? You can have something working today. You don’t have to go through budget approvals, through long sales and negotiation cycles, you don’t have to go through wine and dine dates that just create delays that ultimately increase your costs.

When someone asks you why you don’t use a commercial search solution, what do you tell them?

I tell them to wake up. It’s 2010. There are alternatives. Cheaper. Faster. Better. I tell them to read the answers to the previous questions. When I see how much some (all?) of the commercial search solutions cost and I compare that to what we at Sematext can do for a customer for that sort of money… I recently happened to see a quote from one well-known commercial search vendor and my jaw dropped. Well, not really, because I know they charge an arm and a leg for their software, but when you think about how many kids you can put through college for that kind of money.

Let me also quote something that came up recently in a thread titled “Arguments in Favor of Lucene over Commercial Competition”.

In my initial foray into Lucene several years ago, by the time I’d sent a support request to the vendor of a commercial product and received an answer telling me that I hadn’t included the correct license info and I’d have to provide it before they could talk to me, I’d found Lucene, downloaded it, indexed some of our data and run searches against it. Not to mention that rather than waiting for days to get a response from the commercial vendor, my questions on the Lucene user’s list were answered within a very few hours. With grace and tolerance for my ignorance.

How do people reach you?

Sematext is at http://sematext.com/ and that is the best way to reach the professional me. Our blog and the Digest posts mentioned earlier are at http://blog.sematext.com/. We are also at http://twitter.com/sematext if you prefer us in 140 char bites.

Will you elaborate on these points in your Lucene Revolution lecture?

Absolutely. Looking forward to the conference and hearing the great speakers. I understand Cisco is giving a talk too.

Stephen E Arnold, July 13, 2010

Post sponsored by Lucid Imagination and the Lucene Revolution Conference.

Podcast Interview with Paul Doscher, Part 2: The Exalead Technology

June 21, 2010

Exalead’s Paul Doscher talks about Exalead’s technology on the June 21, 2010, ArnoldIT Beyond Search podcast. Exalead has been growing rapidly, landing blue-chip accounts with the largest technology company in North America, the French postal service, and Canada’s Urbanizer.com. In this podcast, Mr. Doscher talks about Exalead’s technical approach to content processing and the framework that makes search-based applications crack tough problems in information access. You can listen to the podcast on the ArnoldIT.com Web site. More information about Exalead is available from www.exalead.com. The ArnoldIT podcast series extends the Search Wizards Speak series of interviews beyond text into rich media. Watch this blog for announcements about other rich media programs from the professionals who move information retrieval beyond search.

Stephen E Arnold, June 21, 2010

Sponsored by Stephen E. Arnold

Exclusive Interview with Seth Grimes, Alta Plana Now Available

June 9, 2010

In April 2010, I spoke with Seth Grimes, the founder of Alta Plana. Mr. Grimes is an analytics strategy consultant. He is founding chair of the Sentiment Analysis Symposium and of the Text Analytics Summit, contributing editor at Intelligent Enterprise magazine, and text analytics channel expert at the Business Intelligence Network. He founded Washington DC-based Alta Plana Corporation in 1997. Mr. Grimes consults, writes, and speaks on information-systems strategy, data management and analysis systems, industry trends, and emerging analytical technologies.

In the interview, he highlighted one of the challenges search and content processing systems face. He said:

I’ve in the past characterized search as evidence of a failure of design. If information were correctly and adequately categorized and organized and made accessible, we wouldn’t need search, would we?  I’ve retreated from that view as I’ve seen search evolve into information access, into technology that not only finds but also organizes results from sources the user likely-as-not didn’t know about.  Yet I’d call my statement still largely true when it comes to the enterprise’s own data holdings: Search is necessitated by a failure of design.  Do a better job organizing information as it’s created or acquired, and also, by the way, stop allowing application vendors to bring in siloed search applications, and the in-organization situation will improve.

To read the full interview, navigate to Search Wizards Speak and click on the Seth Grimes interview or click this link.

Stephen E Arnold, June 9, 2010

Unsponsored post.

The Lucene Revolution: An Interview with Michael Bohlig of Lucid Imagination

June 3, 2010

Editor’s Note: Lucid Imagination held a Lucene / Solr conference in Prague in May 2010. More than 150 developers and business professionals attended the event. Beyond Search spoke with Michael Bohlig, a senior Lucid Imagination executive, about the conference, open source search, and the company’s plans to expand the “Lucene Revolution.”

What was the impetus for the Lucene Solr conference in Prague?

A number of factors led us to host the conference. We were seeing a big increase in interest in Lucene/Solr in Europe. As in other parts of the world, European companies have been looking for alternatives to expensive and inflexible proprietary legacy search solutions. Europe has been a driving force in IT innovation and open source adoption. Finally, there is a large and active Lucene/Solr community in Europe and we wanted to help provide a venue to bring together developers, users, and the ecosystem.

Can you give me some background for the conference? What was the focus of the event?

We offered two days of in-depth training, followed by a two-day conference with a variety of keynotes, technical sessions, tutorials, and lightning talks. The conference was run on a not-for-profit basis with net proceeds contributed to the Apache Software Foundation. The focus for the conference was the disruptive power of open source enterprise search, and how companies are using Lucene/Solr search to develop new innovative ways to tap into big, and increasingly diverse, data.

How many people attended?

More than 160 people attended from all over the world – we had attendees from almost every country in Europe plus Japan and the US.

What were the topics that generated the most buzz at the conference?

Stephen Dunn of The Guardian did a great talk on how they are using Solr in their Open Platform content sharing service – and how they are working with Lucid Imagination to deploy this new way to unlock their content database and create an innovative business model. Zack Urlocker spoke about the disruptive power of open source and how it is changing the entire enterprise software space – including enterprise search. He used his experience with MySQL as a model, but also looked at a variety of other cases and markets, showing how big incumbent vendors get bogged down and find it difficult to innovate, while new open source players can change the game, come up with innovative new solutions, and go after under-served markets. He pointed out that the next wave of Web, cloud, and SaaS players have been based on open source, and that in the future making sense of big data will be a killer app. “Solr In The Cloud” by Mark Miller also generated a lot of buzz, describing how Solr’s current and future features will ease the deployment of Solr into large-scale, distributed environments.

This sounds like Kool Aid drinking. What were the substantive business payoffs from the use of open source search technology?

Glad to. One good example is The Guardian harnessing content in a new way to re-invent their business model, generate new sources of revenue, and challenge other newspaper and media companies with an open approach. Another is Nordjyske Medier – a Danish media company – that got a big payoff from Solr when their online Yellow and White Pages failed with Google Search Appliance and Microsoft SQL Server.

Can you give me a use case where open source search has bested a commercial search solution such as Autonomy’s or Endeca’s?

There are many examples. Just at the conference we heard from Nordjyske Medier moving off of GSA.

The Guardian was formerly an Endeca user. Also, in one of the sessions at the conference (all are posted at: http://lucene-eurocon.org/agenda.html by the way), Jan Høydahl from Cominvent talked about how to migrate from Microsoft FAST to Solr – and indicated he personally already moved four customers off of FAST to Solr. Of course there are many more cases in the US. One example is Dollar Days, a discount e-tailer which moved off of Endeca to Solr. You can check out the case study on our Web site.

Inside Search: Raymond Bentinck of Exalead, Part 2

February 4, 2010

This is the second part of the interview with Raymond Bentinck of Exalead.

Isn’t this bad marketing?

No. This makes business sense. Traditional search vendors who may claim to have thousands of customers tend to use only a handful of well managed references. This is a direct result of customers choosing technology based on these overblown marketing claims and these claims then driving requirements that the vendor’s consultants struggle to deliver. The customer, who is then far from happy with the results, doesn’t do reference calls and ultimately becomes disillusioned with search in general or with the vendor specifically. Either way, they end up moving to an alternative.

I see this all the time with our clients that have replaced their legacy search solution with Exalead. When we started, we were met with much skepticism from clients that we could answer their information retrieval problems. It was only after doing proofs of concept and delivering the solutions that they became convinced. Now that our reputation has grown, organizations realize that we do not make unsubstantiated claims and do stick by our promises.

What about the shift to hybrid solutions? An appliance or an on premises server, then a cloud component, and maybe some fairy dust thrown in to handle the security issues?

There is a major change that is happening within Information Technology at the moment driven primarily by the demands placed on IT by the business. Businesses want to vastly reduce the operational cost models of IT provision while pushing IT to be far more agile in their support of the business. Against this backdrop, information volumes continue to grow exponentially.

The push towards areas such as virtual servers and cloud computing is one aspect of reducing the operational cost models of information technology provision. It is fundamental that software solutions can operate in these environments. It is surprising, however, to find that many traditional search vendors’ solutions do not even work in a virtual server environment.

Isn’t this approach going to add costs to an Exalead installation?

No, because another aspect of this is that software solutions need to be designed to make the best use of available hardware resources. When Exalead provided a solution to the leading classified ads site Fish4.co.uk, unlike the legacy search solution we replaced, not only were we able to deploy a solution that met and exceeded their requirements but we reduced the cost of search to the business by 250 percent. A large part of this was around the massively reduced hardware costs associated with the solution.

What about making changes and responding quickly? Many search vendors simply impose a six month or nine month cycle on a deployment. The client wants to move quickly, but the vendor cannot work quickly.

Agility is another key factor. In the past, an organization may implement a data warehouse. This would take around 12 to 18 months to deploy and would cost a huge amount in hardware, software and consultancy fees. As part of the deployment the consultants needed to second guess the questions the business would want to ask of the data warehouse and design these into the system. After the 12 to 18 months, the business would start using the data warehouse and then find out they needed to ask different types of questions than were designed into the system. The data warehouse would then go through a phase of redevelopment which would last many more months. The business would evolve… making more changes and the cycle would go on and on.

With Exalead, we are able to deploy the same solution in a couple months but significantly there is no need to second guess the questions that the business would want to ask and design them into the system.

This is the sort of agile solution that businesses have been pushing their IT departments to deliver for years. Businesses that do not provide agile IT solutions will fall behind their competitors and be unable to react quickly enough when the market changes.

One of the large UK search vendors has dozens of niche versions of its product. How can that company keep each of these specialty products up to date and working? Integration is often the big problem, is it not?

The founders of Exalead took two years before starting the company to research what worked in search and why the existing search vendors’ products were so complex. This research led them to understand that the search products that were on the marketplace at the time all started as quite simple products designed to work on relatively low volumes of information and with very limited functional capabilities. Over the years, new functionality has been added to the solutions to keep abreast of what competitors have offered but because of how the products were originally engineered they have not been clean integrations. They did not start out with this intention but search has evolved in ways never imagined at the time these solutions were originally engineered.

Wasn’t one of the key architects part of the famous AltaVista.com team?

Yes. In fact, both of the founders of Exalead were from this team.

What kind of issues occur with these overly complex products?

As you know, this has caused many issues for both vendors and clients. Changes in one part of the solution can cause unwanted side effects in another part. Trying to track down issues and bugs can take a huge amount of time and expense. This is a major factor as to why we see the legacy search products on the market today that are complex, expensive and take many months if not years to deploy even for simple requirements.

Exalead learned from these lessons when engineering our solution. We have an architecture that is fully object-orientated at the core and follows an SOA architecture. It means that we can swap in and out new modules without messy integrations. We can also take core modules such as connectors to repositories and instead of having to re-write them to meet specific requirements we can override various capabilities in the classes. This means that the majority of the code that has gone through our quality-management systems remains the same. If an issue is identified in the code, it is a simple task to locate the problem and this issue is isolated in one area of the code base. In the past, vendors have had to rewrite core components like connectors to meet customers’ requirements and this has caused huge quality and support issues for both the customer and the vendor.

What about integration? That’s a killer for many vendors in my experience.

The added advantage of this core engineering work means that for Exalead integration is a simple task. For example, building new secure connectors to new repositories can be performed in weeks rather than months. Our engineers can take this time saved to spend on adding new and innovative capabilities into the solution rather than spending time worrying about how to integrate a new function without affecting the 1001 other overlaying functions.

Without this model, legacy vendors have to continually provide point-solutions to problems that tend to be customer-specific leading to a very expensive support headache as core engineering changes take too long and are too hard to deploy.

I heard about a large firm in the US that has invested significant sums in retooling Lucene. The solution has been described on the firm’s Web site, but I don’t see how that engineering cost is offset by the time to market that the fix required. Do you see open source as a problem or a solution?

I do not wake up in the middle of the night worrying about Lucene if that is what you are thinking! I see Lucene used in places that typically have large engineering teams to protect or by consultants more interested in making lots of fees through its complex integration. Neither of these adds value to the company in, for example, reducing costs or increasing revenue.

Organizations that are interested in providing cost effective richly functional solutions are in increasing numbers choosing solutions like Exalead. For example, The University of Sunderland wanted to replace their Google Search Appliance with a richer, more functional search tool. They looked at the marketplace and chose Exalead for searching their external site, their internal document repositories plus providing business intelligence solutions over their database applications such as student attendance records. The search on their website was developed in a single day including the integration to their existing user interface and the faceted navigation capabilities. This represented not only an exceptionally quick implementation, far in excess of any other solution on the marketplace today but it also delivered for them the lowest total cost of ownership compared to other vendors and of course open-source.

In my opinion, Lucene and other open-source offerings can offer a solution for some organizations but many jump on this bandwagon without fully appreciating the differences between the open source solution and the commercially available solutions either in terms of capability or total cost. It is assumed, wrongly in many instances, that the total cost of ownership for open source must be lower than the commercially available solutions. I would suggest that all too often, open source search is adopted by those who believe the consultants who say that search is a simple commodity problem.

What about the commercial enterprise that has had several search systems and none of them capable of delivering satisfactory solutions? What’s the cause of this? The vendors? The client’s approach?

I think the problem lies more with the vendors of the legacy search solutions than with the clients. Vendors have believed their own marketing messages and, when customers are unsatisfied with the results, have tended to blame the customers for not understanding how to deploy the product correctly or, in some cases, the third-party or system integrator responsible for the deployment.

One client of ours told me recently that with our solution they were able to deliver in a couple months what they failed to do with another leading search solution for seven years. This is pretty much the experience of every customer where we have replaced an existing search solution. In fact, every organization that I have worked with that has performed an in-depth analysis and comparison of our technology against any search solution has chosen Exalead.

In many ways, I see our solution as not only delivering on our promises but also delivering on the marketing messages that our competitors have been promoting for years but failing to deliver in reality.

So where does Exalead fit? The last demo I received showed me search working within a very large, global business process. The information just appeared? Is this where search is heading?

In the year 2000, and every year since, a CEO of one of the leading legacy search vendors made a claim that every major organization would be using their brand of meaning based search technology within two years.

I will not be as bold as him but it is my belief that in less than five years’ time the majority of organizations will be using search based applications in mission critical applications.

For too long software vendors have been trying to convince organizations, for example, that it was not possible to deploy mission critical solutions such as a 360 degree customer view, Master Data Management, Data Warehousing or business intelligence solutions in a couple months, with no user training, with up-to-the-minute information, with user friendly interfaces, with a low cost per query covering millions or billions of records of information.

With Exalead this is possible and we have proven it in some of the world’s largest companies.

How does this change the present understanding of search, which in my opinion is often quite shallow?

Two things are required to change the status quo.

Firstly, a disruptive technology is required that can deliver on these requirements and secondly businesses need to demand new methods of meeting ever greater business requirements on information.

Today I see both these things in place. Exalead has proven that our solutions can meet the most demanding of mission critical requirements in an agile way and now IT departments are realizing that they cannot support their businesses moving forward by using traditional technologies.

What do you see as the trends in enterprise search for 2010?

Last year was a turning point around Search Based Applications. With the world-wide economy in recession, many companies have put projects on hold until things were looking better. With economies still looking rather weak but projects not being able to be left on ice for ever, they are starting to question the value of utilizing expensive, time consuming and rigid technologies to deliver these projects.

Search is a game changing technology that can deliver more innovative, agile and cheaper solutions than using traditional technologies. Exalead is there to deliver on this promise.

Search, a commodity solution? No.

Editor’s note: You can learn more about Exalead’s search-enabled applications technology and method at the Exalead Web site.

Stephen E Arnold, February 4, 2010

I wrote this post without any compensation. However, Mr. Bentinck, who lives in a far off land, offered to buy me haggis, and I refused this tasty bribe. Ah, lungs! I will report the lack of payment to the National Institutes of Health, an outfit concerned about alveoli.
