Hlava Interview: New Medical Integrity Services

September 15, 2011

On September 14, 2011, I interviewed Margie Hlava, one of the world’s leading experts in content indexing, taxonomies, and controlled vocabularies. In our third interview, I questioned Ms. Hlava, founder of Access Innovations, one of the key service and software firms in the information retrieval sector. She was drawn to indexing through her undergraduate work in biology and its classification methods.

In this podcast, Ms. Hlava discusses:

  • How automated systems like MAI and other Access Innovations’ systems detect new terminology, entities, and synonyms
  • The challenge of high-volume indexing and tagging of medical documents related to billing. The indexing makes it easier to identify improper or erroneous payments as part of the US government’s “medical integrity programs”
  • The close relationship between indexing and content processing cost control. Flawed indexing contributes to search inefficiency. Governance projects, which are really thinly disguised reworking of indexing and tagging, often cost significantly more than a process that makes indexing part of the pre-planning and pre-launch work plan.

The audio program is available without charge on the ArnoldIT.com Web site at this link. The program was recorded on September 14, 2011. More information about Access Innovations is available at the firm’s Web site. Be sure to take a look at Taxodiary, the company’s highly regarded Web log about indexing and taxonomy related topics.

Stephen E Arnold, September 15, 2011

Sponsored by Pandia.com, publishers of The New Landscape of Enterprise Search

PolySpot Secures Investment

September 15, 2011

PolySpot, based in Paris, is a specialist enterprise search vendor. I learned of the two million euro infusion of capital from the announcement “PolySpot Raises €2 Million from Newfund to Boost its Growth.” The write up said:

PolySpot’s enterprise search solutions are the result of over 10 years of research and development, featuring a highly innovative architecture and technologies (including Apache/Lucene indexing library and Apache/Solr search service). As a result, PolySpot is able to offer open, functionally rich and highly competitive solutions that are used by major accounts across all sectors, including BNP-Paribas, Crédit Agricole SA, Bureau-Veritas, Veolia, Vinci or Schlumberger. Building on this success, PolySpot is now using this fund-raising round to move to the next level and boost its international business in the coming months. This fund-raising is part of an intensive work that began with the arrival of Gilles André (the figure behind success stories such as Leonard’s Logic and Augure) as the company’s new Managing Director and the appointment of David Fischer as head of R&D in 2010.

If you are not familiar with the firm, PolySpot is pushing forward with integrated information solutions. Founded in 2001, PolySpot designs and sells search and information access solutions intended to improve business efficiency in an environment where data volumes are increasing at an exponential rate. PolySpot’s solutions offer built-in connectivity to most file types and file systems. The firm’s offerings are based on an innovative infrastructure delivering both versatility and high performance at a competitive price point. The firm’s solutions are in use at such organizations as Allianz, BNP Paribas, Bureau Veritas, Crédit Agricole, OSEO, Schlumberger, Veolia, Trinity Mirror, and Vinci.

For more information, navigate to the firm’s Web site at www.polyspot.com. You can get additional information about the firm’s capabilities in this interview with Olivier Lefassy, one of the firm’s senior executives.

Stephen E Arnold, September 15, 2011

Sponsored by Pandia.com

Open Source Faceted Search Solutions

September 12, 2011

There is little doubt that if you are handy with the bits and bytes, you can get quite a mileage boost from open source search technology.

Stephanie Lemieux shares her experience in “Open-Source Options for Faceted Search for the Budget Conscious User” at CMS Wire. Ms. Lemieux was helping an organization improve its online document search, but the group couldn’t afford any of the big name faceted search solutions. Her research into alternatives produced some open source options.

First on the list is Apache Solr. Its simple faceting toolkit is good only for basic searches, Lemieux says, but can be expanded by a “savvy” IT department or consultants.

Sphinx is the next entry, which she points out is well suited for indexing database content, scales well, and boasts real-time indexing. However, the facet support takes some effort to implement.

Then there’s Drupal, which is a content management system. Lemieux recommends this one only if you’re starting a website from scratch. A search API module enables faceting and can be applied to various backend engines, including the aforementioned Solr.
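Whichever engine you pick, the core of faceting is the same: count the matching documents under each value of a metadata field so users can filter the result set. A minimal Python sketch of that idea (the documents and field names are illustrative, not any engine’s schema):

```python
from collections import Counter

def facet_counts(docs, field):
    """Count how many documents carry each value of a metadata field."""
    return Counter(doc[field] for doc in docs if field in doc)

docs = [
    {"title": "Q3 report", "type": "pdf", "dept": "finance"},
    {"title": "Onboarding guide", "type": "doc", "dept": "hr"},
    {"title": "Q4 report", "type": "pdf", "dept": "finance"},
]

facet_counts(docs, "type")  # counts per file type
facet_counts(docs, "dept")  # counts per department
```

Solr and Sphinx compute these counts against an inverted index rather than scanning documents, which is what makes them fast at scale, but the user-facing result is the same table of value-to-count pairs.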

Lemieux cautions:

Keep in mind that open-source does not automatically equal easy or cheap: if you have complex requirements and end up hiring integrators to implement your solution, you still might end up with a somewhat pricey project. But open-source solutions do have large communities behind them, so development can be faster and less expensive than vendor professional services.

She closes with an emphatic reminder that faceted search is only as good as the taxonomy and metadata that supports it. That’s a point that’s tempting to overlook, but crucial. Keep in mind that if your personal open source search guru heads for greener commercial pastures, you will have to pay to get the expertise to keep your open source search system singing a happy tune.

Cynthia Murrell, September 12, 2011

Sponsored by Pandia.com, publishers of The New Landscape of Enterprise Search

OpenText: From Search to Teaching. Is the Deal about Selling Services?

September 8, 2011

Another data management, search, collaboration vendor does the “we are in a new business” quick step. Searching with the Stars could be a TV sensation because there are more twists, dips, and slides in the search and content ballroom than in an Arthur Murray Advanced Beginners’ class.

Navigate to “Open Text acquires Peterborough’s Operitel”. The news is that one Canadian firm snapped up another Canadian outfit. What makes this interesting is that I was able to see some weak-force synergy between Nstein (sort of indexing and sort of data management) and OpenText, owner of lots of search, content processing, and collaboration stuff plus an SGML database and the BASIS system. But the Operitel buy has me doing some speculative thinking.

Here’s the passage which caught my attention:

Operitel’s flagship LearnFlex product is built on Microsoft Corp.’s .NET platform and is a top tier e-learning reseller for the Windows maker. Open Text also has a long standing partnership with Redmond, Wash.-based Microsoft.

I see more Microsoft credibility and a different way to sell services. OpenText strikes me as a company with a loosely or mostly non integrated line up of products. The future looks to be charging into the SharePoint sector, riding a horse called “eLearning.”

In today’s business climate, organic growth seems to be tough to achieve even with RedDot and a fruit basket filled with other technologies. (What happened to OpenText’s collaboration product? What happened to the legal workflow business? I just don’t know.) So how does a company which some Canadians at Industry Canada see as one of the country’s most important software companies grow? Here’s the answer:

Open Text’s growth-by-acquisition strategy has recently won accolades among the analyst community. The company purchased Maryland-based Metastorm Inc. for US$182-million, Texas-based Global 360 Holding Corp. for US$260-million and U.K.-based WeComm Ltd. for an undisclosed amount all in the past six months.

My hunch is that OpenText may want to find a buyer. Acquisitions seem to be a heck of a lot easier to complete than landing a major new account. I am not the only person thinking that the business of OpenText is cashing out. Point your browser at “Amid Takeover Fever, Open Text Looks Like a Bargain.” Here’s a key point in my opinion:

Open Text shares have climbed about 20 per cent this year, an increase that would pale in comparison to what would happen if a potential buyer emerged offering a premium similar to what HP has given Autonomy.

So we see a big payday for Autonomy has triggered a sympathetic response at the Globe & Mail, among “analysts”, and I am pretty sure among some OpenText stakeholders.

Several observations:

First, bankers think mostly about their commissions and fees. Bankers don’t think so much about other aspects of a deal. If there is a buck to be made from a company with a burlap sack of individual products, solutions, and services, the bankers will go for it. Owning a new Porsche takes the edge off the winter.

Second, competitors have learned that other companies are a far greater threat than OpenText. A services firm can snag some revenue, but other vendors have been winning the big deals. The OpenText strategy has not generated the top line revenue growth and profit that a handful of other companies in search and content processing have achieved. So the roll up and services play looks like a way to add some zip to the burlap bag’s contents.

Third, customers have learned that OpenText does not move with the agility of some other firms. I would not use the word “glacial,” but “stately” seems appropriate. If you know someone with the RedDot system, you may be able to relate to the notion of rapid bug fixes and point releases. By the way, RedDot used to install an Autonomy stub as the default search engine. I find this interesting because OpenText owns BRS Search, Fulcrum (yikes!), and the original Tim Bray SGML data management and search system. (Have SGML and XML come and gone?)

I am not willing to go out on a limb about a potential sale of OpenText, but I think that the notion of eLearning is interesting. Will OpenText shift its focus back to collaboration and document management much as Coveo flipped from search to mobile search to customer support and then back to search again? Canadian search and content processing vendors are interesting. Strike up the music. Another fast dance is beginning. Grab your partner. Search to services up next.

Stephen E Arnold, September 9, 2011

Sponsored by Pandia.com, publishers of The New Landscape of Enterprise Search

When Social and Search Meet in the Enterprise

September 8, 2011

Organizations are embracing Microsoft SharePoint as a platform for collaboration and other social online messaging. “If You Must Have In-House Social Tools, Go with SharePoint” is representative of the flood of information about SharePoint’s utility for collaborative activities.

J. Peter Bruzzese said:

The good news, at least from the SharePoint perspective, is that you have a tremendous amount of control over the amount of information people can share. For example, by deploying the User Profile Service Application in a SharePoint server farm, you can deploy My Sites and My Profile options to your users. They can then enter their own profile information, upload images of themselves for a profile picture, create a personal page with a document library (both personal and shared), tag other people’s sites and information, and search for people within the organization based on their profiles. The SharePoint administrator can control the extent to which the sharing occurs. You can adjust the properties in the profile page, turning options on or off and adding new properties if needed. You can turn off the I Like It and Tags & Notes features, and you can even delete tags or notes your corporate policy disapproves of. You can access profile information and make changes if needed. And you don’t have to turn on My Sites or let people create their own blog and so on: It’s not an all-or-nothing situation with these tools (ditto with third-party tools).

The excellent write up does a good job of explaining SharePoint from a high level.

There are three points which one wants to keep in mind:

First, collaborative content puts additional emphasis on managing the content generated by the users of social components within SharePoint. In most cases, short messages are not an issue. What is important, however, is capturing as much information about the information as possible. One cannot rely on users to provide context for some comments. Not surprisingly, additional work is needed to ensure that social messages have sufficient context to make the information in a short message meaningful to a person who may be reviewing a number of documents of greater length. To implement this type of feature, a SharePoint licensee will want to have access to systems, methods, and experts familiar with context enhancement, not just key word indexing.
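To make “capturing information about the information” concrete, here is a minimal, hypothetical sketch of wrapping a short message with contextual metadata before indexing. The field names and the crude keyword heuristic are illustrative only, not a SharePoint schema or a real context-enhancement system:

```python
def enrich(message, author, thread_title, posted_at):
    """Attach contextual metadata so a short message stays meaningful
    when it appears in search results next to longer documents."""
    return {
        "text": message,
        "author": author,
        "thread_title": thread_title,   # supplies context the message itself lacks
        "posted_at": posted_at,
        # naive keyword extraction stands in for genuine context enhancement
        "keywords": sorted({w.strip(".,").lower()
                            for w in message.split() if len(w) > 4}),
    }

record = enrich("Pricing approved, ship Friday", "jdoe",
                "Q3 product launch", "2011-09-08T10:15:00")
```

Indexing the enriched record rather than the bare text is what lets a searcher later connect “ship Friday” to the product launch it belongs to.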

Second, the social content is often free flowing. The engineering for a “plain vanilla” SharePoint is often sufficiently robust to handle typical office documents. However, if a high volume flow of social content is produced within SharePoint, “plain vanilla” implementations may exhibit some slowdowns. Again, throwing hardware at a problem may work in certain situations, but often additional modifications to SharePoint may be required to deliver the performance users expect. Searching for a social message with a key fact can be frustrating if the system imposes high latency.

Finally, social content is assumed to be a combination of real time back and forth as well as asynchronous. A person may see a posting or a document and then reply an hour or a day later. Adding metadata and servers will not address the challenge of processing social content in a timely manner. Firms with specific expertise in search and content processing can help. The approach to bottleneck issues in indexing, for example, relies on the experience of the engineer, not an FAQ from Microsoft or a blog post from a SharePoint specialist.

If you want to optimize your SharePoint system for social content and make that content findable, take a look at the services available from Search Technologies. We have deep experience with the full range of SharePoint search solutions, including Fast Search.

Iain Fletcher, September 8, 2011

Sponsored by Search Technologies

SharePoint: Embracing Social Functions and Features

September 7, 2011

The future of search is a subject that sparks a conversational camp fire. After email, search is one of the principal uses of online systems. In the last year, traditional key word search has been altered by the growing demand for “social content.” The idea is not just to index online discussions, but to use the signals these conversations emit as a way to improve the relevance of a search.

For example, when Lady Gaga sends her fans a Twitter message, the response and diffusion of that message provides useful information to a search system. A query about a fashion trend sent to Bing and Google, for example, will “respond” to the Lady Gaga message and include the retweets of her content as an indication of relevance.

This could apply to enterprise search. It could be possible to configure a mainstream solution such as Microsoft Fast Search Server to respond to social content.
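One simple way such a social signal could be folded into ranking is to blend the base text-relevance score with a dampened function of the sharing activity. The formula below is an illustration of the idea, not how Bing, Google, or Fast Search actually weight social data:

```python
import math

def blended_score(text_score, shares):
    """Boost a base relevance score by a logarithmic social signal,
    so heavily shared content rises without swamping text relevance."""
    return text_score * (1.0 + math.log1p(shares))

# Two documents with equal text relevance: the widely shared one ranks higher,
# but a thousand shares do not produce a thousand-fold boost.
quiet = blended_score(0.8, 0)
viral = blended_score(0.8, 1000)
```

The logarithm is the key design choice here: it lets the conversation “vote” on relevance while keeping the text match the dominant factor.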

A solid overview of what is possible is available in the InfoWorld article, “If You Must Have In-House Social Tools, Go With SharePoint.”  Examples of SharePoint’s social tools are support for Weblogs, the “I Like It” tags, notes, and profiles pages. InfoWorld explains how these tools will contribute to user satisfaction and help enhance the findability of content within an enterprise SharePoint installation. The implementation of social functions falls upon SharePoint administrators. Coincident with the release of the social tools, InfoWorld points out that user training is helpful. The article makes this important point:

I’m not a fan of social networking tools at work. I believe it distracts people more than it provides value. Call me a dinosaur, but when I want to say something important to the entire company, I use this ancient system called email. Maybe I’m not a team player because I don’t like collaborating on documents; if I need your help on a document, I’ll email it to you and you can look it over.

 

My view is that social networking has a time and a place, is beneficial, and should be taken in small quantities.

“Enjoy Maximum Collaboration with the Help of SharePoint” is especially thought provoking. The author said:

What SharePoint applications do is the customization, configuration and the development of Intranet, Extranet and the portals of information that are present on SharePoint.

My thought is that SharePoint does not perform customization. SharePoint must be configured and tuned to deliver certain types of functions. In our experience, SharePoint requires additional scripts. The default services deliver access to document libraries to manage content, generate reports, locate services, and share content across a wide network. However, social features may warrant changes to the SharePoint infrastructure to ensure that content throughput performance is not compromised and to make certain that indexing processes receive additional tuning to handle the social content if needed. Due to the abbreviated form of some social content, additional metadata may be required to enhance the findability of a short message.

Search Technologies has implemented social functions into Microsoft SharePoint. The Search Technologies’ team has the experience to derive the maximum benefit from the services which Microsoft includes with SharePoint. In addition, our engineers can implement special features as well as install, configure, and tune third party add-ins from Microsoft certified software developers.

Social has arrived and SharePoint is the ideal platform to use to take advantage of this fast growing content type.

Iain Fletcher, September 7, 2011

The Governance Aircraft Carrier: Too Big to Sail?

August 31, 2011

In a few days, I disappear into the wilds of a far off land. In theory, a government will pay me, but I am increasingly doubtful of promises made 3,000 miles from Harrod’s Creek. As part of the run up to my departure, we held a mini webinar/consultation on Tuesday, August 30, 2011, with a particularly energetic company engaged in “governance.” (SharePoint Semantics has dozens of articles about governance. One example is “A Useful Guide to SharePoint Success from Symon Garfield.”) The format of the call was basic. The people on the call asked me questions, and I provided only the perspective that three score years and as many online failures can provide. (I will mention SharePoint, but my observations apply to other systems as well; for instance, Documentum, Interwoven, FileNet, etc.)

What I want to do in this short write up is identify a subject that we did not tackle directly in that call, which concerned a government project. However, after the call, I realized that what I call an “aircraft carrier” problem was germane to the discussion of automated indexing and entity extraction. An aircraft carrier today is a modular construction. The idea is that the flight deck is made by one or more vendors, moved to the assembly point, and bolted down. The same approach is taken with cabins, electronics, and weapon systems.

The basic naval engineering best practice is to figure out how to get the design nailed down. Who wants to have propeller assemblies arrive that do not match the hull clearance specification?

What’s an aircraft carrier problem? An aircraft carrier is a big ship. It is, according to my colleague Rick Fiust, a former naval officer, a “really big ship.” Unlike a rich person’s yacht or a cruise ship, an aircraft carrier does more than surprise with its size. Aircraft carriers pack a wallop. In grade school I remember learning the phrase “gunboat diplomacy.” The idea was that a couple of gunboats send a powerful message.


What every content centric system aspires to be. Some information technology professionals will tell their bosses or clients, “You have a state of the art search and content processing system. Everything works.” Unlikely in my experience.

Governance, or what I like to think of as “editorial policy,” is an aircraft carrier. The connotation of governance is broad, involves many different functions, and sends a powerful message. The problem is that when content in an organization becomes unmanageable, the aircraft carrier runs aground and the crew is not exactly sure what to do about the problem.

Consider this real life example. A well meaning information technology manager installs SharePoint to allow the professionals in marketing to share their documents, price lists, and snippets from a Web site. Then the company acquires another firm, which runs SharePoint as well as a handful of enterprise applications. On the surface, the situation looks straightforward. However, the task of getting the two organizations’ systems to work smoothly is a bit tricky. There are the standard challenges of permissions and access as well as somewhat more exotic ones of coping with intra-unit indexing and index refreshes. Then a third company is acquired, and it runs SharePoint. Unlike the first two installations, which were “by the book,” the third company’s information technology unit used SharePoint as a blank canvas and created specialized features and services, plugged in third party components, and some home grown code.

Now the content issue arises. What content is available, when, to whom, and under what circumstances? Because the SharePoint installation was built in separate modules over time, will these fit together? Nope. There was no equivalent of the naval engineering best practice.

Governance, in my opinion, is the buzz word slapped on content centric systems of which SharePoint is but one example. The same governance problem surfaces when multiple content centric systems are joined.

Will after the fact governance solve the content problems in a SharePoint or other content centric environment? In my experience, the answer is, “Unlikely.” There are four reasons:

Cost. Reworking three systems built on the same platform sounds trivial. In practice, the work is difficult, and in some situations scrapping the original three systems and starting over may be a more cost effective solution. Who knows what interdependencies lurk within the three systems which are supposed to work as one? Open ended engineering projects are likely to encounter funding problems, and the systems must be used “as is” or fixed one problem at a time.


Apache Lucene 4.0 Changes Revealed

August 30, 2011

We prepared a report for a search vendor last week and reported that in our sample of organizations, more than 12 percent reported using open source software. Compared to three years ago, that’s a significant jump. Open source, despite the machinations of some large outfits, continues to make inroads in certain organizations. We learned that when there are strong advocates of open source working at an organization, there is a correlation among access to expertise, an internal cheerleader, and the appetite for open source solutions.

Curious about the upcoming Apache Lucene 4.0? Ostatic gives us this “Guest Post: Under the Hood in Apache Lucene 4.0,” in which Lucene insider Simon Willnauer details a few big changes.

The decision to let go of backward compatibility allows for significant advances. For one, in the search engine library, indexing text strings are replaced with UTF8 bytes. This revision increases efficiency in term dictionary loading, memory usage, and search speeds. The change also allows for the much anticipated “flexible indexing.” Willnauer explains:

Optimized codecs can be loaded to suit the indexing of individual datasets or even individual fields. . . . New indexing codecs can be developed and existing ones updated without the need for hard-coding within Lucene. There is no longer any need for project-level compromise on the best general-purpose index formats and data structures.

Next, multiple threads will now be used for indexing. This shift makes better use of multi-core processing and input/output resources. Then there’s “concurrent flushing,” where each thread buffer can flush its memory separately without interfering with other users. Finally, a painstakingly revised Levenshtein Automation algorithm greatly improves fuzzy matching.
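For context, the fuzzy matching Willnauer mentions is grounded in Levenshtein edit distance: the minimum number of single-character insertions, deletions, and substitutions needed to turn one term into another. Lucene 4.0 compiles queries into automata rather than computing this per term, but the underlying metric is the classic dynamic program sketched here:

```python
def levenshtein(a, b):
    """Classic row-by-row dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))          # distances from "" to prefixes of b
    for i, ca in enumerate(a, 1):
        curr = [i]                          # distance from a[:i] to ""
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,                # delete ca
                curr[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),   # substitute (free if chars match)
            ))
        prev = curr
    return prev[-1]

levenshtein("lucene", "lucine")   # one substitution away
levenshtein("kitten", "sitting")  # the textbook example
```

The per-pair computation is O(len(a) × len(b)); the automaton approach Willnauer describes precompiles a query term into a recognizer so each indexed term can be accepted or rejected in a single pass, which is where the speedup comes from.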

According to Willnauer, these tidbits are just the beginning. We agree, but the involvement of legal eagles could destabilize the open source band wagon.

Cynthia Murrell, August 30, 2011

Sponsored by Pandia.com

Interview with John Steinhauer, Search Technologies

August 29, 2011

Search Technologies Corp., a privately held firm, continues to widen its lead as the premier enterprise search consulting and engineering services firm. Founded six years ago, the company has grown rapidly. The firm’s dozens of engineers offer clients deep experience in Microsoft (SharePoint and Fast), Lucene/Solr, Google Search Appliances, and Autonomy systems, among others. Another factor that sets Search Technologies apart is that the company is profitable and debt-free, and its business continues to grow at 20 percent or more each year. It is headquartered in Herndon, VA.

John Steinhauer, vice president of technology, Search Technologies

On August 8, I spoke with John Steinhauer, vice president of technology of Search Technologies. Before joining Search Technologies, Mr. Steinhauer was the director of product management at Convera. He attended Boston University and the University of Chicago. At Search Technologies, Mr. Steinhauer is responsible for the day-to-day direction of all technical and customer delivery operations. He manages a growing team of more than 75 engineers and project managers. Mr. Steinhauer is one of the most experienced project directors in the enterprise search space, having been involved with hundreds of sophisticated search implementations for commercial and government clients. The full text of the interview appears below.

What’s your role at Search Technologies?

Search Technologies is an IT services provider focused on search engines. Working with search engines is essentially all we do. We’re technology independent and work with most of the leading vendors, and with open source. The things we do with search engines cover a broad spectrum – from helping companies in need of some expert resources to deliver a project on time, to fully inclusive development projects where we analyze, architect, develop and implement a new search-based solution for a customer, and then provide a fully managed service to administer and maintain the application. If required, we can also host it for the customer, at one of our hosting facilities or in the cloud.

My title is VP, Technology and I am one of the three original founders of the company and have been in the search engine business full-time since 1997. I am responsible for the technical organization, comprised of 70+ people, including Professional Services, Engineering, and Technical Support.

From your point of view, what do customers value most about your services?

We bring hard-won experience to customer projects and a deep knowledge of what works and where the difficult issues lie. Our partners, the major search vendors, sometimes find it difficult to be pragmatic, even where they have their own implementation departments, because their primary focus is their software licensing business. That’s not a criticism. As with most enterprise software sectors, license fees pay for all of the valuable research & development that the vendors put in to keep the industry moving forward. But it does mean that in a typical services engagement, less emphasis is put on the need for implementation planning, and ongoing processes to maintain and fine-tune the search application. We focus only on those elements, and this benefits both customers, who get more from their investment, and search engine partners who end up with happier customers.

In your role as VP of Technology, what achievements are you most proud of?

I’m proud that we have built a company with happy customers, happy employees, and good profits. I’m also proud that we’ve delivered some massively complex projects on time and on budget, even after others have tried and failed. It is gratifying that we have ongoing, multi-year relationships with household names such as the US Government Printing Office, Library of Congress, Comcast, the BBC, and Yellowpages.com.

But our primary achievement is probably the level of expertise of our personnel, along with the methodologies and best practices they use that are now embedded into our company culture. When we engage with customers, we bring experience and proven methodologies with us. That mitigates risks and saves money for customers.

Do you recommend search engines to customers?

Occasionally, but only after conducting what we call an “Assessment.” We start from first principles and understand the customer’s circumstances: business needs, data sets, user requirements, infrastructure, existing licensing arrangements, etc. Based on a full knowledge of those issues, we offer independent advice and product recommendations including, where appropriate, open source alternatives.

So you also work with customers who have already chosen a search engine?

This is our primary business. Often, our initial engagement with a customer is to solve a problem; they’ve acquired a software license, spent significant time and money on implementation and are having technical problems and/or trouble meeting their deadlines and budgets. Problems include poor relevancy, performance and scaling issues, security issues, data complexity issues, etc. Probably 70% of our customers first engaged with us by asking us to look at a narrow problem and solve it. Once they discover what we can do and how cost effective we are, they typically expand the scope into implementation of the full solution. We help people to implement best practices to reduce complexity and ownership cost, while dramatically improving the quality of the search service.

So, what’s your secret sauce?

With search projects, usually the secret sauce is that there is no secret sauce. Success is down to hard work and execution at the detail level.

What makes Search Technologies unique?

Sure. If there is any secret to building great search applications, it is usually in showing greater respect for the data and how best to process and enhance it to enable sophisticated search features to work effectively through the front end. That and just experience from hundreds of search application development projects. When a customer hires a Search Technologies Engineer to participate in their project, they are not just getting a well-trained, hard working and hugely experienced individual who writes good code, they are getting access to 80+ technical colleagues in the background with more than 40,000 person-days experience on search projects. We’re great at sharing experiences and best practices – we’ve worked hard at that since the beginning. Also, our staff turnover is really low. People who like working with search engines like it here, and they tend to stick around. That huge body of experience is our differentiation.

So you’re pure services, no software of your own?

In customer engagements we’re pure services. That’s our business. But as a company of largely technical people, of course we’ve developed software along the way. But we do so for the purposes of making our implementation services more efficient, and our support and maintenance services more reliable and sustainable.

Where is the search engine industry heading?

There are now two 800-pound gorillas in the market, called Microsoft and Google. That's a big difference from the somewhat fractious market of the previous 10 years. That will certainly make it harder for smaller vendors to find oxygen. But at the same time, these very large companies have their own agendas for which features and platforms matter to them and their customers. They will not attempt to be all things to all prospective customers in the way that smaller, hungrier vendors have. In theory this should leave gaps for either product or services companies to fill where specific and relatively sophisticated capabilities are required. We see those requirements all over the place.

Open source (primarily Solr/Lucene) is making major inroads too. We are seeing a lot of large companies move in this direction.

So is innovation dead?

Not at all. Actually we see lots of companies doing really cool and innovative things with search. Many people have been operating on the assumption that search software would reach a sort of commodity state. Analysts have predicted for years that once all the hard problems were solved, all search engines would have equivalent capabilities and compete on price. What we're seeing is very different from that. People are realizing that these problems can't just be solved once and then packaged into an off-the-shelf solution.

Instead the software vendors are putting a ring fence around the core search functionality and then letting integrators and smart customers go from there. With search, there are now some firmly established basics: platforms need good indexing pipelines, relevancy algorithms that can be tweaked to suit the audience, navigation options based on metadata, and readable, insightful results summaries. But that's just the starting point for great search.
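Two of the basics mentioned above, a tweakable indexing pipeline and metadata-based navigation (faceting), can be sketched in a few lines. Everything here is hypothetical: the sample documents, the field boosts, and the facet field are invented for illustration, not drawn from any vendor's implementation.

```python
from collections import defaultdict

# Toy corpus with a metadata field ("category") for faceted navigation.
DOCS = [
    {"id": 1, "title": "solr relevancy tuning", "body": "boost title matches", "category": "search"},
    {"id": 2, "title": "metadata navigation", "body": "facets from solr metadata", "category": "ui"},
]

# Field boosts are the "tweakable relevancy" knob: weight title hits
# more heavily than body hits, tuned to suit the audience.
BOOSTS = {"title": 3.0, "body": 1.0}

# Build an inverted index: term -> {doc id -> accumulated boost}.
index = defaultdict(lambda: defaultdict(float))
for doc in DOCS:
    for field, boost in BOOSTS.items():
        for term in doc[field].split():
            index[term][doc["id"]] += boost

def search(query):
    """Score documents by summing boosts for each matched query term."""
    scores = defaultdict(float)
    for term in query.split():
        for doc_id, weight in index[term].items():
            scores[doc_id] += weight
    return sorted(scores, key=scores.get, reverse=True)

def facets(doc_ids, field="category"):
    """Count metadata values across a result set for navigation options."""
    counts = defaultdict(int)
    for doc in DOCS:
        if doc["id"] in doc_ids:
            counts[doc[field]] += 1
    return dict(counts)
```

A real platform (Solr/Lucene, for example) does all of this at scale, but the shape is the same: the index, the ranking knobs, and the facet counts are the starting point, and the differentiation comes from what is built on top.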

Here’s an example we’ve been involved with recently. Auto-completion functions have been around for years. You start the search clue, and the system suggests what you’re looking for, to help you complete it more quickly. We’ve recently implemented some innovative new ways of doing this, working with a customer who has a specific business need. This includes relevancy ranking and tweaking of auto-completion suggestions, and the inclusion of industry jargon. Influencing search behavior in this way not only helps the customer to provide a very efficient search service, it also supports business goals by promoting particular products and services in context. Think of it as a form of relevancy tuning, but applicable to the search clue and not just the results. These are small tweaks that can have a big impact on the customer’s bottom line.
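The idea of relevancy-ranked auto-completion can be sketched briefly. This is not the customer's actual implementation; the suggestion phrases, weights, jargon entry, and promotion boost below are all invented for illustration. A production system would derive weights from query logs and business rules.

```python
# Candidate completions with base weights (hypothetical values).
SUGGESTIONS = {
    "search appliance": 5.0,
    "search analytics": 3.0,
    "sem": 1.0,                # industry jargon: "search engine marketing"
    "semantic search": 4.0,
}

# Business boost: promote a featured product in context.
PROMOTED = {"search analytics": 10.0}

def autocomplete(prefix, limit=3):
    """Rank completions of the partial search clue by weight plus boost."""
    scored = []
    for phrase, weight in SUGGESTIONS.items():
        if phrase.startswith(prefix.lower()):
            scored.append((weight + PROMOTED.get(phrase, 0.0), phrase))
    return [phrase for _, phrase in sorted(scored, reverse=True)[:limit]]
```

With the hypothetical weights above, typing "sea" would surface the promoted "search analytics" ahead of the otherwise heavier "search appliance", which is exactly the kind of tuning of the search clue, rather than the results, described above.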

Another big innovation is SaaS models for search applications. This has also been talked about for years, but is really just now coming into focus in practical ways that customers can leverage.

I understand that your business is growing. Where are you heading and what might Search Technologies look like in a couple of years?

Perhaps the most pleasing thing of all for me personally is that a lot of our growth, which is averaging 20%+ year on year, comes from perpetuating existing relationships with customers. This speaks well for customer satisfaction levels. We’ve just renewed our Microsoft GOLD partner status, and as a part of that, we conduct a customer satisfaction survey and share the results with Microsoft. The returns this year have been really great. So one of the places we are heading is to build ever longer, deeper relationships with companies for whom search is a critical application. We initially engaged with all of our largest customers by providing a few consultant-days of search expertise and implementation services. Today, we provide these same customers with turnkey design and implementation, hosting services, and “hands-off” managed services where all the customer does is use the search application and focus on their core business. This model works really well. Through our experience and focus on search we can run search systems very efficiently and provide a consistently excellent search experience to the customer’s user community. In the future we’ll do a lot more of this.

Finally, tell me something about yourself

I grew up in Michigan and have lived in Chicago, Boston, DC, London, and now San Diego. The best thing about that is I can ride my bike to work most mornings year round. I have two boys (4 years old and 6 months old), neither of whom has the slightest clue what a Michigan winter entails. I expect that will continue for the foreseeable future.

Don C Anderson, August 29, 2011

Sponsored by Search Technologies

Search Innovation: Do IR Thought Leaders Recycle Old Ideas?

August 17, 2011

We are fast approaching our 60th interview in the Search Wizards Speak series. In June 2011, we completed The New Landscape of Enterprise Search, which involved its own series of interviews with engineers, search system customers, chief executive officers, and pundits.

A Paucity of Insights or Fear?

For many years, I have been interviewing entrepreneurs, developers, and investors about information retrieval, content processing, and headache inducing technologies such as entity extraction and natural language processing.

My team of goslings here in Harrod’s Creek has interviewed industry leaders like Exalead and some of the more interesting newcomers such as SearchLion. Next week, we release an interview with a fast growing company with headquarters in Europe. Some vendors don’t want to talk; for example, Google and Microsoft. Microsoft was in, but then the “expert” disappeared. With the churn at Microsoft, I am just sitting on the sidelines. Other vendors and experts want to talk but don’t want to commit their ideas to a digital interview in a context of scores of other experts’ commentary.

Here’s the trigger for this summary of my thoughts from August 15, 2011. I listened to a podcast this morning while walking my trusty technical advisor, Max the Boxer.


The on-air personality was Adam Carolla. The program is available via iTunes or from the Adam Carolla Web site. The segment of the program which caught my attention was Mr. Carolla’s interview with author and columnist Ben Shapiro, a Harvard lawyer. Mr. Shapiro is the author of Primetime Propaganda: The True Hollywood Story of How the Left Took Over Your TV. The air was cool and Mr. Max was chasing squirrels, so I listened to Mr. Shapiro’s observations about how certain well placed individuals feel uncomfortable in their life roles. Mr. Carolla mentioned that he too had observed that some individuals wear their wealth and station in life awkwardly. (I remember reading in 2003 Why Smart Executives Fail, which advanced some similar arguments.)

What’s this have to do with Search Wizards Speak and the interviews I conducted for The New Landscape of Enterprise Search?

I realized that in the interviews I have conducted over the last 32 months, only a few individuals were completely confident in their answers to my now-standardized questions about “What are the major trends in search?” and “What product enhancements will you be introducing in the next release of your product?” In one go-round, not only did the interview take nearly four months to complete, but the interview subject also deleted my standard introduction, deleted my general observations about the interview, and rearranged the content of the interview so that it suppressed any hint of a personal touch for the interview subject. That’s okay with me. The information was interesting and not available elsewhere, so I ran with it.

Mr. Shapiro’s and Mr. Carolla’s comments struck a nerve because in the search and content processing industry, I think the same type of uncertainty and discomfort exists. Because search is miles away from Wall Street or Hollywood, experts like Mr. Shapiro ignore software, choosing to focus on high profile topics that cater to a broader audience.

Fuzzy Is Popular

Let’s assume for a moment that Mr. Shapiro’s podcast observation is accurate. What is causing experts in search to be fuzzy, waffling, and uncertain about search and retrieval? (Remember. I am talking about the sample of interviews I have conducted and published, not about forthcoming interviews.)

First, I think that most vendors of search and content processing systems are facing pressures greater than those pressing upon other technology companies. Search and content processing is one of those complex areas which most people dismiss as “been there, done that.” Search as a core application has been losing the high ground over the last three or four years. In fact, based on the research we conducted for my new monograph The New Landscape of Enterprise Search, the shift may be accelerating. Search appears to be more of a utility function. The most successful of the content processing vendors—Exalead, to take one example—embed search in broader, often higher value enterprise solutions. A company selling brute force indexing or a component to improve the indexing of entities is likely to find its market becoming less top management level and more information technology staff level. I think this introduces uncertainty in how a search and content processing company can position and price its technology.


Thanks to the creative whiz at http://planetpov.com/2011/07/25/uncertainty-in-business-will-it-become-sustainable/

Second, the everyday user of a free Web search system or a person doing customer support work in a big company expects a search box. The habit of banging two or three words into the search slot machine and getting out an information payoff is routine. Search and content processing vendors talk a great deal about improving productivity, but the reality is that most users don’t know if the information provided is right or wrong. Most just use what’s at the top of the results list. My hunch is that the increasing dissatisfaction with search is a warning signal that the brute force approach, although ubiquitous, is not working. The client, on the other hand, is okay with good enough. As a result, a vendor trying to explain how to improve a search box function has a long, expensive, and arduous sales process. The top dogs in search and content processing companies want results, but the folks selling the product are not sure what to say to close the deal while keeping their options open with other prospects. Not surprisingly, when one reads the nearly 60 interviews, there is a note of sameness that threads through the write ups. The companies that say something different—Autonomy or Exalead, for example—stand out. Many of the others seem quite alike. I will leave it to you to draw your own conclusions.
