Lucid Imagination: Open Source Search Reaches for Big Data

September 30, 2011

We are wrapping up a report about the challenges “big data” pose to organizations. Perhaps the most interesting outcome of our research is that there are very few search and content processing systems which can cope with the digital information required by some organizations. Three examples merit listing before I comment on open source search and “big data”.

The first example is the challenge of filtering information that organizations require and that is produced within the organization by its staff, contractors, and advisors. We learned in the course of our investigation that the promises of processing updates to Web pages, price lists, contracts, sales and marketing collateral, and other routine information are largely unmet. One of the problems is that the disparate content types have different update and change cycles. The most widely used content management system, based on our research results, is SharePoint, and SharePoint is not able to deliver a comprehensive listing of content without significant latency. Fixes are available, but these are engineering tasks which consume resources. Cloud solutions do not fare much better, once again due to latency. The bottom line is that for information produced within an organization, employees are mostly unable to locate information without a manual double check. Latency is the problem. We did identify one system which delivered documented latency across disparate content types of 10 to 15 minutes. The solution is available from Exalead, but the other vendors’ systems were not able to match this performance in putting fresh, timely information produced within an organization in front of system users. Shocked? We were.
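To make the latency point concrete, here is a minimal sketch of how index latency could be tracked per content type. The records, content type names, and the 15 minute threshold are assumptions chosen for illustration; they are not data from our research or from any vendor's system.

```python
from datetime import datetime
from statistics import median

# Hypothetical records: (content type, when the source changed, when the change became searchable).
EVENTS = [
    ("price_list", datetime(2011, 9, 1, 9, 0), datetime(2011, 9, 1, 9, 12)),
    ("price_list", datetime(2011, 9, 1, 10, 0), datetime(2011, 9, 1, 10, 14)),
    ("contract", datetime(2011, 9, 1, 9, 0), datetime(2011, 9, 1, 13, 0)),
    ("web_page", datetime(2011, 9, 1, 9, 0), datetime(2011, 9, 2, 9, 0)),
]


def latency_minutes_by_type(records):
    """Group index latency, in minutes, by content type."""
    grouped = {}
    for content_type, changed_at, searchable_at in records:
        minutes = (searchable_at - changed_at).total_seconds() / 60
        grouped.setdefault(content_type, []).append(minutes)
    return grouped


def slow_types(records, threshold_minutes=15):
    """Return the content types whose median latency exceeds the threshold."""
    grouped = latency_minutes_by_type(records)
    return sorted(
        content_type
        for content_type, latencies in grouped.items()
        if median(latencies) > threshold_minutes
    )


if __name__ == "__main__":
    print(slow_types(EVENTS))
```

Reporting by content type rather than as one overall average matters because each type has its own update cycle; a single average would hide the slow ones.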


Reducing latency in search and content processing systems is a major challenge. Vendors often lack the resources required to solve a “hard problem” so “easy problems” are positioned as the key to improving information access. Is latency a popular topic? A few vendors do address the issue; for example, Digital Reasoning and Exalead.

Second, when organizations tap into content produced by third parties, the latency problem becomes more severe. There is the issue of the inefficiency and scaling of frequent index updates. But the larger problem is that once an organization “goes outside” for information, additional variables are introduced. In order to process the broad range of content available from publicly accessible Web sites or the specialized file types used by certain third party content producers, connectors become a factor. Most search vendors obtain connectors from third parties. These work pretty much as advertised for common file types such as Lotus Notes. However, when one of the targeted Web sites, such as a commercial news service or a third-party research firm, makes a change, the content acquisition system cannot acquire content until the connectors are “fixed”. No problem as long as the company needing the information is prepared to wait. In my experience, broken connectors mean another variable. Again, no problem unless critical information needed to close a deal is overlooked.


Funnelback Tunes in Telstra

September 15, 2011

Telstra recently announced its new telecommunications site, www.nowwearetalking.com.au, will be hosted by Funnelback, as reported in the article, Funnelback Launches New Search for Telstra, on Blog Hosting Info. Telstra, Australia’s leading provider of telecommunications and information services, provides basic telephone services, mobile phone services as well as broadband and internet. They pride themselves on their vast geographical coverage of mobile and fixed network infrastructure, and provide

The Funnelback technology, which is a commercial product for sure, will allow users of the new site to search within blog posts, comments, forum posts, and online discussions for information. Funnelback’s promise to clients is that their services “comprehensively tailor the search facility to deliver on your business objectives.”

Funnelback offers commercial search engine services that help companies manage their online activities. As their website explains,

Funnelback can search across a myriad of corporate content repositories including websites, intranets, shared drives, SharePoint, Email systems and databases. For additional flexibility, it can be deployed as a fully managed, SaaS solution, installed within your firewall or hosted in the cloud. No matter how large or small your organization, we can tailor a solution to suit your business needs and information architecture.

As more and more digital data is being sent and received within companies, the need for content management grows. Companies such as Funnelback help maximize production and worker effectiveness by letting humans work on higher-level thinking projects while computers sort through the myriad of data.

Catherine Lamsfuss, September 15, 2011

Sponsored by Pandia.com, publishers of The New Landscape of Enterprise Search

The Governance Air Craft Carrier: Too Big to Sail?

August 31, 2011

In a few days, I disappear into the wilds of a far off land. In theory, a government will pay me, but I am increasingly doubtful of promises made from 3,000 miles from Harrod’s Creek. As part of the run up to my departure, we held a mini webinar/consultation on Tuesday, August 30, 2011, with a particularly energetic company engaged in “governance.” (SharePoint Semantics has dozens of articles about governance. One example is “A Useful Guide to SharePoint Success from Symon Garfield”.) The format of the call was basic. The people on the call asked me questions, and I provided the perspective that only three score years and as many online failures can provide. (I will mention SharePoint but my observations apply to other systems as well; for instance, Documentum, Interwoven, FileNet, etc.)

What I want to do in this short write up is identify a subject that we did not tackle directly in that call, which concerned a government project. However, after the call, I realized that what I call an “air craft carrier” problem was germane to the discussion of automated indexing and entity extraction. An air craft carrier today is a modular construction. The idea is that the flight deck is made by one or more vendors, moved to the assembly point, and bolted down. The same approach is taken with cabins, electronics, and weapon systems.

The basic naval engineering best practice is to figure out how to get the design nailed down. Who wants to have propeller assemblies arrive that do not match the hull clearance specification?

What’s an air craft carrier problem? An air craft carrier is a big ship. It is, according to my colleague Rick Fiust, a former naval officer, a “really big ship.” Unlike a rich person’s yacht or a cruise ship, an air craft carrier does more than surprise with its size. Air craft carriers pack a wallop. In grade school I remember learning the phrase “gun boat diplomacy.” The idea was that a couple of gun boats send a powerful message.


What every content centric system aspires to be. Some information technology professionals will tell their bosses or clients, “You have a state of the art search and content processing system. Everything works.” Unlikely in my experience.

Governance, or what I like to think of as “editorial policy”, is an air craft carrier. The connotation of governance is broad, involves many different functions, and sends a powerful message. The problem is that when content in an organization becomes unmanageable, the air craft carrier runs aground and the crew is not exactly sure what to do about the problem.

Consider this real life example. A well meaning information technology manager installs SharePoint to allow the professionals in marketing to share their documents, price lists, and snippets from a Web site. Then the company acquires another firm, which runs SharePoint as well as a handful of enterprise applications. On the surface, the situation looks straight forward. However, the task of getting the two organizations’ systems to work smoothly is a bit tricky. There are the standard challenges of permissions and access as well as somewhat more exotic ones of coping with intra-unit indexing and index refreshes. Then a third company is acquired, and it runs SharePoint. Unlike the first two installations which were “by the book”, the third company’s information technology unit used SharePoint as a blank canvas, created specialized features and services, plugged in third party components, and added some home grown code.

Now the content issue arises. What content is available, when, to whom, and under what circumstances? Because the SharePoint installation was built in separate modules over time, will these fit together? Nope. There was no equivalent of the naval engineering best practice.

Governance, in my opinion, is the buzz word slapped on content centric systems of which SharePoint is but one example. The same governance problem surfaces when multiple content centric systems are joined.

Will after the fact governance solve the content problems in a SharePoint or other content centric environment? In my experience, the answer is, “Unlikely.” There are four reasons:

Cost. Reworking three systems built on the same platform should be trivial. The work is difficult and in some situations, scrapping the original three systems and starting over may be a more cost effective solution. Who knows what interdependencies lurk within the three systems which are supposed to work as one? Open ended engineering projects are likely to encounter funding problems, and the systems must be used “as is” or fixed a problem at a time.


Interview with John Steinhauer, Search Technologies

August 29, 2011

Search Technologies Corp., a privately-held firm, continues to widen its lead as the premier enterprise search consulting and engineering services firm. Founded six years ago, the company has grown rapidly. The firm’s dozens of engineers offer clients deep experience in Microsoft (SharePoint and Fast), Lucene/Solr, Google Search Appliances, and Autonomy systems, among others. Another factor that sets Search Technologies apart is that the company is profitable and debt-free, and its business continues to grow at 20 percent or more each year. The company is headquartered in Herndon, VA.


John Steinhauer, vice president of technology, Search Technologies


On August 8, I spoke with John Steinhauer, vice president of technology of Search Technologies. Before joining Search Technologies, Mr. Steinhauer was the director of product management at Convera. He attended Boston University and the University of Chicago. At Search Technologies, Mr. Steinhauer is responsible for the day-to-day direction of all technical and customer delivery operations. He manages a growing team of more than 75 engineers and project managers. Mr. Steinhauer is one of the most experienced project directors in the enterprise search space, having been involved with hundreds of sophisticated search implementations for commercial and government clients. The full text of the interview appears below.

What’s your role at Search Technologies?

Search Technologies is an IT services provider focused on search engines. Working with search engines is essentially all we do. We’re technology independent and work with most of the leading vendors, and with open source. The things we do with search engines cover a broad spectrum – from helping companies in need of some expert resources to deliver a project on time, to fully inclusive development projects where we analyze, architect, develop and implement a new search-based solution for a customer, and then provide a fully managed service to administer and maintain the application. If required, we can also host it for the customer, at one of our hosting facilities or in the cloud.

My title is VP, Technology and I am one of the three original founders of the company and have been in the search engine business full-time since 1997. I am responsible for the technical organization, comprised of 70+ people, including Professional Services, Engineering, and Technical Support.

From your point of view, what do customers value most about your services?

We bring hard-won experience to customer projects and a deep knowledge of what works and where the difficult issues lie. Our partners, the major search vendors, sometimes find it difficult to be pragmatic, even where they have their own implementation departments, because their primary focus is their software licensing business. That’s not a criticism. As with most enterprise software sectors, license fees pay for all of the valuable research & development that the vendors put in to keep the industry moving forward. But it does mean that in a typical services engagement, less emphasis is put on the need for implementation planning, and ongoing processes to maintain and fine-tune the search application. We focus only on those elements, and this benefits both customers, who get more from their investment, and search engine partners who end up with happier customers.

In your role as VP of Technology, what achievements are you most proud of?

I’m proud that we have built a company with happy customers, happy employees, and good profits. I’m also proud that we’ve delivered some massively complex projects on time and on budget, even after others have tried and failed. It is gratifying that we have ongoing, multi-year relationships with household names such as the US Government Printing Office, Library of Congress, Comcast, the BBC, and Yellowpages.com.

But our primary achievement is probably the level of expertise of our personnel, along with the methodologies and best practices they use that are now embedded into our company culture. When we engage with customers, we bring experience and proven methodologies with us. That mitigates risks and saves money for customers.

Do you recommend search engines to customers?

Occasionally, but only after conducting what we call an “Assessment.” We start from first principles and understand the customer’s circumstances: business needs, data sets, user requirements, infrastructure, existing licensing arrangements, etc. Based on a full knowledge of those issues, we offer independent advice and product recommendations including, where appropriate, open source alternatives.

So you also work with customers who have already chosen a search engine?

This is our primary business. Often, our initial engagement with a customer is to solve a problem; they’ve acquired a software license, spent significant time and money on implementation and are having technical problems and/or trouble meeting their deadlines and budgets. Problems include poor relevancy, performance and scaling issues, security issues, data complexity issues, etc. Probably 70% of our customers first engaged with us by asking us to look at a narrow problem and solve it. Once they discover what we can do and how cost effective we are, they typically expand the scope into implementation of the full solution. We help people to implement best practices to reduce complexity and ownership cost, while dramatically improving the quality of the search service.

So, what’s your secret sauce?

With search projects, usually the secret sauce is that there is no secret sauce. Success is down to hard work and execution at the detail level.

What makes Search Technologies unique?

Sure. If there is any secret to building great search applications, it is usually in showing greater respect for the data and how best to process and enhance it to enable sophisticated search features to work effectively through the front end. That and just experience from hundreds of search application development projects. When a customer hires a Search Technologies Engineer to participate in their project, they are not just getting a well-trained, hard working and hugely experienced individual who writes good code, they are getting access to 80+ technical colleagues in the background with more than 40,000 person-days experience on search projects. We’re great at sharing experiences and best practices – we’ve worked hard at that since the beginning. Also, our staff turnover is really low. People who like working with search engines like it here, and they tend to stick around. That huge body of experience is our differentiation.

So you’re pure services, no software of your own?

In customer engagements we’re pure services. That’s our business. But as a company of largely technical people, of course we’ve developed software along the way. But we do so for the purposes of making our implementation services more efficient, and our support and maintenance services more reliable and sustainable.

Where is the search engine industry heading?

There are now two 800 pound gorillas in the market, called Microsoft and Google. That’s a big difference from the somewhat fractious market that existed 10 years ago. That will certainly make it harder for smaller vendors to find oxygen. But at the same time, these very large companies have their own agendas for what features and platforms matter for them and their customers. They will not attempt to be all things to all prospective customers in the same way that smaller hungrier vendors have. In theory this should leave gaps for either products or services companies to fill where specific and relatively sophisticated capabilities are required. We see those requirements all over the place.

Open source (primarily SOLR/Lucene) is making major inroads too. We are seeing a lot of large companies move in this direction.

So is innovation dead?

Not at all. Actually we see lots of companies doing really cool and innovative things with search. Many people have been operating on the assumption that search software would reach a sort of commodity state. Analysts have predicted this for years, that once all the hard problems had been solved, then all search engines would have equivalent capabilities and compete on price. What we’re seeing is very different from that. People are realizing that these problems can’t just be solved and then packaged into an off the shelf solution.

Instead the software vendors are putting a ring fence around the core search functionality and then letting integrators and smart customers go from there. With search, there are now some firmly established basics: Platforms need good indexing pipelines, relevancy algorithms that can be tweaked to suit the audience, navigation options based on metadata, readable, insightful results summaries. But that’s just the starting point for great search.

Here’s an example we’ve been involved with recently. Auto-completion functions have been around for years. You start the search clue, the system suggests what you’re looking for, to help you complete it more quickly. We’ve recently implemented some innovative new ways of doing this, working with a customer who has a specific business need. This includes relevancy ranking and tweaking of auto-completion suggestions, and the inclusion of industry jargon. Influencing search behavior in this way not only helps the customer to provide a very efficient search service, it also supports business goals by promoting particular products and services in context. Think of it as a form of relevancy tuning, but applicable to the search clue and not just the results. These are small tweaks that can have a big impact on the customer’s bottom line.
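For readers who want to picture the idea, the following is a rough sketch of ranked auto-completion with a business boost and jargon expansion. It is not Search Technologies’ implementation, which the interview does not describe; the phrases, scores, boosts, and the jargon mapping are all hypothetical.

```python
# Hypothetical catalogue: phrase -> (popularity score, business boost).
SUGGESTIONS = {
    "router": (0.90, 1.0),
    "router firmware": (0.40, 1.0),
    "router rental plan": (0.30, 2.5),  # promoted product, boosted on purpose
    "roaming charges": (0.70, 1.0),
}

# Hypothetical industry jargon mapped to the phrase users actually search for.
JARGON = {"cpe": "router"}


def suggest(prefix, limit=3):
    """Return up to `limit` completions ranked by popularity times boost."""
    prefix = prefix.strip().lower()
    # Expand jargon only when the whole typed clue is a known jargon term.
    prefix = JARGON.get(prefix, prefix)
    matches = [
        (popularity * boost, phrase)
        for phrase, (popularity, boost) in SUGGESTIONS.items()
        if phrase.startswith(prefix)
    ]
    # Highest score first; ties broken alphabetically for stable output.
    matches.sort(key=lambda item: (-item[0], item[1]))
    return [phrase for _, phrase in matches[:limit]]


if __name__ == "__main__":
    print(suggest("ro"))
    print(suggest("cpe"))
```

In this sketch the boost lifts the promoted phrase above a more popular one, which is the "relevancy tuning applied to the search clue" that the answer describes.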

Another big innovation is SaaS models for search applications. This has also been talked about for years, but is really just now coming into focus in practical ways that customers can leverage.

I understand that your business is growing. Where are you heading and what might Search Technologies look like in a couple of years?

Perhaps the most pleasing thing of all for me personally, is that a lot of our growth, which is averaging 20%+ year on year, comes from perpetuating existing relationships with customers. This speaks well for customer satisfaction levels. We’ve just renewed our Microsoft GOLD partner status, and as a part of that, we conduct a customer satisfaction survey and share the results with Microsoft. The returns this year have been really great. So one of the places we are heading is to build ever longer, deeper relationships with companies for whom search is a critical application. We initially engaged with all of our largest customers by providing a few consultant-days of search expertise and implementation services. Today, we provide these same customers with turnkey design and implementation, hosting services, and “hands-off” managed services where all the customer does is use the search application and focus on their core business. This model works really well. Through our experience and focus on search we can run search systems very efficiently and provide a consistently excellent search experience to the customer’s user community. In the future we’ll do a lot more of this.

Finally, tell me something about yourself

I grew up in Michigan, have lived in Chicago, Boston, DC, London and now in San Diego. The best thing about that is I can ride my bike to work most mornings year round. I have two boys (4 years old and 6 months old), neither of whom has the slightest clue what a Michigan winter entails. I expect that will continue for the foreseeable future.

Don C Anderson, August 29, 2011

Sponsored by Search Technologies

MarkLogic, FAST, Categorical Affirmatives, and a Direction Change

July 5, 2011

I awakened this morning (July 4, 2011) with a marketing Fourth of July boom. I received one of those ever present LinkedIn updates putting a comment from the Enterprise Search Engine Professionals Group in front of me.


The MarkLogic positioning exploded on my awareness like a Fourth of July skyrocket’s burst.

Most of the comments on the LinkedIn group are ho hum. One hot topic has been Microsoft’s failure to put much effort in its blogs about Fast Search & Transfer’s technology. Snore. Microsoft put down $1.2 billion for Fast, made some marketing noises, and had a fellow named Mr. Treo-something talk to me about the “new” Fast Search system. Then search turned out to be more like a snap in but without the simplicity of a Web part. Microsoft moved on and search is there, but like Google’s shift to Android, search is not where the action is. I am not sure who “runs” the enterprise search unit at Microsoft. Lots of revolving door action is my impression of Microsoft’s management approach in the last year.

The noise died down and Fast has become another component in the sprawling Shanghai of code known as SharePoint 2010. Making Fast “fast” and tuning it to return results that don’t vary with each update has created a significant amount of business for Microsoft partners “certified” to work on Fast Search. Licensees of the Linux/Unix version of ESP are now like birds pushed from the nest by an impatient mother.

New MarkLogic Market Positioning?

Set Microsoft aside for a moment and look at this post from a MarkLogic professional who once worked at Fast Search and subsequently at Microsoft. I am not sure how to hyperlink to LinkedIn posts without generating a flood of blue and white screens begging for log in, sign up, and money. I will include a link, but you are on your own.

Here’s the alleged MarkLogic professional’s comment:

Many organizations are replacing FAST with MarkLogic. MarkLogic offers a scalable enterprise search engine with all the features of FAST plus more…

Wow.

An XML engine with wrappers is now capable of “all” the Fast features. In my new monograph “The New Landscape of Enterprise Search”, I took some care to review information presented by Fast at CERN, the wizard lair in Europe, about Fast Search’s effort to rewrite Fast ESP, which was originally a Web search engine. The core was wrapped to convert Web search into enterprise search. This was neither quick nor particularly successful. Fast Search & Transfer ran into some tough financial waters, ended up the focus of a government investigation, and was quickly sold for a price that surprised me and the goslings in Harrod’s Creek.

You can get the details of the focus of the planned reinvention of the Fast system and the link to the source document at CERN which I reference in my Landscape study. A rewrite indicates that in 2007 and 2008 some functions were not performing in a manner that was acceptable to someone in Fast Search’s management. Then the acquisition took place. The Linux/Unix support was nuked. Fast under Microsoft’s wing has become a utility in the incredible assemblage of components that comprises SharePoint 2010. I track the SharePoint ecosystem in my information service SharePointSemantics.com. If you haven’t seen the content, you might want to check it out.


The Columns of Arnold: June 2011

May 24, 2011

My for fee columns for my June 2011 deadlines are completed. No easy task with the final corrections for The New Landscape of Search flowing through the Harrod’s Creek underground cellar. This 180 page report will be priced at $20 US and 15 euros. The Pandia ordering information will be available in a few days. Now to the columns:

Smart Business Network’s column is “When With It Marketing Won’t Work”. The main point is that for many small businesses digital marketing methods are less effective than more traditional methods. The column talks about the reasons and provides some sources of information about doing non-digital selling. Spoiler alert: Newspapers and other tabloids may have cause to rejoice.

KMWorld’s column is “Image Recognition Semantics: A Job for Smart Software or an Average Human.” Google announced enhancements to its image search, then it received a US patent for a method to recognize celebrities, and at almost the same time, Google’s chairperson dumped cold water on image recognition. I review where enterprise image recognition is and provide examples of systems that work quite well. Spoiler alert: Exalead and Cognex get the nod from me.

Information Today’s column is “Google’s Shallow Draughts: Its Shift from Search to Knowledge.” I take a look at Google’s shift from search to knowledge. My focus is what this means for searchers and for advertisers. I won’t give any details about this write up, but I do reference Heidegger, who also struggled with knowledge.

Enterprise Technology Management’s column is “Google, the Chromebook, and the Cloud: Time as Justice.” You may recognize the reference to As You Like It. I look at Google’s proliferation of cloud devices at the same time its Blogger.com cloud publishing system crashed and was off line for 20 hours. Reality is different from what companies “like”.

No column required for the six times a year Online Magazine. We have stepped up content production on Inteltrax.com and SharePointSemantics.com. In addition, later this week we will roll out an investment centric blog called HighGainBlog.com. These blogs will be similar to Beyond Search; that is, we will not do original news. We will comment on important trends and issues in the various niches we cover. At this time, we are producing a significant amount of SharePoint information, which is interesting because the system is the subject of so many articles that talk about issues, concerns, glitches, etc.

We have added brief biographical sketches on our Writer’s Page. I have a couple of questions along the lines of “How do you produce so much content by yourself in the hollow in rural Kentucky?” The answer is, “I don’t.” My name turns up on many of the online news items, but that’s a production issue, not a signal that the addled goose is actually working more than a couple of hours a day.

Hey, I have to paddle in the goose pond.

Stephen E Arnold, May 24, 2011

Freebie unlike the reports and the for fee columns which I do write. To understand the intent of the ArnoldIT.com blogs, read the About page.

Search: An Information Retrieval Fukushima?

May 18, 2011

Information about the scale of the horrific nuclear disaster in Japan at the Fukushima Daiichi nuclear complex is now becoming more widely known.

Expertise and Smoothing

My interest in the event is the engineering of a necklace of old-style reactors and the problems the LOCA (loss of coolant accident) triggered. The nagging thought I had was that today’s nuclear engineers understood the issues with the reactor design, the placement of the spent fuel pool, and the risks posed by an earthquake. After my years in the nuclear industry, I am quite confident that engineers articulated these issues. However, the technical information gets “smoothed” and simplified. The complexities of nuclear power generation are well known at least in engineering schools. The nuclear engineers are often viewed as odd ducks by the civil engineers and mechanical engineers. A nuclear engineer has to do the regular engineering stuff of calculating loads and looking up data in hefty tomes. But the nukes need grounding in chemistry, physics, and math, lots of math. Then the engineer who wants to become a certified, professional nuclear engineer has some other hoops to jump through. I won’t bore you with the details, but the end result of the process produces people who can explain clearly a particular process and its impacts.


Does your search experience emit signs of troubles within?

The problem is that art history majors, journalists, failed Web masters, and even Harvard and Wharton MBAs get bored quickly. The details of a particular nuclear process make zero sense to someone more comfortable commenting about the color of Mona Lisa’s gown. So “smoothing” takes place. The ridges and outcrops of scientific and statistical knowledge get simplified. Once a complex situation has been smoothed, the need for hard expertise is diminished. With these simplifications, the liberal arts crowd can “reason” about risks, costs, upsides, and downsides.


A nuclear fall out map. The effect of a search meltdown extends far beyond the boundaries of a single user’s actions. Flawed search and retrieval has major consequences, many of which cannot be predicted with high confidence.

Everything works in an acceptable or okay manner until there is a LOCA or some other problem like a stuck valve or a crack in a pipe in a radioactive area of the reactor. Quickly the complexities, risks, and costs of the “smoothed problem” reveal the fissures and crags of reality.

Web search and enterprise search are now experiencing what I call a Fukushima event. After years of contentment with finding information, suddenly the dashboards are blinking yellow and red. Users are unable to find the information needed to do their job or something as basic as locate a colleague’s telephone number or office location. I have separated Web search and enterprise search in my professional work.

I want to depart for a moment and consider the two “species” of search as a single process before the ideas slip away from me. I know that Web search processes publicly accessible content, has the luxury of ignoring servers with high latency, and filters content to create an index that meets the vendors’ needs, not the users’ needs. I know that enterprise search must handle diverse content types, must cope with security and access controls, and must perform more functions than one of those two inch wide Swiss Army knives on sale at the airport in Geneva. I understand. My concern is broader in this write up. Please, bear with me.


Landscape

May 17, 2011

The New Landscape of Enterprise Search. A Critical Review of the Market and Search Systems will be available in a few weeks.

To get a free copy, just sign up for our monthly newsletter. Write thehonk at yandex.com.

This 125 page monograph was published by Pandia.com. Pandia has closed its publishing operation. The 2011 report provides an overview of the enterprise search market at a time when many vendors walk a knife edge of profitability and other vendors have either failed, like Convera and Delphes, or are in a somewhat frantic quest for additional funding.

In a time of considerable financial duress, an enterprise search system is an important part of many organizations’ operations. However, search vendors are using a diverse, often Madison Avenue approach to explaining information retrieval. To make the landscape more interesting, there are hundreds of companies offering broad solutions and an equal number selling eDiscovery, customer support systems, business intelligence systems, and sentiment analysis solutions, among others.


Scrape away the marketing jargon, and these systems are often quite similar. Dig more deeply and you will discover that some solutions use open source software wrapped in proprietary code. Other vendors license third party tools from specialists and essentially “package up” solutions which are pitched as a cohesive whole. Little wonder most enterprise search systems generate dissatisfaction levels among their users of 50 percent, 65 percent, and even higher.

The principal chapters of the report are:

  1. A preface. This explains why my team at ArnoldIT.com and I wrote another book about search. In the last half dozen years we have generated multiple editions of the now defunct Enterprise Search Report which ballooned to a massive 600 pages when printed out, the Beyond Search report about value-added indexing for the Gilbane Group, Successful Enterprise Search Management for Galatea in the UK, and our third analysis of Google technology in Google: The Digital Gutenberg. The reason rests with the type of information that is now circulating about major search systems and enterprise search. We wanted to try and provide an anchor point for today’s procurement minded professionals.
  2. An introduction. We have pulled information from our annual review of the search sector which we prepare for our clients each year and additional current information about the market, hot sectors, and the problem “big data” poses to organizations regardless of their revenues or number of employees.
  3. Autonomy. We review the guts of Autonomy’s Integrated Data Operating Layer and provide facts about why the company is able to sustain solid growth and deliver search technology to more than 20,000 customers.
  4. Endeca. We talk about the “under the covers” aspect of Endeca’s Guided Navigation. We explore how Endeca has penetrated eCommerce, search, and business intelligence. Unlike Autonomy, Endeca is a privately held company and has been the victim of a “glass ceiling” in certain aspects of its business.
  5. Exalead. Like Google, Exalead based its revolutionary approach on experiences its founders gleaned working on other search and retrieval methods. After its purchase by Dassault Systèmes in 2010, Exalead exploded into a market niche described as “search based applications.” The chapter dissects the “plumbing” of Exalead and identifies how its next generation technology is pushing the company toward new types of information integration, including augmented reality.
  6. Google. The information in this chapter departs from the pure technical dissection of Google in my three Google monographs. There is a strong technical component but we present pricing and a frank discussion about the commitment Google has to make to the Google Search Appliance to make it a cost effective option for organizations. The information about Google’s cloud-based search initiative and the 2011 search appliance pricing provides a view different from what is offered by Google’s public relations.
  7. Microsoft. The focus is on the Fast Search & Transfer SA system which is the carrier-class search solution for SharePoint licensees. We look at what Microsoft Fast Search Server is and we document what is different from the old, pre-implosion Fast Search. We gathered information that explains why Fast Search was beginning a complete rewrite of the core Fast Search system prior to the acquisition by Microsoft. What happened to that project? We reveal that in this chapter.
  8. Vivisimo. The company has a new management team and is now pushing aggressively into enterprise search. Unlike some vendors, Vivisimo has kept a focus on search and added features to make Vivisimo useful in customer support and eDiscovery applications. Is Vivisimo a solid search solution or a clever utility packaged like many other vendors’ technology as a Swiss Army knife?
  9. Outlook. In this chapter we provide a glimpse of the search landscape as tomorrow’s sun breaks the horizon. Search as a stand alone solution is not casting a long shadow. What will the future hold for today’s leaders and the hundreds of companies chasing the search brass ring? We try to answer the question. Our view may surprise but not shock you.

The volume also contains a listing of resellers and partners of the profiled vendors. This information is often needed when a problem arises or a new feature or function is required. The listing also provides stark evidence of the “footprint” each of the vendors has in specific market sectors. To our knowledge, such data have not previously been collected.

We have also prepared a table listing another two dozen vendors of enterprise search. For each vendor we describe its core positioning and provide essential facts such as the firm’s url. In a sense, this table provides a summary of the key points in my other analyses of key vendors and their systems.

Finally, we took our various glossaries, updated them, and compiled a fresh list of terms and definitions. The jargon of search is one of the signals that vendors are struggling to make sales. The glossary provides short explanations of important terms. Our approach is not academic. We intended to craft explanations that will allow a person who is not an expert in information retrieval to translate the explanations some vendors provide.

Are we confident that this report is the last word about search?

No, of course not.  Search is among the most complex information challenges organizations, developers, and researchers face. In fact, the weakness of the report comes from the decision to focus on six vendors and emphasize the processing of unstructured textual information. We do touch upon the challenge of rich media, but that is an aspect of search that looms as a significant technical hurdle for a number of companies. Only Autonomy and Exalead have developed mature solutions to rich media processing. The other vendors lag behind these two engineering-centric firms.

To get a free copy, write thehonk@yandex.com. Note. When you request a free book, you will be opting into our new email, restricted distribution weekly newsletter about search and content processing.

Updated, February 3, 2012

BA-Insight Sees Opportunity through Azure Colored Glasses

May 9, 2011

It seems that BA Insight is embracing the media marketing trend as it showcases its new technology on Microsoft Channel 9. The interview and article “Building On Azure: BA Insight,” located on the Microsoft Channel 9 Web site, provide some interesting details about the new search technology. BA Insight integrated its new search technology into FAST and SharePoint 2010. A passage that caught my attention was:

BA Insight’s advanced user interface, which, among other things, removes the burden of having to download content to assess relevance. Using this technology, individual pages, slides, or worksheets can be previewed without downloading the entirety of any one file.

Cloud computing through Microsoft Office 365 and the Windows Azure Platform allows BA Insight to handle heavy workloads efficiently. The cloud is still a relatively new technology, but it could provide Microsoft customers with notable options. The cloud computing problems that have struck the very popular Amazon do raise doubts, but maybe Azure can prove that there is light at the end of the tunnel.

Is the cloud the future of computing? It seems to make sense for organizations struggling to contain computing costs and cope with staffing challenges. However, the assumption is that organizations can afford the bandwidth and the risk of losing a connection when a big deal is in the balance. Google is cheerleading for cloud computing as well.

What happens when a cloud based search system is unavailable? Employees will have to scramble. The big deal may be saved but at what cost? Will senior managers and CFOs listen and act? Sure, until there is an Amazon event. Everything works on paper and in PowerPoint presentations. The real world often behaves in unexpected ways.

Alice Holmes, May 9, 2011

Freebie

TERIS and Clearwell Announce New eDiscovery Tool

March 18, 2011

TERIS is a fifteen year old, national eDiscovery software manufacturer that partnered with Clearwell Systems in 2010. Having acquired a gold level of certification through Microsoft, TERIS has since worked to expand the Clearwell platform with the addition of its own series of review software.

A repost on the SF Chronicle’s Web site titled “TERIS Enhances Service Offering With Latest Release of the Clearwell eDiscovery Platform” details the company’s current endeavor. Version 6.1 gives clients the means to defensibly gather information from the Microsoft Business Productivity Online Suite (BPOS)/Office 365. We felt this passage summarizes the product’s aims:

With this new capability, TERIS’s clients can quickly identify and collect data from Microsoft Exchange Online and Microsoft SharePoint Online for e-discovery requests in response to litigation, regulatory inquiries and internal investigations.  Once collected, the data from the cloud is immediately available for downstream e-discovery phases such as processing, analysis, review and production.  As a result, the Clearwell E-Discovery Platform frees TERIS’s clients to reap the benefits of cloud computing while still fulfilling their legal and compliance requirements related to e-discovery.

This new version also plays well with SharePoint Online, going so far as to offer auto-detection for its sites within the cloud. See the article for a complete listing of the support features connected to Microsoft BPOS, which is, repeat three times, Microsoft Business Productivity Online Standard Suite.

Sarah Rogers, March 18, 2011

Freebie
