Hewlett Packard Trim 7
March 12, 2010
Hewlett Packard, a company that I continue to associate with low cost printers and high cost ink, lit up my radar with its acquisition of Lexington, Kentucky-based Exstream Software two years ago. Exstream (now Enterprise Document Automation), like IBM/Ricoh InfoPrint and StreamServe, generates outputs like invoices with warranty reminders and auto-payment bills with coupons for oil change discounts. I learned that in February 2010, HP stepped up its footprint in document management. One of the source documents I examined is “HP Trim 7… How We Got Here?”. The gray background and the dark blue highlights on text were a bit much for the addled goose’s eyes, however. For me, the most interesting segment in the history of Trim 7 was this passage:
Market consolidation meant that lots of little players were gobbled up, as the larger vendors strived to meet the ever challenging demands of the marketplace, picking up technology from these smaller companies and making them a part of their overall product line. Hewlett-Packard, one of the largest IT companies in the world, did the same, acquiring TOWER Software in 2008, but with one subtle difference. Rather than cannibalize the technology and abandon the product, they kept almost all of the staff from the TOWER acquisition and told them to build the next version of what is now known as HP TRIM. And – there were no other products that HP TRIM had to compete with internally unlike a lot of the other acquisitions: IBM/FileNet, OpenText/Hummingbird/Vignette, and Autonomy/Zantaz/Interwoven/Meridio. HP wanted to concentrate on the product that was HP TRIM, and add the backing that only a company like HP can bring to a product. And so, HP TRIM 7 was born.
Digging through the text, I concluded that HP bought an outfit called Tower and is rolling in other software to create the “new” document management business. You can locate the main page here. Three points jumped out:
First, I did not see any indication that HP’s dynamic document system integrates or “touches” the Trim 7 product. That strikes me as an indication that HP is chasing revenues from silo sales, not integration.
Second, how does one find a document? I could not locate any information about the search and retrieval functions within Trim 7. I surmise that if I use Trim 7 for SharePoint, I in theory would be able to use the Microsoft Fast ESP system to search for content. That also seems to be quite a bit of work; that is, consulting revenue for HP or its partners. My query “search HP Trim” resulted in 10 hits but nothing on point. One result was this page, which was heavy on marketing and light on locating information within the Trim 7 system. After a legal eagle drops a gift on a company named as a party in a legal matter, job one is answering the question, “What’s this about?” Trim 7 may not be able to answer that question.
Third, HP seems to be grabbing enterprise software companies that address really big information problems. With HP’s push into printers and ink, I saw a success that may have caught the firm’s hardware mavens by surprise. The trajectory in enterprise software is being driven by big money acquisitions. I think that the surprise of printing consumables will be different from the surprise of acquisition-based growth. The former was emergent; the latter is closer to MBA spreadsheet fever.
Big bets. Big win or big loss? I am leaning toward the loss option. Outlook: worth monitoring.
Stephen E Arnold, March 12, 2010
No one paid me to write this. Because HP derives significant revenue from ink, I think I have to report non-payment to the US government’s printer, the GPO.
Is Content Management a Digital Titanic?
February 25, 2010
Content management is a moving target. Unlike search, CMS is supposed to generate a Web page or some other type of content product. The “leaders” in content management systems, or CMS, seem to be disappearing into larger organizations. Surprising. If CMS were healthy, why aren’t these technology outfits growing like crazy and spinning off tons of cash?
I am no expert in CMS. In fact, I am not an expert in anything unlike the azure chip consultants, poobahs, and pundits who profess deep knowing at the press of a mouse button. In my experience, CMS emerged from people not having an easy way to produce HTML pages that could be displayed in a browser.
If HTML was too tough for some people, imagine the pickle barrel in which these folks find themselves today. In order to create a Web site, more than HTML is required. The crowd who relied on Microsoft’s FrontPage now struggles with the need to make Web pages work as applications or bundles of applications with some static brochureware thrown in for good measure.
To make a Web site today, technical know-how is an absolute must. Even the very good point-and-click services from SquareSpace.com and Weebly.com can baffle some people.
The azure chip consultants, the mavens, and the poobahs want to be in the lifeboats. Women and children to the rear. Source: http://www.ronnestam.com/wp-content/uploads/2009/02/lifeboat_change_advertising_sinking.jpg
Move the need for a dynamic Web site into a big organization that is not good at technology, and you have a recipe for disaster. In fact, the wreckage created by some content management vendors, pundits, and integrators is of significant magnitude. There’s the big hassle in Australia over a blue chip CMS implementation that does not work. The US Senate went after the bluest of the blue chip integrators because a CMS could not generate a single Web page. Sigh.
A Free Pass for Open Source Search?
February 11, 2010
Dateline: Harrod’s Creek, February 11, 2010
I read Gavin Clarke’s “Microsoft Drops Open Source Birthday Gift with Fast Lucidly Imaginative?” I think the story’s point, that Microsoft has handed “a free pass” to “open source search providers like Lucid Imagination”, is interesting. However, I am not willing to accept “free pass”, which is, in my opinion, a variant of the “free lunch”.
Here’s my view from the pleasant clime of snowy Harrod’s Creek.
First, in my opinion, most of the Fast Search & Transfer licensees bought into the “one size fits all” approach to search: facets, reports, access to structured and unstructured data, etc. As many of these licensees discovered, the cost of making Fast’s search technology deliver on the marketing PowerPoints was high. Furthermore, some like me learned how difficult it was for certain licensees to get the moving parts in sync quickly. Fast ESP consisted, prior to the Microsoft buyout, of keyword search, semantics from a team in Germany, third-party magic from companies like Lexalytics, home brew code from Norwegian wizards, and outright acquisitions for publishing and content management functionality. Wisely, many search vendors have learned to steer clear of the path that Fast Search & Transfer chopped through the sales wilderness. This means that orphaned Fast Search licensees may be looking at procurements that narrow the scope of search and content processing systems. In fact, there are only a handful of vendors who are now pitching the “kitchen sink” approach to search.
Source: http://www.graceforlife.com/uploaded_images/no_free_lunch-772769.jpg
Second, open source search solutions are not created equal. Some are tool kits; others are ready-to-run systems. Lucid Imagination has a good public relations presence in certain places; for example, San Francisco. For those who monitor the search space, there are some other open source vendors that may provide options. I particularly like the open source version of Lucene available from Tesuji.eu. Ah, never heard of the outfit, right? I also find the FLAX system available from Lemur Consulting useful. I think the issues with Fast Search & Transfer are not going to be resolved by ringing up a single vendor and saying, “We’re ready to go with your open source solution.” The more prudent approach is going to be understanding what the differences among various open source search solutions are and then determining if an organization’s specific requirements match up to one of these firms’ service offerings. Open source, therefore, requires some work, and I don’t think a knee-jerk reaction or a sweeping statement that the Microsoft announcement will deliver a “free pass” is accurate.
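The tool kit point is worth making concrete. With raw Lucene, the licensee wires up the analyzer, the index, and the query pipeline; a packaged system does that wiring for you. Here is a minimal sketch using Lucene’s Java API (Lucene 8.x-style class names, which have shifted across versions; the field names and sample text are my invention):

```java
// A minimal Lucene index-and-search loop. This is what "toolkit" means:
// every wiring decision below is the licensee's responsibility.
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.store.ByteBuffersDirectory;

public class ToolkitDemo {
    public static void main(String[] args) throws Exception {
        StandardAnalyzer analyzer = new StandardAnalyzer();
        ByteBuffersDirectory index = new ByteBuffersDirectory();

        // Indexing: every field decision (stored? tokenized?) is yours to make.
        try (IndexWriter writer = new IndexWriter(index, new IndexWriterConfig(analyzer))) {
            Document doc = new Document();
            doc.add(new TextField("title", "Enterprise search procurement notes", Store.YES));
            doc.add(new TextField("body", "Scoping requirements before licensing a system", Store.NO));
            writer.addDocument(doc);
        }

        // Querying: parsing, scoring, and result display are also yours.
        try (DirectoryReader reader = DirectoryReader.open(index)) {
            IndexSearcher searcher = new IndexSearcher(reader);
            QueryParser parser = new QueryParser("body", analyzer);
            for (ScoreDoc hit : searcher.search(parser.parse("requirements"), 10).scoreDocs) {
                System.out.println(searcher.doc(hit.doc).get("title"));
            }
        }
    }
}
```

Crawling, document filters, security trimming, facets, and an interface still have to be built around this core, which is where the “some work” comes in.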
Quote to Note: Dick Brass on MSFT Innovation
February 6, 2010
I met Dick Brass many years ago. He left Oracle and joined Microsoft to contribute to a confidential initiative. Mr. Brass worked on the ill-fated Microsoft tablet, which Steve Jobs has reinvented as a revolutionary device. I am not a tablet guy, but one thing is certain. Mr. Jobs knows how to work public relations. Mr. Brass published an article in the New York Times, and it captured the attention of Microsoft and millions of readers who enjoyed Mr. Brass’s criticism of his former employer. I have no opinion about Microsoft, its administrative methods, or its ability to innovate. I did find a quote to note in the write up:
Microsoft is no longer considered the cool or cutting edge place to work. There has been a steady exodus of its best and brightest. (“Microsoft’s Creative Destruction”, the New York Times, February 4, 2010, Page 25, column 3, National Edition)
Telling, because if smart people don’t work at a company, that company is likely to make less informed decisions than an organization with smarter people. This applies in the consulting world. There are blue chip outfits like McKinsey, Bain, and BCG. Then there are lesser outfits, which I am sure you can name because these companies “advertise”, have sales people who “sell” listings, and invent crazy phrases to create buzz and sales. I am tempted to differentiate Microsoft with a reference to Apple or Google, but I will not. Oh, why did I not post this item before today? The hard copy of my New York Times was not delivered until today. Speed is important in today’s information world.
The quote nails it.
Stephen E Arnold, February 7, 2010
No one paid me to write this, not a single blue chip consulting firm, not a single savvy company. I will report this lack of compensation to the experts at the IRS, which is gearing up for the big day in April.
Featured
Microsoft and Mikojo Trigger Semantic Winds across Search Landscape
Semantic technology is blowing across the search landscape again. The word “semantic” and its use in phrases like “semantic technology” has a certain trendiness. When I see the word, I think of smart software that understands information in the way a human does. I also think of computationally sluggish processes and the complexity of language, particularly in a language like English. Google has considerable investment in semantic technology, but the company wisely tucks it away within larger systems and avoids the technical battles that rage among different semantic technology factions. You can see Google’s semantic operations tucked within the Ramanathan Guha inventions disclosed in February 2007. Pay attention to the discussion of the system and method for “context”.
Gale force winds from semantic technology advocates. Image source: http://www.smh.com.au/ffximage/2008/11/08/paloma_wideweb__470×289,0.jpg
Microsoft’s Semantic Puff
Other companies are pushing the semantic shock troops forward. I read yesterday, in Network World, “Microsoft Talks Up Semantic Search Ambitions.” The article reminded me that Fast Search & Transfer offered some semantic functionality, which I summarized in the 2006 version of the original Enterprise Search Report (the one with real beef, not tofu inside). Microsoft also purchased Powerset, a company that used some of Xerox PARC’s technology and its own wizardry to “understand” queries and create a rich index. The Network World story reported:
With semantic technologies, which also are being referred to as Web 3.0, computers have a greater understanding of relationships between different information, rather than just forwarding links based on keyword searches. The end game for semantic search is “better, faster, cheaper, essentially,” said Prevost, who came over to Microsoft in the company’s 2008 acquisition of search engine vendor Powerset. Prevost is still general manager of Powerset. Semantic capabilities get users more relevant information and help them accomplish tasks and make decisions, said Prevost.
The payoff is that software understands humans. Sounds good, but it does little to alter the startling dominance of Google in general Web search and the rocket-like rise of social search systems like Facebook. In a social context humans tell “friends” about meaning or, better yet, offer an answer or a relevant link. No search required.
I reported about the complexities of configuring the enterprise search system that Microsoft offers for SharePoint in an earlier Web log post. The challenge is complexity and the time and money required to make a “smart” software system perform to an acceptable level in terms of throughput in content processing and for the user. Users often prefer to ask someone or just use what appears in the top of a search results list.
Read more »
Interviews
Inside Search: Raymond Bentinck of Exalead, Part 2
This is the second part of the interview with Raymond Bentinck of Exalead.
Isn’t this bad marketing?
No. This makes business sense. Traditional search vendors who may claim to have thousands of customers tend to use only a handful of well managed references. This is a direct result of customers choosing technology based on these overblown marketing claims and these claims then driving requirements that the vendor’s consultants struggle to deliver. The customer, who is then far from happy with the results, doesn’t do reference calls and ultimately becomes disillusioned with search in general or with the vendor specifically. Either way, they end up moving to an alternative.
I see this all the time with our clients that have replaced their legacy search solution with Exalead. When we started, we were met with much skepticism from clients that we could answer their information retrieval problems. It was only after doing Proof of Concepts and delivering the solutions that they became convinced. Now that our reputation has grown organizations realize that we do not make unsubstantiated claims and do stick by our promises.
What about the shift to hybrid solutions? An appliance or an on premises server, then a cloud component, and maybe some fairy dust thrown in to handle the security issues?
There is a major change that is happening within Information Technology at the moment driven primarily by the demands placed on IT by the business. Businesses want to vastly reduce the operational cost models of IT provision while pushing IT to be far more agile in their support of the business. Against this backdrop, information volumes continue to grow exponentially.
The push towards areas such as virtual servers and cloud computing is an aspect of reducing the operational cost models of information technology provision. It is fundamental that software solutions can operate in these environments. It is surprising, however, to find that many traditional search vendors’ solutions do not even work in a virtual server environment.
Isn’t this approach going to add costs to an Exalead installation?
No, because another aspect of this is that software solutions need to be designed to make the best use of available hardware resources. When Exalead provided a solution to the leading classified ads site Fish4.co.uk, unlike the legacy search solution we replaced, not only were we able to deploy a solution that met and exceeded their requirements but we reduced the cost of search to the business by 250 percent. A large part of this was around the massively reduced hardware costs associated with the solution.
What about making changes and responding quickly? Many search vendors simply impose a six month or nine month cycle on a deployment. The client wants to move quickly, but the vendor cannot work quickly.
Agility is another key factor. In the past, an organization might implement a data warehouse. This would take around 12 to 18 months to deploy and would cost a huge amount in hardware, software, and consultancy fees. As part of the deployment, the consultants needed to second-guess the questions the business would want to ask of the data warehouse and design these into the system. After the 12 to 18 months, the business would start using the data warehouse and then find out they needed to ask different types of questions than were designed into the system. The data warehouse would then go through a phase of redevelopment which would last many more months. The business would evolve… making more changes, and the cycle would go on and on.
With Exalead, we are able to deploy the same solution in a couple months, but significantly there is no need to second-guess the questions that the business would want to ask and design them into the system.
This is the sort of agile solution that businesses have been pushing their IT departments to deliver for years. Businesses that do not provide agile IT solutions will fall behind their competitors and be unable to react quickly enough when the market changes.
One of the large UK search vendors has dozens of niche versions of its product. How can that company keep each of these specialty products up to date and working? Integration is often the big problem, is it not?
The founders of Exalead took two years before starting the company to research what worked in search and why the existing search vendors’ products were so complex. This research led them to understand that the search products on the marketplace at the time all started as quite simple products designed to work on relatively low volumes of information and with very limited functional capabilities. Over the years, new functionality has been added to the solutions to keep abreast of what competitors have offered, but because of how the products were originally engineered, these have not been clean integrations. They did not start out with this intention, but search has evolved in ways never imagined at the time these solutions were originally engineered.
Wasn’t one of the key architects part of the famous AltaVista.com team?
Yes. In fact, both of the founders of Exalead were from this team.
What kind of issues occur with these overly complex products?
As you know, this has caused many issues for both vendors and clients. Changes in one part of the solution can cause unwanted side effects in another part. Trying to track down issues and bugs can take a huge amount of time and expense. This is a major factor as to why we see the legacy search products on the market today that are complex, expensive and take many months if not years to deploy even for simple requirements.
Exalead learned from these lessons when engineering our solution. We have an architecture that is fully object-oriented at the core and follows a service-oriented (SOA) design. It means that we can swap in and out new modules without messy integrations. We can also take core modules such as connectors to repositories and, instead of having to re-write them to meet specific requirements, we can override various capabilities in the classes. This means that the majority of the code that has gone through our quality-management systems remains the same. If an issue is identified in the code, it is a simple task to locate the problem, and the issue is isolated in one area of the code base. In the past, vendors have had to rewrite core components like connectors to meet customers’ requirements, and this has caused huge quality and support issues for both the customer and the vendor.
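What Mr. Bentinck describes is essentially the template-method pattern. The sketch below is hypothetical; the class and method names are invented for illustration and are not Exalead’s actual API. The point is that a customer-specific change lives in one overridden hook while the quality-assured core loop stays untouched:

```java
import java.util.List;

// Hypothetical connector-override sketch; names invented, not Exalead's API.
public class ConnectorSketch {
    record RawDocument(String id, String type, String body) {}

    static abstract class RepositoryConnector {
        // Template method: the tested core loop lives in one place.
        final void sync() {
            for (String id : listDocumentIds()) {
                RawDocument raw = fetch(id);
                if (accept(raw)) emit(transform(raw));
            }
        }
        abstract List<String> listDocumentIds();
        abstract RawDocument fetch(String id);
        // Hooks with safe defaults; a customer-specific subclass overrides
        // only what it needs, leaving the core untouched.
        boolean accept(RawDocument raw) { return true; }
        RawDocument transform(RawDocument raw) { return raw; }
        void emit(RawDocument raw) { System.out.println("indexing " + raw.id()); }
    }

    // A customer-specific variant changes one hook, not the core loop.
    static class ContractsOnlyConnector extends RepositoryConnector {
        List<String> listDocumentIds() { return List.of("doc-1", "doc-2"); }
        RawDocument fetch(String id) {
            return new RawDocument(id, id.equals("doc-1") ? "contract" : "memo", "...");
        }
        @Override
        boolean accept(RawDocument raw) { return "contract".equals(raw.type()); }
    }

    public static void main(String[] args) {
        new ContractsOnlyConnector().sync(); // indexes doc-1 only
    }
}
```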
What about integration? That’s a killer for many vendors in my experience.
The added advantage of this core engineering work is that for Exalead integration is a simple task. For example, building new secure connectors to new repositories can be performed in weeks rather than months. Our engineers can spend the time saved on adding new and innovative capabilities to the solution rather than worrying about how to integrate a new function without affecting the 1001 other overlapping functions.
Without this model, legacy vendors have to continually provide point-solutions to problems that tend to be customer-specific leading to a very expensive support headache as core engineering changes take too long and are too hard to deploy.
I heard about a large firm in the US that has invested significant sums in retooling Lucene. The solution has been described on the firm’s Web site, but I don’t see how that engineering cost is offset by the time to market that the fix required. Do you see open source as a problem or a solution?
I do not wake up in the middle of the night worrying about Lucene, if that is what you are thinking! I see Lucene in places that have large engineering teams to protect or where consultants are more interested in making lots of fees through its complex integration. Neither adds value to the company by, for example, reducing costs or increasing revenue.
Organizations that are interested in providing cost effective, richly functional solutions are in increasing numbers choosing solutions like Exalead. For example, the University of Sunderland wanted to replace their Google Search Appliance with a richer, more functional search tool. They looked at the marketplace and chose Exalead for searching their external site and their internal document repositories, plus providing business intelligence solutions over their database applications such as student attendance records. The search on their Web site was developed in a single day, including the integration with their existing user interface and the faceted navigation capabilities. This represented not only an exceptionally quick implementation, far in excess of any other solution on the marketplace today, but it also delivered for them the lowest total cost of ownership compared to other vendors and, of course, open source.
In my opinion, Lucene and other open-source offerings can offer a solution for some organizations but many jump on this bandwagon without fully appreciating the differences between the open source solution and the commercially available solutions either in terms of capability or total cost. It is assumed, wrongly in many instances, that the total cost of ownership for open source must be lower than the commercially available solutions. I would suggest that all too often, open source search is adopted by those who believe the consultants who say that search is a simple commodity problem.
What about the commercial enterprise that has had several search systems and none of them capable of delivering satisfactory solutions? What’s the cause of this? The vendors? The client’s approach?
I think the problem lies more with the vendors of the legacy search solutions than with the clients. Vendors have believed their own marketing messages and, when customers are unsatisfied with the results, have tended to blame the customers for not understanding how to deploy the product correctly or, in some cases, the third party or system integrator responsible for the deployment.
One client of ours told me recently that with our solution they were able to deliver in a couple months what they failed to do with another leading search solution for seven years. This is pretty much the experience of every customer where we have replaced an existing search solution. In fact, every organization that I have worked with that has performed an in-depth analysis and comparison of our technology against any search solution has chosen Exalead.
In many ways, I see our solution as not only delivering on our promises but also delivering on the marketing messages that our competitors have been promoting for years but failing to deliver in reality.
So where does Exalead fit? The last demo I received showed me search working within a very large, global business process. The information just appeared? Is this where search is heading?
In the year 2000, and every year since, a CEO of one of the leading legacy search vendors made a claim that every major organization would be using his brand of meaning-based search technology within two years.
I will not be as bold as him, but it is my belief that in less than five years’ time the majority of organizations will be using search based applications in mission critical roles.
For too long software vendors have been trying to convince organizations, for example, that it was not possible to deploy mission critical solutions such as a 360-degree customer view, Master Data Management, Data Warehousing, or business intelligence in a couple months, with no user training, with up-to-the-minute information, with user friendly interfaces, and with a low cost per query covering millions or billions of records of information.
With Exalead this is possible and we have proven it in some of the world’s largest companies.
How does this change the present understanding of search, which in my opinion is often quite shallow?
Two things are required to change the status quo.
Firstly, a disruptive technology is required that can deliver on these requirements and secondly businesses need to demand new methods of meeting ever greater business requirements on information.
Today I see both these things in place. Exalead has proven that our solutions can meet the most demanding of mission critical requirements in an agile way and now IT departments are realizing that they cannot support their businesses moving forward by using traditional technologies.
What do you see as the trends in enterprise search for 2010?
Last year was a turning point around Search Based Applications. With the world-wide economy in recession, many companies put projects on hold until things were looking better. With economies still looking rather weak and projects unable to stay on ice forever, companies are starting to question the value of utilizing expensive, time consuming, and rigid technologies to deliver these projects.
Search is a game changing technology that can deliver more innovative, agile and cheaper solutions than using traditional technologies. Exalead is there to deliver on this promise.
Search, a commodity solution? No.
Editor’s note: You can learn more about Exalead’s search-based applications technology and method at the Exalead Web site.
Stephen E Arnold, February 4, 2010
I wrote this post without any compensation. However, Mr. Bentinck, who lives in a far off land, offered to buy me haggis, and I refused this tasty bribe. Ah, lungs! I will report the lack of payment to the National Institutes of Health, an outfit concerned about alveoli.
Profiles
Vyre: Software, Services, Search, and More
A happy quack to the reader who sent me a link to Vyre, whose catchphrase is “dissolving complexity.” The last time I looked at the company, I had pigeonholed it as a consulting and content management firm. The news release my reader sent me pointed out that the company has a mid-market enterprise search solution that is now at version 4.x. I am getting old, or at least too sluggish to keep pace with content management companies that offer search solutions. My recollection is that Crown Point moved in this direction. I have a rather grim view of CMS because software cannot help organizations create high quality content, or at least what I think is high quality content.
The Wikipedia description of Vyre matches up with the information in my archive:
VYRE, now based in the UK, is a software development company. The firm uses the catchphrase “Enterprise 2.0” to describe its enterprise solutions for business. The firm’s core product is Unify. The Web-based service allows users to build applications and manage content. The company has technology that manages digital assets. The firm’s clients in 2006 included Diageo, Sony, Virgin, and Lowe and Partners. The company has reinvented itself several times since the late 1990s, doing business as NCD (Northern Communication and Design), Salt, and then Vyre.
You can read the Wikipedia summary here. You can read a 2006 Butler Group analysis here. My old link worked this evening (March 5, 2009), but click quickly. In my files I had a link to a Vyre presentation, but it was not about search. The presentation is dated 2008; you may find the information useful. The Vyre presentations are here. The link worked for me on March 5, 2009. The only name I have in my archive is Dragan Jotic. Other names of people linked to the company are here. Basic information about the company’s Web site is here. Traffic, if these data are correct, seems to be trending down. I don’t have current interface examples. The wiki for the CMS service is here. (Note: the company does not use its own CMS for the wiki. The wiki system is MediaWiki. No problem for me, but I was curious about this decision because the company offers its own CMS system.) You can get a taste of the system here.
Administrative Vyre screen.
After a bit of poking around, it appears that Vyre has turned up the heat on its public relations activities. The Seybold Report presented a news story / news release about the search system here. I scanned the release and noted this passage as interesting for my work:
…version 4.4 introduces powerful new capabilities for performing facetted and federated searching across the enterprise. Facetted search provides immediate feedback on the breakdown of search results and allows users to quickly and accurately drill down within search results. Federated search enables users to eradicate content silos by allowing users to search multiple content repositories.
Vyre includes a taxonomy management function with its search system, if I read the Seybold article correctly. I gravitate to the taxonomy solution available from Access Innovations, a company run by my friends and colleagues Marje Hlava and Jay Ven Eman. Their system generates ANSI standard thesauri and word lists, which is the sort of stuff that revs my engine.
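For readers who have not met one, an ANSI standard thesaurus (ANSI/NISO Z39.19) is built on a small set of term relationships: broader term (BT), narrower term (NT), related term (RT), and use-for (UF) equivalences. A minimal sketch in Java, with invented vocabulary entries (an illustration of the Z39.19 relationship types, not the Access Innovations data model):

```java
import java.util.List;
import java.util.Map;

// A toy Z39.19-style thesaurus entry: BT/NT/RT hierarchy plus UF synonyms.
public class ThesaurusSketch {
    record Term(String label, String broader, List<String> narrower,
                List<String> related, List<String> useFor) {}

    static final Map<String, Term> THESAURUS = Map.of(
        "document management", new Term("document management",
            "information management",                                // BT
            List.of("records management", "digital asset management"), // NT
            List.of("enterprise search"),                            // RT
            List.of("DM", "document control")));                     // UF

    public static void main(String[] args) {
        Term t = THESAURUS.get("document management");
        // A query for the non-preferred term "document control" maps to the
        // preferred label via the UF list.
        System.out.println("Preferred label: " + t.label());
        System.out.println("Broader: " + t.broader());
        System.out.println("Use for: " + t.useFor());
    }
}
```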
Endeca has been the pioneer in the enterprise sector for “guided navigation”, which is a synonym in my mind for faceted search. Federated search gets into the functions that I associate with Bright Planet, Deep Web Technologies, and Vivisimo, among others. I know that shoving large volumes of data through systems that both facetize and federate content is computationally intensive. Consequently, some organizations are not able to put the plumbing in place to make these computationally intensive systems hum like my grandmother’s sewing machine.
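Why is facetizing computationally intensive? In its naive form, every facet field means a tally over the entire hit list before the navigation breakdown can be displayed; production systems precompute index structures to avoid doing exactly this at query time. A bare-bones sketch, with invented field names and sample records:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Naive facet counting: one pass per facet field over the whole result set.
public class FacetSketch {
    record Doc(String title, String format, String year) {}

    public static void main(String[] args) {
        List<Doc> results = List.of(
            new Doc("Q3 report", "pdf", "2009"),
            new Doc("Budget memo", "doc", "2009"),
            new Doc("Org chart", "pdf", "2008"));

        Map<String, Integer> byFormat = new HashMap<>();
        Map<String, Integer> byYear = new HashMap<>();
        for (Doc d : results) {
            byFormat.merge(d.format(), 1, Integer::sum);
            byYear.merge(d.year(), 1, Integer::sum);
        }
        System.out.println("Format: " + byFormat); // pdf=2, doc=1 (order varies)
        System.out.println("Year: " + byYear);     // 2009=2, 2008=1 (order varies)
    }
}
```

Scale the three sample records to millions of processed documents per facet per query, and the plumbing bill Vyre’s licensees face becomes clear.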
If you are in the market for a CMS and asset management company’s enterprise search solution, give the company’s product a test drive. You can buy a report from UK Data about this company here. I don’t have solid pricing data. My notes to myself record the phrase, “Sensible pricing.” I noted that the typical cost for the system begins at about $25,000. Check with the company for current license fees.
Stephen Arnold, March 6, 2009
Latest News
Mobile Devices and Their Apps: Search Gone Missing
VentureBeat’s “A Pretty Chart of Top Apps for iPhone, Android, BlackBerry” shocked me. Not a little. Quite a bit. You will want to look at the top apps for each platform.
Lazarus, Azure Chip Consultants, and Search
January 8, 2010
A person called me today to tell me that a consulting firm is not accepting my statement “Search is dead”. Then I received a spam email that said, “Search is back.” I thought, “Yo, Lazarus. There be lots of dead search vendors out there. Example: Convera.”
Who reports that search has risen? An azure chip consultant! Here’s what raced through my addled goose brain as I pondered the call and the “search is back” T-shirt slogan:
In 2006, I was sitting on a pile of research about the search market sector. The data I collected included:
- Interviews with various procurement officers, search system managers, vendors, and financial analysts
- My own profiles of about 36 vendors of enterprise search systems plus the automated content files I generate using the Overflight system. A small-scale version is available as a demo on ArnoldIT.com
- Information I had from my work as a systems engineering and technical advisor to several governments and their search system procurement teams
- My own experience licensing, testing, and evaluating search systems for clients. (I started doing this work after we created The Point (Top 5% of the Internet) in 1993 and sold it to Lycos, a unit of CMGI. I figured I should look into what Lycos was doing so I could speak with authority about its differences from BRS/Search, InQuire, Dialog (RECON), and IBM STAIRS III. I had familiarity with most of these systems through various projects in my pre-Point life.)
- My Google research funded by the now-defunct Bear Stearns outfit and a couple of other well-heeled organizations.
What was clear in 2006 was the following:
First, most of the search system vendors shared quite a bit of similarity. Despite the marketing baloney, the key differentiators among the flagship systems in 2006 were minor. Examples range from their basic architecture to their use of stemming to the methods of updating indexes. There were innovators, and I pointed out these companies in my talks and various writings, including the three editions of the Enterprise Search Report I wrote before I fell ill in February 2007 and quit doing that big encyclopedia-type publication. These similarities made it very clear to me that innovation for enterprise search was shifting from the plain old keyword indexing of structured records available since the advent of RECON and STAIRS to a more freeform approach with generally lousy relevance.
Get information access wrong, and some folks may find a new career. Source: http://www.seeing-stars.com/Images/ScenesFromMovies/AmericanBeautyMrSmiley%28BIG%29.JPG
Second, the more innovative vendors were making an effort in 2006 to take a document and provide some sort of context for it. Without a human indexer to assign a classification code to a document that is about marketing but does not contain the word “marketing”, this was rocket science. But when I examined these systems, there were two basic approaches, both still around today. The first was to use statistical methods to put documents together and make inferences; the other was a variation on human indexing but without humans doing most of the work. The idea was that a word list would contain synonyms. There were promising demonstrations of software methods that could “read” a document, but they were piggy and of use only where money was no object.
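A stripped-down sketch of the second approach, the synonym word list. The concept names and synonym lists below are my invention, but the mechanism is the one described above: a document about marketing that never says “marketing” still lands in the marketing bucket through its synonyms.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Word-list classification: map concept labels to synonym sets, then tag a
// document with every concept whose synonyms appear in the text.
public class SynonymClassifier {
    static final Map<String, Set<String>> CONCEPTS = Map.of(
        "marketing", Set.of("branding", "campaign", "demand generation", "positioning"),
        "finance", Set.of("budget", "forecast", "revenue"));

    static List<String> classify(String text) {
        String lower = text.toLowerCase();
        return CONCEPTS.entrySet().stream()
            .filter(e -> e.getValue().stream().anyMatch(lower::contains))
            .map(Map.Entry::getKey)
            .toList();
    }

    public static void main(String[] args) {
        System.out.println(classify(
            "The campaign improved our positioning with key accounts."));
        // -> [marketing], with no occurrence of the word "marketing" required
    }
}
```

The hard, expensive part in 2006 was not this lookup; it was building and maintaining the word lists, which is why the “piggy” statistical alternatives appealed where money was no object.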
Third, the Google approach, which used social methods—that is, a human clicking on a link—was evident but not migrating to the enterprise world. Google was new, but to make its 2006 method hum, lots of clicks were needed. In the enterprise, most documents never get clicked, so the 2006 Google method was truly lousy. Google has made improvements, mostly by implementing the older search methods, not by pushing the envelope as it has been doing with its Web search and dataspace efforts.
Fourth, most of the search vendors were trying like the dickens to get out of a “one size fits all” approach to enterprise search. Companies making sales were focusing on a specific niche or problem and selling a package of search and content processing that solved one problem. The failure of the boil-the-ocean approach was evident because user satisfaction data from my research, funded by a government agency and other clients, revealed that about two-thirds of the users of an enterprise search system were dissatisfied or very dissatisfied with that search system. The solution, then, was to focus. My exemplary case was the use of the Endeca technology to allow Fidelity UK sales professionals to increase their productivity with content pushed to them using the Endeca system. The idea was that a broker could click on a link and the search results were displayed. No searching required. ClearForest got in the game by analyzing dealer warranty repair comments. Endeca and ClearForest were harbingers of focus. ClearForest is owned by Thomson Reuters and is in the open source software game too.
When I wrote the article in Online Magazine for Barbara Quint, one of my favorite editors, I explained these points in more detail. But it was clear that the financial pressures on Convera, for example, and the difficulty some of the more promising vendors like Entopia were having made the thin edge of survival glint in my desk lamp’s light. Autonomy by 2006 had shifted from search and organic growth to inorganic growth fueled by acquisitions that were adjacent to search.
Can Microsoft and Its Petascale Financial Services Mining Project Succeed?
December 1, 2009
The goslings and I were chattering and quacking in Harrod’s Creek. One of our cousins was killed and eaten for an American holiday. What a way for our beloved friend and colleague to go: deep fried in an oil drum behind the River Creek Inn.
As we were recalling the best moments in Theodore the Turkey’s life, we discussed the likelihood of Microsoft’s petascale content mining project hitting a home run. The idea, as we addled geese understand it, is that Microsoft wants to process lots of content and generate high value insights for the money crazed MBAs in the world’s leading financial institutions.
The project tackles a number of tough technical problems; for example, getting around the inherent latency in petascale systems, dealing with the traditional input output balkiness of Windows plumbing, and crunching enough data with sufficient accuracy to make the exercise worth the time of the financial client. You may find my earlier post germane.
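A back-of-the-envelope calculation (my numbers, not Microsoft’s) shows why latency heads the list of tough problems:

```java
// Why petascale latency is hard: even at an optimistic sustained read rate,
// a full scan of one petabyte takes more than a day. Hence the appetite for
// terabytes of RAM and heavy partitioning.
public class PetascaleMath {
    public static void main(String[] args) {
        double petabyteBytes = 1e15;
        double throughputBytesPerSec = 10e9;            // 10 GB/s, optimistic
        double seconds = petabyteBytes / throughputBytesPerSec; // 100,000 s
        System.out.printf("Full scan: %.1f hours%n", seconds / 3600.0); // ~27.8
    }
}
```

Slice the corpus across a thousand nodes and the scan shrinks to minutes, but then coordination, skew, and the input/output plumbing become the latency story.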
Other outfits are in this game as well. Some are focused on the hardware / firmware / software side like Exegy. Others provide toolkits like Kapow Technologies. Some Beltway Bandits operate low profile content filtering systems for governmental and commercial clients. And there is the old nemesis, Googzilla, happily chewing through one trillion documents every few days. Finally, some of the financial institutions themselves have pumped support into outfits like Connotate. Even the struggling Time Warner owns some nifty technology in the Relegence unit. So, what’s new?
Three thoughts as I prepare to comment about the push into perceived real time processing at the International Online Show:
- The cost of slashing latency with any type of content is going to be one expensive proposition. Not even some governments have the cash to muscle up a server with terabytes of RAM. Yep, terabytes.
- Figuring out what process left another process gasping for air requires programmers who can plow through code looking for an errant space, an undefined variable, or a bit of an Assembler hack that pushed when it should have popped
- Latency outside the span of control of the system can render some outputs just plain wrong. Delay is bad; bad outputs are even worse.
If you have not been tracking Microsoft’s big initiatives, you may want to spend some more time grinding through the ACM and other scholarly papers such as “Towards Loosely Coupled Programming on Petascale Systems” and poking around on the Microsoft Web site. To find useful stuff, I use the Google Microsoft index. If you aren’t familiar with it, check it out here.
I wonder if this stuff will be part of SharePoint 2011 and available as a Microsoft Fast ESP plug-in?
Stephen Arnold, December 1, 2009
Yes, oh, yes. Let me disclose to the National Institute of Standards and Technology that I was not paid to write this humorous essay. Consider it a name day present. If I am late, that’s latency. If I am early, that’s predictive output.
Microsoft and the Cloud Burger: Have It Your Way
November 19, 2009
I am in lovely and organized Washington, DC, courtesy of MarkLogic. The MarkLogic events pull hundreds of people, so I go where the action is. Some of the search experts are at a search centric show, but search is a bit yesterday in my opinion. There’s a different content processing future, and I want to be prowling that busy boulevard, not sitting alone on a bench in the autumn of a market sector.
The MarkLogic folks wanted me to poke my nose into its user meeting. That was a good experience. And now I am cooling my heels for a Beltway Bandit client. I have my watch and my wallet. With peace of mind, I thought I would catch up on my newsreader goodies.
I read with some surprise “Windows Server’s Plan to Move Customers Back Off the Cloud” in Betanews. As I understand the news story, Microsoft wants its customers to use the cloud, the Azure service. Then, when the fancy strikes, the customer can license on premises software and populate big, hot, expensive-to-maintain servers in the licensee’s own data center. I find the “have it your own way” angle appealing. I was under the impression that the future was the cloud. If I understand this write up, the cloud is not really the future. The “future” is the approach to computing that has been here since I took my first computer programming class in 1963 or so.
I found this passage in the article interesting:
If you write your code for Windows Server AppFabric, it should run on Windows Azure,” said Ottaway, referring to the new mix-and-match composite applications system for the IIS platform. “What we are delivering in 2010 is a CTP [community technology preview] of AppFabric, called Windows Azure AppFabric, where you should be able to take the exact same code that you wrote for Windows Server AppFabric, and with zero or minimal refactoring, be able to put it up on Windows Azure and run it.” AppFabric for now appears to include a methodology for customers to rapidly deploy applications and services based on common components. But for many of these components, there will be analogs between the on-Earth and off-Earth versions, if you will, such that all or part of these apps may be translated between locales as necessary.
Note the “shoulds”. Also, there’s a “may be”. Great. What does this “have it your own way” mean for enterprise search?
First, I don’t think that the Fast ESP system is going to be as adept as Blossom, Exalead, or Google at indexing and serving results from the cloud for enterprise customers. The leader in this segment is not Google. I would give the nod to Blossom and Exalead. There’s no “should” with these systems. Both deliver.
Second, the latency for a hybrid application when processing content is going to be an interesting challenge for those brave enough to tackle the job. I recall some issues with other vendors’ hybrid systems. In fact, these performance problems were among the reasons that these vendors are not exactly thriving today. Sorry, I cannot mention names. Use your imagination or sift through the articles I have written about long gone vendors.
Third, Microsoft is working from established code bases and adding layers—wrappers, in my opinion—to these existing chunks of code. That’s an issue for me because weird stuff can happen. Yesterday one Internet service provider told me that his shop was sticking with SQL Server 2000. “We have it under control”, he said. With new layers of code, I am not convinced that those building a cloud and on premises solution using SharePoint 2010 and the “new” Fast ESP search system are going to have stress free days.
In short, more Microsoft marketing messages sound like IBM’s marketing messages. Come to think of it, hamburger chains have a similar problem. I think this play is jargon for finding ways to maximize revenues, not efficiencies for customers. When I go to a fast food chain, no matter what I order, the stuff tastes the same and delivers the same health benefits. And there’s a “solution accelerator.” I will have pickles with that. Just my opinion.
Stephen Arnold, November 19, 2009
I hereby disclose to the Internal Revenue Service and the Food and Drug Administration that this missive was written whilst waiting for a client to summon me to talk about topics unrelated to this post. This means that the write up is a gift. Report it as such on your tax report and watch your diet.
Analyst Underestimates Impact of Microsoft and Google in Enterprise Search
October 2, 2009
I have written a report for the Gilbane Group and think highly of the firm’s analysis. However, a recent news item forwarded to me by a reader of this Web log triggered some conversation in Harrod’s Creek today. The article was “Competition among Search Vendors,” and it was dated in early August 2009. The article included this assertion:
This additional news item indicates that Microsoft is still trying to get their search strategy straightened out with another new acquisition, Applied Discovery Selects Microsoft FAST for Advanced E-Discovery Document Search. E-discovery is a hot market in legal, life sciences and financial verticals but firms like ISYS, Recommind, Temis, and ZyLab are already doing well in that arena. It will take a lot of effort to displace those leaders, even if Microsoft is the contender. Enterprises are looking for point solutions to business problems, not just large vendors with a boatload of poorly differentiated products.
In the last three months, I have noticed an increase in the marketing activity from quite a few enterprise search and content processing vendors. On one hand, there are SharePoint surfers. These are vendors who don’t want to be an enterprise solution. The vendors are content to license their technology to some of the 100 million SharePoint licensees. Forget search. SharePoint is a market that can be harvested.
On the other hand, there are vendors in the process of changing their stripes, and not just once. Some companies have moved from intelligence applications to marketing to sentiment to call center applications. I have to tell you. I don’t know if some of these companies are even in the search business any more.
Looming over the entire sector are the shadows of Google and Microsoft. Each has a different strategy for sucking revenue blood from the other. Smaller vendors are going to need agility to avoid getting hurt when these two semi-clueless giants rampage across the enterprise to do battle with one another.
The notion that niches will support the more than 300 companies in the search and content processing market may be optimistic. I wanted to use a screen shot from the TV show Fantasy Island, but I won’t. A number of search vendors are gasping for oxygen now. I am keen to keep a sunny outlook on life. But when it comes to search and content processing, realism is useful. Just ask some of the musical-chair executives making the rounds among the search and content processing companies.
What will Google and Microsoft do to get shelf space in the enterprise? What do big companies with deep pockets typically do? Smaller companies can’t stay in this winner take all game.
Stephen Arnold, October 2, 2009
Exclusive Interview with SurfRay President
September 29, 2009
SurfRay has come roaring back into the search and content processing sector. SurfRay, like many other companies, had to tighten its belt and straighten its tie in the wake of the global financial turmoil. With the release of new versions of Ontolica (a search system that snaps into SharePoint) and MondoSearch (a platform independent Web site search solution), SurfRay is making sales again. ArnoldIT.com spoke with Søren Pallesen about the firm’s new products. In a wide ranging interview, Mr. Pallesen, a former Gartner Group consultant, said:
SurfRay’s mission is to deliver tightly packaged search software solutions for businesses to provide effective search for both internal and external users. By packaged we mean easy to try, install, and use. Our vision is to be our customer’s first choice for packaged enterprise search solutions and to become the third largest search solution provider in the world measured on number of paying business customers by 2012. The last six months have been an exciting time for SurfRay. I took over as CEO; we significantly increased investment in product development and began an ambitious expansion of the organization. This has paid off. SurfRay is profitable, and we have released new versions of our core products. Ontolica is now in version 4.0, including a new suite of reporting and analytics, and MondoSearch 5.4 is in beta for a Q4 release. As a profitable company we are in the fortunate position of being able to fund our own growth, and we are expanding in North America, among other things by hiring more sales people and forming a Search Expert Center in Vancouver, Canada that will serve customers across the Americas. We are also expanding in Europe, most recently with the formation of SurfRay UK and Ireland, allowing us to expand sales and support with local people on the ground in this important European market.
When asked about the difference between MondoSearch and Ontolica, Mr. Pallesen told me:
Customers that buy our products typically fall into a number of usage scenarios. Simply put, Ontolica solves search problems inside the firewall and MondoSearch outside the firewall. Firstly, customers with SharePoint implementations look for enhanced search functionality and turn to our Ontolica for SharePoint product. Secondly, businesses that do not use SharePoint but need an internal search solution spanning an intranet, file servers, email, applications, and other sources buy Ontolica Express and use it in combination with Microsoft Search Server Express for a simple single server installation or Microsoft Search Server for multiple load-balanced server installations. Thirdly, customers who need robust and highly configurable Web site search buy MondoSearch. It is especially popular with businesses that want to implement up-selling and cross-selling on their search results page.
You can read the full text of the interview in the Search Wizards Speak series on ArnoldIT.com. For more information about SurfRay, visit the company’s revamped Web site at http://www.surfray.com.
Stephen Arnold, September 29, 2009
Explaining the Difference between Fast ESP and MOSS 2007 Again
September 7, 2009
When a company offers multiple software products to perform a similar function, I get confused. For example, I have a difficult time explaining to my 88 year old father the differences among Notepad, WordPad, Microsoft Works’ word processing, Microsoft Word’s word processing, and the Microsoft Live Writer he watched me use to create this Web log post. I think it is an approach like the one the genius at Ragu spaghetti sauce used to boost sales of that condiment. When my wife sends me to the store to get a jar of Ragu spaghetti sauce, I have to invest many minutes figuring out what the heck is the one I need. Am I the only male who cannot differentiate between Sweet Tomato Basil and Margherita? I think Microsoft has taken a different angle of attack because when I acquired a Toshiba netbook, the machine had Notepad, WordPad, and Microsoft Works installed. I added a version of Office and also the Live Writer blog tool. Some of these were “free”, and other products came with my MSDN subscription.
Now the same problem has surfaced with basic search. I read “FAST ESP versus MOSS 2007 / Microsoft Search Server” with interest. Frankly I could not recall if I had read this material before, but quite a bit seemed repetitive. I suppose when trying to explain the differences among word processors, the listener hears a lot of redundant information as well.
The write up begins:
It took me some time but i figured out some differences between Microsoft Search Server / MOSS 2007 and Microsoft FAST ESP. These differences are not coming from Microsoft or the FAST company. But it came to my notice that Microsoft and FAST will announce a complete and correct list with these differences between the two products at the conference in Las Vegas next week. These differences will help me and you to make the right decisions at our customers for implementing search and are based on business requirements.
Ah, what’s different is that this is a preview of the “real” list of differences. Given the fact that the search systems available for SharePoint choke and gasp when the magic number of 50 million documents is reached, I hope that the Fast ESP system can handle the volume of information objects that many organizations have on their systems at this time.
The list in the Bloggix post numbers 14. Three interested me:
- Scalability
- Faceted navigation
- Advanced federation.
Several observations:
First, scalability is an issue with most search systems. Some companies have made significant technical breakthroughs to make adding gizmos painless and reasonably economical. Other companies have made the process expensive, time consuming, and impossible for the average IT manager to perform. I heard about EMC’s purchase of Kazeon. I thought I heard that someone familiar with the matter pointed to problems with the Fast ESP architecture as one challenge for EMC. In order to address the issue, EMC bought Kazeon. I hope the words about “scalability” are backed up with the plumbing required to deliver. Scaling search is a tough problem, and throwing hardware at hot spots is, at best, a very costly dab of Neosporin.
Second, faceted navigation exists within existing MOSS implementations. I think I included screenshots of faceted navigation in the last edition of the Enterprise Search Report I wrote in 2006 and 2007. There was a blue interface and a green interface. Both of these made it possible to slice and dice results by clicking on an “expert”, identified by counting the number of documents a person wrote with a certain word in them. There were other facets available as well, although most were more sophisticated than the “expert” function. I hope that the “new” Fast ESP implements a more useful approach for users of Fast ESP. Of course, identifying, tagging, and linking facets across processed content requires appropriate computing resources. That brings us back to scaling, doesn’t it? Sorry.
Third, federation is a buzzword that means many different things because vendors define the term in quite distinctive ways. For example, Vivisimo federates, and it is or was at one time a metasearch system. The query went to different indexing services, brought back the results, deduplicated them, put the results in folders on the fly, and generated a results list. Another type of federation surfaces in the descriptions of business intelligence systems offered by SAP. The system blends structured and unstructured data within the SAP “environment”. Others are floating around as well, including the repository solutions from TeraText, which federates disparate content into one XML repository. What I find interesting is that Microsoft is not delivering plain “federation”, which goes undefined. Microsoft is, according to the Bloggix post, on the trail of “advanced federation”. What the heck does that mean? The explanation is:
FAST ESP supports advanced federation including sending queries to various web search APIs, mixing results, and shallow navigation. MOSS only supports federation without mixing of results from different sources and navigation components, but showing them separately.
Okay, Vivisimo and SAP style for Fast ESP; basic tagging for MOSS. Hmm.
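To make the distinction concrete, here is a toy version of metasearch-style federation with result mixing: fan the query out to several sources, merge, and deduplicate by URL into one list. The sources and hits below are invented for illustration; per the quoted passage, the MOSS approach would instead show each source’s results separately.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Metasearch-style federation: query several sources, mix, deduplicate.
public class FederationSketch {
    record Hit(String url, String title, double score) {}

    static List<Hit> searchIntranet(String q) {
        return List.of(new Hit("http://intranet/a", "Policy A", 0.9));
    }
    static List<Hit> searchWeb(String q) {
        return List.of(new Hit("http://example.com/a", "Policy A (public)", 0.8),
                       new Hit("http://intranet/a", "Policy A", 0.7)); // duplicate
    }

    public static void main(String[] args) {
        String query = "travel policy";
        List<Hit> merged = new ArrayList<>();
        merged.addAll(searchIntranet(query)); // higher-priority source first
        merged.addAll(searchWeb(query));

        // Deduplicate by URL, keeping the first hit seen; the single mixed
        // list is the part MOSS, per the post, does not produce.
        Map<String, Hit> unique = new LinkedHashMap<>();
        for (Hit h : merged) unique.putIfAbsent(h.url(), h);
        unique.values().forEach(h -> System.out.println(h.title() + "  " + h.url()));
    }
}
```

Real systems add the hard parts this sketch skips: normalizing relevance scores across sources, security trimming, and waiting on the slowest source without stalling the page.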
To close, I think that the Fast ESP product is going to add a dose of complexity to the SharePoint environment. Despite Google’s clumsy marketing, the Google Search Appliance continues to gain traction in many organizations. Google’s solution is not cheap. People want it. I think Fast ESP is going to find itself in a tough battle for three reasons:
- Google is a hot brand, even within SharePoint shops
- Microsoft-certified search solutions are better than Fast ESP, based on my testing of search systems over the past decade
- The cost savings pitch is only going to go so far. CFOs eventually will see the bills for staff time, consulting services, upgrades, and search related scaling. In a lousy financial environment, money will be a weak point.
I look forward to the official announcement about Fast ESP; the $1.2 billion Microsoft spent for this company is now going to have to deliver. I find it unfortunate that the police investigation of alleged impropriety at Fast Search & Transfer has not been resolved. If a product was as good as Fast ESP was advertised to be, what went wrong with the company, its technology, and its customer relations prior to the Microsoft buy out? I guess I have to wait for more information on these matters. When you have a lot of different products with overlapping and similar services, the message I get is more like the Ragu marketing model, not the solving of customer problems in a clear, straightforward way. Sigh. Marketing, not technology, fuels enterprise search these days, I fear.
Stephen Arnold, September 7, 2009