Self Service Business Intelligence
May 7, 2008
Space-Time Research, based in Melbourne, Australia, is stepping up its marketing activities for its “self-service business intelligence” system. The announcement is here. On May 6, 2008, the company said that Kerry Araujo, a former SAS and IBM executive, will head STR’s sales operations and partnership development activities. The company also said that it had named Jack Duncan to lead the STR global alliance development program.
The company says that its core product, SuperSTAR:
provides statistical analysis, visualization, ad hoc analytics, geo-spatial analysis, and reporting. SuperSTAR technology is based on high performance, secure and confidentialized management of very large and/or complex databases.
STR asserts that it is the “global leader in self-service business intelligence for government”. The idea is that data and content analysis can be performed without the intervention of a programmer. Most “industrial strength” analytic tools are impenetrable to the average manager. Firms like SAS and SPSS are making their systems easier to use, but without a background in analysis and keen math skills, the key functions can’t be tapped unless a specialist wise in the ways of these systems intervenes.
When I was in Australia in November 2007, I heard about this company’s products. The firm has a strong customer base in Australia and in government and public agencies elsewhere in the world. You can learn more by navigating to the company’s Web site here.
Stephen Arnold, May 7, 2008
SharePoint Express Installation Tips
May 7, 2008
Brevity is the soul of wit. You can make your SharePoint experience a really happy one, assert Anindya Roy and Vijay Chauhan, in their “Add Search to Your Intranet”. The article appeared on the May 7, 2008, PCQuest Web site.
Beyond Search’s experience with SharePoint suggests that installation, customization, tuning, and troubleshooting are a wee bit more complicated. The authors say:
By using this server you can crawl and index each and every piece of data on your network and storage pools, and make it searchable through a simple but powerful interface. The data which it can index are web sites, file shares, Exchange public folders, Notes Databases, Office files, XML, HTML, text files, etc. … Besides searching the internal company files, Search Server Express has the ability to search external databases and Web sites.
Nary a mention of Fast Search & Transfer or Fast ESP (enterprise search platform). That’s a work in progress, and I anticipate a follow-up on that system when it becomes available. In the meantime, figuring out SharePoint Express, MOSS, and ESS is enough work for me.
Stephen Arnold, May 7, 2008
Enterprise Search and Train Wrecks
May 7, 2008
After I completed my interview with the Intelligenx executives, I thought about one of their comments. Iqbal Talib said, “We have many clients who want a point solution, not an enterprise solution”. An executive at Avalon Consulting wrote me today and echoed the Intelligenx comment.
Enterprise search may be a train wreck for more than half of the people who use today’s most popular systems. The Big Name vendors can grouse, stomp, and sneer at this assertion. Reality: Most of these systems disappoint their licensees. When a search system “goes off the rails”, the consequences can be unexpected.
When an enterprise system goes off the rails, the damage is considerable. Even worse, moving the wreckage out of the way is real work. But even more difficult is earning back the confidence of the passengers.
A Case Example
A major European news organization licensed a Big Name system. The company ponied up a down payment and asked for a fast-cycle installation. After six months of dithering, the Big Name admitted that it did not have an engineer available who could perform the installation and customization the paying customer wanted.
The news organization pulled the plug. The company then licensed one of the up-and-coming systems profiled in Beyond Search. The revamped system was available in less than three weeks at a fraction of the cost for the Big Name system.
The new system works, and it has become a showcase for the news organization. For the Big Name, the loss of the account eroded already shaky finances and became the talk of cocktail parties at industry functions.
Ever wonder how much churn Big Name enterprise search vendors experience in a year? You can get a good idea by comparing the customer lists of the best-known enterprise search vendors. The overlap is remarkable because large companies work their way through the systems. Now more are turning to up-and-coming vendors’ systems. The Big Names are facing some sales push back. Take a look at the financials for publicly traded search vendors. Look for days-sales-outstanding data. Look at the cash reserves. Look at the footnotes about restating financials.
What you may find is that fancy dancing is endemic.
How Many Search Systems Does One Company Need?
What haunts me is the overlap among vendors. Early in 2003, I conducted a poll of Fortune 1000 companies. The methodology was simple: I sent an email with several basic questions to people whom I knew at 150 different large organizations. I received a response rate of about 70 percent, which was remarkable. One question I asked five years ago was, “How many enterprise search systems do you have?”
dtSearch Goes 64 Bit
May 7, 2008
dtSearch, long a staple for developers wanting to embed a full-featured search system in an application, announced its line of 64-bit developer products. Based in Bethesda, Maryland, dtSearch offers a wide range of search solutions.
The company’s technology–profiled in the first three editions of my Enterprise Search Report–offered a solid combination of speedy indexing, fast query processing, and a number of useful features for users, system administrators, and developers. You can add natural language processing functionality to the dtSearch system with technology from Bitext in Madrid, Spain.
The new 64-bit developer edition can support larger indexes, although dtSearch’s engineers had figured out how to crunch large amounts of text in its 32-bit version. The terabyte indexer complements dtSearch’s other functions, including remote data spidering, handling static and dynamic data on publicly-accessible and secure sites, and hit highlighting.
dtSearch has been in business since 1991, and it offers a robust search and retrieval system at competitive prices. You can learn more about the company and its products at the dtSearch Web site.
You can use dtSearch for enterprise search, and you can also license a version of the system to make a CD or DVD stuffed with data searchable. Beyond Search’s experience with this product, particularly in troublesome Microsoft Windows and SharePoint environments, has been positive.
Stephen Arnold, May 7, 2008
US Government Uses AdWords
May 6, 2008
By the time you read this, the estimable Financial Times will have renamed the file, moved it to a digital dungeon, and besieged you with advertisements. The headline that stopped me in my web-footed tracks is, “US Advertises on Google to Snare Surfers”. Click here for what I hope is the original FT link.
The idea is that traffic to a US government site–America.gov–needs to be goosed (no pun intended, dear logo). Do you think the government might use content? Do you think the US government might use backlinks from high-traffic Web sites? Do you think the government might use nifty Web 2.0 features? Keep in mind that this site’s tag line is, “Telling America’s story”.
The answer is, “No.” The US government bids for such zippy terms as terrorism. The person who clicks on an advertisement gains an insight into the American government’s psyche.
The FT story said:
In recent months the US administration has quietly been running the advertisements for its America.gov site, which is intended to give foreign audiences the Washington take on US foreign policy, culture and society.
I am not doing any government work at this time. I hope someday to meet the consultant who came up with this idea. I will try to get this wizard to take me to lunch. I have a hunch this consultant made some money on this project.
Stephen Arnold, May 6, 2008
Google and Semantics: More Puzzle Pieces Revealed
May 6, 2008
On May 5, 2008, Search Engine Roundtable carried an interesting post, “Google Improves Semantic Search”. You can find the post here. The key point is that Google is using truncation “to stem complex plurals”. Search Engine Roundtable points to the Google Groups thread as well. That link is here.
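To make the phrase “to stem complex plurals” concrete, here is a toy sketch of suffix-truncation stemming. Google’s actual implementation is not public; the function name, rules, and irregular-plural table below are my own illustration of the general technique only.

```python
def naive_plural_stem(term: str) -> str:
    """Reduce common English plural forms to a likely singular stem.

    A toy illustration of suffix truncation -- not Google's algorithm.
    """
    # A few irregular plurals that truncation alone cannot handle.
    irregular = {"children": "child", "geese": "goose", "mice": "mouse"}
    if term in irregular:
        return irregular[term]
    if term.endswith("ies") and len(term) > 4:
        return term[:-3] + "y"          # "queries" -> "query"
    if term.endswith("es") and term[-3] in "sxz":
        return term[:-2]                # "boxes" -> "box"
    if term.endswith("s") and not term.endswith("ss"):
        return term[:-1]                # "engines" -> "engine"
    return term                         # "glass" stays "glass"
```

A real engine would fold stems like this into its index so that a query for “query” also matches documents containing “queries”.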
Google’s been active in semantics for a number of years. In 2007, I provided information to the late, great Bear Stearns’ Internet team. Based on my work, Bear Stearns issued a short note about Google’s semantic activity. This document may be available from a Bear Stearns’ broker, if there is one on the job.
An in-depth discussion of five Google semantic-centric inventions appears in Google Version 2.0. This analysis pivots on five patent applications filed in February 2007. A sole inventor, Ramanathan Guha, describes a programmable search engine that performs semantic analysis and stores various metadata in a context server. The idea is that the context of a document, a user, or a process provides important insights into the meaning of a document. If you are a patent enthusiast, the five Guha inventions are:
- US2007 0038616, filed on August 10, 2005, and published on February 15, 2007, as “Programmable Search Engine”
- US2007 0038601, filed on August 10, 2005, and published on February 15, 2007, as “Aggregating Content Data for Programmable Search Engines”
- US2007 0038603, filed on August 10, 2005, and published on February 15, 2007, as “Sharing Context Data across Programmable Search Engines”
- US2007 0038600, filed on August 10, 2005, and published on February 15, 2007, as “Detecting Spam-Related and Biased Contents for Programmable Search Engines”
- US2007 0038614, filed on August 10, 2005, and published on February 15, 2007, as “Generating and Presenting Advertisements Based on Context Data from Programmable Search Engines”.
These patent documents don’t set a timetable for Google’s push into semantics. It is interesting to me that an influential leader in the semantic standards effort invented the PSE or programmable search engine. Dr. Guha, a brilliant innovator, demonstrates that he is capable of doing a massive amount of work in a short span of time. I recall that he joined Google in early 2005, filing more than 130 pages describing semantic systems and methods in less than nine months. I grouped these because filing five documents on the same day, with each document nudging Google’s semantic invention forward from a slightly different angle, struck me as interesting.
Stephen Arnold, May 7, 2008
London Times Says Google’s Unhealthy Dominance Will End
May 6, 2008
A cultured journalist, David Rowan, argues that “Google’s unhealthy dominance will end”. Read the story here, before it becomes unfindable in the murky depths of the (London) Times Online, “the news site of the year”. I don’t agree with the conclusion nor do I agree with the reasoning in the article, but it will be important, particularly in London’s financial sweat shops.
The points that jumped out at me cluster under this statement, “They [Google management] feel pretty damn lucky over in Google’s Mountain View headquarters this week.” Here’s my take on the argument presented in this article:
- The Microsoft Yahoo tie up would have been good for Microsoft and bad for Google
- Google’s monopoly is “in none of our interests”
- The changes in information will be significant and Google will play a big part in them
- Microsoft and Yahoo have a chance to develop more products and services “that people actually want”
My thought is that the notion of Microsoft and Yahoo building products that people want is partially correct, almost like horseshoes, where getting close to the stake earns a point. The problem is that Google is an infrastructure company, and it has an operational advantage and a cost advantage.
You have to be “pretty damn lucky” if you develop a product and expect it to run fast, run economically, and run at scale on the plumbing Microsoft and Yahoo now depend upon. Google’s products and services are a by-product of its infrastructure and its engineering. Until the competition figures this out and responds with a leapfrog solution, Google faces no significant competition from Microsoft or Yahoo. As I argue in Google Version 2.0, Google faces many challenges. These range from keeping staff on the team and productive to interpersonal relationships among Messrs. Brin, Page, and Schmidt. A focus on products and services won’t narrow Google’s engineering lead, which I estimate at 12 to 24 months and increasing.
Stephen Arnold, May 6, 2008
Selecting an Enterprise Search System: The Mid-Sized Company Dilemma
May 6, 2008
Earlier today (May 5, 2008) I received a telephone call from a journalist seeking my thoughts about this question: How do mid-sized companies select an enterprise search system? As you know, I call this type of search “behind-the-firewall search”. There’s considerable confusion about Web search, search on a particular Web site, ecommerce search, and the other denizens of the search phylum in the kingdom of information within the domain of knowledge. (I feel biological at the moment after a day of considering how Google vapor sucked oxygen from the Microsoft-Yahoo deal.)
This morning (May 6, 2008), my RSS reader proudly displayed an Information Week article penned by George Dearing, “Why Is It So Hard to Be Found”. The article was interesting because Mr. Dearing used the phrase “within the firewall” to describe enterprise search. As you know, I prefer the phrase “behind the firewall”, but he’s close enough for horseshoes. You can read this essay here. Click quickly; articles on the pop-up besieged Information Week Web site can be, as Mr. Dearing notes, “so hard to be found”.
The point he makes that struck me was:
For something so critical to content as search, you’d think that companies would have more to show for it than misguided enterprise search implementations… I’ve always had a hard time getting my arms around the space, much less the application of specific search-oriented approaches…. Sam Mefford, a search consultant with Avalon Consulting, made me feel a little better recently when he told me, “I’m moving away from the terminology Enterprise Search wherever possible, and moving to just Search, because most organizations simply aren’t ready for Enterprise Search.” After talking to him, it seems the challenge for enterprise search is the same as for other enterprise software sectors: A lot of work was put into technology and software development but the needs of users have largely been ignored.
I am no longer alone on my bandwagon. Avalon Consulting is on board. I was encouraged by the journalist’s call as well. Some are looking at the search vendors’ assertions and seeing the handiwork of PR mavens, not programmers.
In Beyond Search, I provide quite a bit of tactical information for fixing a search system gone bad. But in the 3rd edition of the Enterprise Search Report, I slogged through the formal procurement process applicable to organizations of almost any substance. And what about the journalist’s questions? The young lass had done her homework. She wanted to know about user dissatisfaction (hovering around 60 percent), methods of selecting systems in mid-sized companies (an underserved sector), and the payoff from a good system (it is easier to explain the cost of not having information access).
The business end of a piranha. Imagine a procurement team swimming in a calm Brazilian river. Above, the giant search vendors circle like hungry vultures. In the river, a swarm of ravenous up-and-coming search engine vendors wants to nibble on the procurement team. Mid-market companies find themselves in the middle when it comes to licensing a search system. Big, aggressive folks above and feisty smaller ones below make it tough for mid-sized firms to make a well-reasoned search system acquisition.
After the pleasant telephone talk with the reporter, I continued thinking about the characteristics of a mid-sized company. I define “mid-sized” as a firm with revenues between $50 million and $300 million. This is a company size that is caught in the middle. Vulnerable to incursions by far larger companies in search of new revenue, the mid-sized company is a tempting target. IBM, Microsoft, and Oracle have signaled an interest in the companies in this sector. Nibbling away like tiny piranha on the toes of swimmers, start-ups and small companies with revenues below the magic $50 million threshold want to gobble the swimmers’ calves, maybe the entire swimmer.
In general, occupying the mid-market requires considerable attention to bigger companies and to an innumerable swarm of smaller outfits. Of the 15 to 18 million businesses in the US, most are smaller than mid-market firms. The 2,000 or 3,000 giant-sized enterprises have the resources to prey on the mid-market.
To survive, mid-market companies have to work hard, deliver acceptable customer service, and market effectively. One slip-up, and the weaker mid-market company is a snack for a larger organization or a feast for a smaller predator.
Not surprisingly, selecting a behind-the-firewall search system boils down to one of three broad strategies. I’ve substantiated these via my survey and interview work. I am on the lookout for more anecdotes, including survey data, that can illuminate the interesting world of mid-market companies.
ZyLAB’s Dr. Johannes Scholtes Interviewed
May 5, 2008
ZyLAB’s chief executive officer, Dr. Johannes Scholtes, said in an exclusive interview for the “Search Wizards Speak” series that the company has more than 7,500 licensees worldwide. This customer base puts the company on a par with search sector leaders Autonomy, Fast Search & Transfer (Microsoft), and Google.
He told ArnoldIT.com, sponsor of the Search Wizards Speak series:
Our approach has been to say to our customer, “Here’s our list of components. Just select the ones you need. You pay only for these, so we don’t ask our customers to pay huge fees for functions that will never be used.” Our modular approach is now mature, and I see more vendors in Europe and the US emulating what we’ve been doing for a long time. Our customers tell us our “couple-of-day” deployments are very unusual. For us, fast deployment is business as usual. These three and six month installation efforts are problems for many organizations, and these become great sales leads for us.
The failure of key word search to meet the needs of today’s organizations is becoming better known. ZyLAB, according to Dr. Scholtes, has pushed beyond the search box. He said:
In the basic search, a user can see the number of hits for a query, hit-density ranking, file date and time for creation, modification, and access. There are many other features in basic mode. For advanced search, you can rank on automatically extracted entities, including names, companies, countries, measurements, dates, monetary amounts, and named-phrases. You can rank by semantic relevance using an automatically derived taxonomy or your own taxonomy. Results can be personalized. You can organize result lists in a variety of ways. You can run a query on a linguistic pattern like “a person got a job” and then rank results in these patterns higher than hits in the full text. Through all this additional meta information, we can support clustering, full text similarity inside documents where precision and recall can be set.
He made the point that ZyLAB’s relevance ranking algorithms are not locked up like those from other well-known vendors.
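For readers wondering what “hit-density ranking” means mechanically, here is a minimal sketch: score a document by the share of its tokens that match query terms, so shorter documents saturated with hits rank above long documents with scattered hits. This is my own illustration of the general idea, not ZyLAB’s algorithm, whose details the company does not publish in this form.

```python
def hit_density_score(doc_tokens: list[str], query_terms: list[str]) -> float:
    """Score a document by hit density: matching tokens / total tokens.

    A toy illustration of hit-density ranking -- not ZyLAB's implementation.
    """
    if not doc_tokens:
        return 0.0
    query = {t.lower() for t in query_terms}
    hits = sum(1 for t in doc_tokens if t.lower() in query)
    return hits / len(doc_tokens)


# A short, focused document outranks a long one with the same hit count.
short_doc = ["search", "engines", "rank", "search", "results"]
long_doc = ["search"] + ["filler"] * 19
```

Running `hit_density_score(short_doc, ["search"])` yields 0.4, while the twenty-token `long_doc` with a single hit scores 0.05, so the focused document wins.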
You can read the full interview in the Search Wizards Speak section of the ArnoldIT.com Web site. This is the 12th interview in the series. An index of the previous interviews is here.
Stephen Arnold, May 5, 2008
MuseGlobal Adheres to Google
May 5, 2008
MuseGlobal, a rapidly-growing content platform company, has teamed with Google integrator Adhere Solutions to deliver next-generation content solutions.
The companies have teamed to create an All Access Connector. With a Google Search Appliance, a bit of OneBox API “magic”, and the Adhere engineering acumen, organizations can deploy a next-generation information access solution.
You can read more about the tie-up here. (Hurry, these media announcements can disappear without warning.) This deal will almost certainly trigger a wave of close scrutiny and probably some me-too behavior. Traditional content aggregators and primary publishers have lagged behind the Google “curve” for almost a decade. MuseGlobal’s aggressive move may lead others to take a more proactive, less combative and defensive posture toward Google. Content providers, mostly anchored in the library world of “standing orders”, are struggling as much as traditional publishers to figure out how to generate new revenues as their traditional cash foundations erode beneath them. For some, it may be too late.
You can read about IDC’s “success” here. On the other hand, you can read about the “non success” of the New York Times, for example, here.
Disclosure: My son is involved with Adhere. Even more interesting is that I delivered a dose of “Google reality” to a MuseGlobal executive at the recent eContent conference in Scottsdale, Arizona. Obviously some of my analyses of Google as an application platform hit a nerve.
Stephen Arnold, May 5, 2008