One Reason Why Microsoft May Not Make Search a Success

June 26, 2008

The Bill Gates “noise” echoed in Kentucky. I read PCMag.com’s “Exclusive: The Bill Gates Exit Interview” here. The interview merits your attention. I zoned out at the references to “the platform” and choked when I encountered this comment: “Everything in computer science is to just write less code.” And I was baffled by references to a “natural user interface”. But I am a Kentucky hillbilly.

I tried to avoid reading about “Bill Gates’ Web Experience”. Michael Krigsman does work I enjoy, but I was hooked. Mr. Krigsman pulled the best bits from a PDF of an email exchange here. I discovered that this was a “flame” among Microsofties. You can read SeattlePI.com’s take on the exchange and learn why the PDF has confidential stamped on it.

I read the emails and ignored the complaints about Mr. Gates’s problems using Windows XP. What’s new?

The email put the PCMag.com interview into perspective for me. Here is the key line in the email thread. One Microsoftie writes, “I am owning the website issues.” [sic].

Now, for me the telling comment is in a response to this person’s attempt to provide leadership, accept responsibility for the mess, and fix the problem. Ready? Here is what a Microsoft employee identified as Mike Beckerman wrote: “I don’t know what it means to ‘own website issues…’”. I have added the emphasis.

Now my observations:

  1. I am no leader, but I recognize that the person stepping forward to assume responsibility is walking and talking like a leader. Leaders are good because good leaders make things happen. For a colleague not to know what it means to “own Web site issues” is snide. In some organizations, the comment would be close to insubordinate.
  2. When colleagues cannot cede control, preferring to keep the status quo, the management process is in danger of veering off track. The email exchange took place in 2003 and now it is 2008. The Yahoo deal flopped. Vista is an issue for some. The enterprise search and Web search initiatives are spinning their wheels. I would assert that these are examples of flawed management and a refusal by colleagues to sort out their differences and find a leader to guide them forward.
  3. Google may have some challenges ahead. But if this email exchange is accurate (it may be a hoax for all I know), Microsoft may have some trouble closing the gap with Google in advertising, search, and cloud-based services. Google is a great many things, but so far it has avoided the headwind caused by employees who disregard a plea for changes from the fellow who founded the company.

Hopefully, I won’t have to read any more about Mr. Gates’s retirement, which, I believe, has him on the Microsoft campus two or three days a week. Oh, the problems identified in the 2003 “flame” emails are still around. No one was able to fix them. Well, there is always next year, which is what IBM said about OS/2.

Stephen Arnold, June 26, 2008

Google: Snuggling with OCLC

June 25, 2008

Digital Document Quarterly, Volume 7, Number 2, 2008 provided this item:

OCLC and Google have agreed to exchange book discovery data. Google will link from Google Book Search to WorldCat, which will drive traffic to online library services. Google will also share digitized book data. WorldCat will represent OCLC member library collections and link books scanned by Google. A user who finds a book in Google Book Search will be able to use WorldCat to find local library copies.

You can read the DDQ at http://home.pacbell.net/hgladney/ddq_7_2.htm. I recommend the publication if you have an interest in the library side of online information and digital documents.

My view of this is that slowly, ever so slowly, Google is encroaching on the traditional database world. I am confident the management gurus at ProQuest, Ebsco Electronic Publishing, Newsbank, and the other firms servicing this important but shrinking market have a GPS device on Googzilla.

A happy quack to H.M. Gladney from the Beyond Search goose.

Stephen Arnold, June 25, 2008

Hosted SharePoint

June 25, 2008

Tired of trying to figure out where SharePoint put a file? Relief is available from an outfit called SharePoint 360, a company offering cloud-based SharePoint. You can read about the company here. SharePoint 360 is a Microsoft Gold Certified partner. If this service takes off, Microsoft will move forward with more software-as-a-service offerings. Details about the hosted SharePoint service are here.

The company says:

Our approach allows for even the most non-technical users to quickly get started and feel as comfortable working with Microsoft SharePoint as they do with Word or Excel.

The service warrants a test drive.

Stephen Arnold, June 25, 2008

IBM Search: A Trial of Patience for Customers

June 25, 2008

A quick question. What is the url for IBM’s public Web search? Ah, you did not know that IBM had a Web search system. I did. IBM’s crawler once paid a quick visit to my Web site years ago. You can use this service yourself. Navigate to http://www.ibm.com/search. The service is called the IBM Planetwide Web.

Let us run a test query. My favorite test query is for an IBM server called the PC704. I once owned two of these four processor Pentium Pro machines. For years I wanted to upgrade the memory to a full gigabyte, so I became a regular Sherlock Holmes as I tried to find memory I could afford.

Here are the results for this query PC704.

[Screenshot: IBM search results for the query PC704]

The screen shot is difficult to read, but there is one result–a reference in an IBM technical manual. Let us click on the link. We get a link to a manual about storage sub systems. I know that IBM discontinued the PC704, but the fact that there is no archive of technical information about this system is only slightly less baffling than the link to the storage documentation.

Let’s try another query. Navigate to http://www.ibm.com. We are greeted with a different splash screen with an option to “sign in” and a search box. Let’s run a new query “text mining”. The system responds with a laundry list of results. The first five hits are primarily research documents. The second page of the results has links to two IBM text mining systems, IBM TAKMI and IBM Text Mining Server. TAKMI is another research link and the Text Mining Server is on the IBM developer Web site.

I don’t know about you, but I received one hit for PC704 and quite a few research hits for text mining. Where is the product information?

Let us persist. I know that IBM had a product called WebFountain. I want information about that product. I enter the single word, WebFountain, and the IBM system responds with 152 results. The documentation links figure prominently as well as pointers to information about a WebFountain appliance and architecture for a large-scale text analytics system.

Result 13 seemed to be on target. Here is what the Planetwide system showed me:

IBM – WebFountain – United States
WebFountain is a new text analytics technology from IBM’s Research division that analyzes millions of pages of data weekly.
URL: http://www-304.ibm.com/jct03004c/businesscenter/vent…

And here is the Web page this link displays.

[Screenshot: the Web page displayed by the WebFountain result link]

Stepping Back

What have these three queries revealed?

  1. Despite the cratering of prices for storage devices, IBM does not maintain an archive of information about its older systems. The single hit for the string PC704 was to a book about storage. The string PC704 probably appears in this technical manual, but the system’s precision and recall disappointed me.
  2. The second query for text mining generated more than 3,000 hits. My inspection of the results suggested to me that IBM was indexing technical information. Some of the documents appeared to be as old as the PC704 that was not available in the index. The results provided no context for the bound phrase, and to me the precision was unsatisfactory. Recall was better than the single hit for PC704, however. To me, irrelevant hits are not much better than one hit.
  3. The third query for an IBM product called WebFountain generated hits to research reports, documentation, and a Web site about WebFountain. Unfortunately, the link was active but there were no data displayed on the Web page.
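The precision and recall judgments above can be made concrete with a little arithmetic. The sketch below is illustrative only: the document identifiers and relevance judgments are invented, not taken from the IBM result lists, and `precision_recall` is a hypothetical helper name.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant hits returned / all hits returned
    recall    = relevant hits returned / all relevant documents that exist
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Invented case: one hit comes back, and it is not one of the
# four documents a searcher would actually want.
one_bad_hit = precision_recall(
    ["storage-manual"], ["spec", "memory", "bios", "faq"]
)

# Invented case: ten hits come back, two of them useful.
noisy = precision_recall(
    ["r1", "r2", "r3", "r4", "r5", "r6", "r7", "r8", "spec", "faq"],
    ["spec", "memory", "bios", "faq"],
)

print(one_bad_hit)  # (0.0, 0.0)
print(noisy)        # (0.2, 0.5)
```

In the second invented case recall improves, but eight of every ten hits are noise, which is the trade-off the second observation describes.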

All in all, IBM’s Planetwide search is pretty lousy for me. Your mileage may vary, of course.


Management Views Search as a Side Issue

June 25, 2008

Dave Valiante’s “Enterprise Search a High Priority for Most Users, But Not for Companies” is an important essay. You can read it on the Wall Street Technology Web site here. The url is a tricky one: www.wallstreetandtech.com.

He reports on a study that says “many businesses [are] unaware of the importance of findability.” His write up contains a number of interesting statistics from the report based on a survey of 500 business users. The AIIM study triggered a flurry of news items about user dissatisfaction with search, but Mr. Valiante’s essay digs a bit deeper into the results.

The one finding that jumped out at me was:

The survey states that most organizations do not have a strategic approach for enterprise search and shows that 49 percent of respondents have “no formal goal” for enterprise findability within their own organizations.

What a remarkable finding. With search an essential first step in performing work today, the idea that organizations have “no formal goal” is intriguing. Let’s assume that the finding is spot on. Half of the organizations surveyed view search and retrieval as a non-issue. If true, this explains why point solutions for customer support, litigation support, and business intelligence sell throughout an organization. Licensees are either not interested in systems already installed or, even more likely, indifferent to them as long as they get a system that meets very specific needs. Silos are not aberrations. Isolated systems, often containing content already processed by another system in the organization, are standard operating procedure.

No wonder an organization’s information technology department often shows little enthusiasm for a search or content processing system. With systems flowering, existing technical resources may be stretched to the limit. Another related thought I had, again assuming the finding is accurate, is that vendors have little incentive to change their marketing and sales strategies. A vendor can jump from market sector to market sector looking for customers who have a specific problem.

My research reveals user dissatisfaction with search and retrieval. The information in Mr. Valiante’s write up tells me that dissatisfaction is likely to be the norm in many organizations until management understanding matures. Agree? Disagree? Use the comment sections to share your views.

Stephen Arnold, June 25, 2008

SharePoint and Lotus Notes: Deeper and Wider Challenges

June 24, 2008

Oliver Marks’s “Microsoft Office SharePoint Server: A Next Generation of Deeper, Wider Content Silos?” stopped me in my tracks. You may want to read the complete essay here. Mr. Marks has done a nice piece of work. The seed from which this analysis germinated was a discussion at the Enterprise 2.0 conference during which Microsoft and IBM each demonstrated their respective products, SharePoint and Lotus Notes.

I have some lame duck experience with both systems, and I have to admit, I am not exactly sure what product category is appropriate for either product. Mr. Marks nailed the issue squarely in two of his observations.

First, with regard to IBM and Lotus Notes, he writes:

…It’s not too hard to see where that supertanker is sailing: over time enterprises whose backbone is Lotus Notes will eventually upgrade to Lotus Connections to take advantage of adequate collaboration capabilities.

Second, he observes:

The road ahead for SharePoint users is less clear. The partners and front end providers for Microsoft Office SharePoint Server (MOSS), which is built on top of Windows SharePoint Services (WSS) continue to build, with some excellent contextual products signing on…Partners are seeing an opportunity to create a view into otherwise impenetrable SharePoint silos.

I agree with his assessment that Microsoft has a number of “disparate products” and uniting them will be interesting to watch.

As I thought about his metaphors “deeper, wider content silos”, several thoughts swirled through my mind.

First, SharePoint and Lotus Notes are what I think of as software that can be dressed in a costume to assume a large number of guises. In the US government, Lotus Notes means email. True, there are “spaces” for shared documents, calendars, and collaboration tools. But email is the fuel that powers many of the agencies with which I am familiar. Lotus Notes is defined by its users. SharePoint can be a document manager, or as one consultant told me “a next-generation operating system for the enterprise”. I think the fuzzy boundaries are a clear indication that both IBM and Microsoft want a class of software that can be sold anywhere, anytime, to do anything. Fuzzy makes it difficult for competitors to pin down exactly what feature set is appropriate for a particular organization. It is like playing cards against a person who can change cards at will.

Second, both SharePoint and Notes create repositories and data stores that can be difficult to normalize, deduplicate, and index so users can find a specific document. I recall a situation in one company where a needed attachment was shared in a workspace. In this particular organization, the originator of the document worked in the unit that was anchored in Notes. Several colleagues were from a group relying on Microsoft Exchange. In the span of seven days between virtual meetings in a shared space, the attachment was copied, modified, emailed, and transferred across and within each of the environments. A query for the document produced an unusable list of “hits”. The only way to find the particular version needed by the group was to inspect each instance manually. Mr. Marks’ “deeper and wider” allusion evoked in my mind a flood of murky brown water in Cedar Rapids. What a problem for residents and what a mess the flood creates.
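A first pass at the deduplication problem described above is to group exact copies by a hash of their content. This is a minimal sketch under stated assumptions: the file names and contents are invented, `group_exact_copies` is a hypothetical helper, and collapsing whitespace and case is only one possible normalization. It finds copies that are identical after normalization; it does not detect modified versions, which is the harder part of the situation I described.

```python
import hashlib
from collections import defaultdict


def group_exact_copies(documents):
    """Group document names whose normalized text is identical.

    `documents` maps a name (for example, a path in a workspace or
    mailbox) to its text content.
    """
    groups = defaultdict(list)
    for name, text in documents.items():
        # Collapse whitespace and case so trivial differences do not matter.
        normalized = " ".join(text.split()).lower()
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        groups[digest].append(name)
    # Keep only groups with more than one member: those are duplicates.
    return [sorted(names) for names in groups.values() if len(names) > 1]


# Invented sample data standing in for copies scattered across systems.
sample = {
    "notes/budget.doc": "Q3 budget  draft",
    "exchange/budget (1).doc": "q3 budget draft",
    "exchange/budget-final.doc": "Q3 budget draft, revised totals",
}
print(group_exact_copies(sample))
```

The revised copy falls outside the duplicate group, so a person still has to decide which version the team needs.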

Finally, the problem of managing information within and across boundaries of polymorphic software systems like SharePoint and Lotus Notes is growing. Most users of these systems do the best they can to create documents and share them. The flood of digital information, combined with users’ willingness to forward copies hither and yon, make changes to attachments, and create local stores with unique file names, is the reality in most organizations. The management tools provided with SharePoint and Lotus Notes have not kept pace with the data management challenge.

What’s my take? I think that both of these systems create time bombs for system administrators; specifically:

  • The cost of figuring out what is in these systems, and then deducing like Sherlock Holmes what is what, is hidden but sucking up scarce resources.
  • The administrative tasks necessary to index and make findable the information in these systems are getting larger and more time consuming by the day. One question that concerns me is, “How do I know I have located each relevant document for this particular matter?” I just do not know what I have missed.
  • The legal vulnerability of organizations with these systems is ratcheting upwards. Email is a challenge. Who has seen what? Where is a particular document? What is the lineage or family tree of a document with an important change?

Agree that SharePoint and Lotus Notes are great as they are? Let me know. Do you believe that these polymorphic systems have some rough edges? Share your viewpoints in the comments section to this Web log.

Stephen Arnold, June 24, 2008

ProQuest Dialog: An Optimistic View

June 24, 2008

I talked with a colleague who described to me Outsell’s view of the ProQuest Dialog deal. I have not seen the report, but you can order a copy here. Note that the research company has as its url outsellinc.com. The url outsell.inc is an automotive company with the same name.

As I understood my colleague, Outsell sees a positive gain. First, libraries–Dialog’s revenue bulwark–will win. ProQuest is a company focused on library information. Second, the buyers got a good deal, paying considerably less than Dialog’s previous owners paid. My recollection is that with each sale of Dialog, the seller lost money. From a high of $400 million, this deal rings in well south of that, probably a fire sale price. Third, users benefit because ProQuest is able to leverage the terabytes of abstracted, bibliographic, full text, and semi-structured data on the Dialog computers.

[Image: Dialog sample record]

This is a snippet of a Dialog Blue Sheet. It makes clear the details about a database. Can you figure this out? If you can, then you can pay as much as $100 per query to access these types of data. The market for this information is inelastic; that is, you can raise the price, but you will not be able to boost revenues. A few people will pay anything to get these data. Most people will go elsewhere.

The Optimist’s View

I can accept a positive spin on the deal. However, there may be some factors that the financial wizards with the sharp pencils may not be able to control:

  1. Libraries are strapped for cash. The life blood of information companies who sell to libraries is the standing order. Under budget pressure, each year standing orders get closer scrutiny. In head to head competitions, clever pricing deals can win the one year deal. The problem is that there is no renewal and thus no revenue base. With only a few big players selling online information, it may be tough to pump up revenues.
  2. Information users like the New South Wales’ students who will be exposed to Google may find the Dialog-style online information as archaic as my son did when I introduced him to online research in lieu of the Readers’ Guide to Periodical Literature in 1982. Dialog is simply not in tune with the bloggy, real timey, and Webby world of online.
  3. ProQuest and its parent have never been at the cutting edge of technology. Online today has to deal with scaling, commodity hardware, and fast cycle programming. Perhaps ProQuest’s technologists are as good as the engineers at Google? My thought is that ProQuest may be stretched to the limit dealing with an online system with its roots in the late 1960s. The cost to get modern may be beyond the reach of the new owners.


Google Expands Footprint in Australian School District

June 24, 2008

Six months ago, a Microsoft search wizard told me, “Google’s education sales are not significant.” I disagreed, but present and former Redmond wizards are always right or at least think they are right. Google’s efforts to get students and academic institutions to use the company’s cloud-based services like Gmail are part of the GOOG’s strategy to penetrate the enterprise. But it is a longer-term strategy because Google is willing to get students comfy with its products and services, let them get their jobs, and then have them pull Google along with them.

Australian IT reported on June 24, 2008, that Google landed a major school district with the help of its partner, SMS Management and Technology. You can read the full story written by Andrew Colley here. The New South Wales school district has 1.5 million students, who will soon be Googling. When the system is deployed, it will be one of the largest deployments of Gmail in the world. The deal is worth $9.5 million over three years.

The real pay off, to my way of thinking, will be the students who graduate with Google as part of their thought processes. The Microsoft wizard who told me Google’s education strategy was not worth his time may want to consider what happens if Google succeeds in persuading other school districts to embrace the GOOG.

Stephen Arnold, June 24, 2008

As Google and Salesforce Close Dance This Google Interview Gains Importance

June 24, 2008

In my Google files is an interview with Dave Girouard, the top Googler for the firm’s enterprise division. Mr. Girouard spoke with John Zyskowski of Federal Computer Week in February 2008. The give and take ran under the title “Google’s Dave Girouard on Google-ization.” You can read the story here. When I first read it, I did not see much that resonated with my research.

Now, as Google and Salesforce.com shift from casual dancing partners to going steady, Mr. Girouard’s remarks have a new significance to me. Let me highlight three points that caught my attention against my deeper understanding of what Google and Salesforce.com are offering organizations.

First, the search appliance is not the end game. Other applications have pulled Google deeper into the enterprise. The thought that crossed my mind is that Google’s search appliances connect to other Google enterprise applications. At some point these appliances could be used to create a different type of infrastructure that unites the organization, Google, and the appliance.

Second, Google’s unique selling proposition has two points: cost and capability. As the economy weakens, Google and Salesforce.com are poised to offer increasingly diverse applications that sell themselves. Customers come seeking Google and Salesforce.com. If that model continues to work, incumbent on-premises application vendors will face higher marketing costs and customers who go looking for an alternative.

Third, Mr. Girouard suggests that more cloud-based products and services are coming from Google.

Several questions came to my mind:

  1. Is Google likely to move from close dancing to a more intimate relationship with Salesforce.com? The tie up would make sense and help pave the way for Google to make a play for a larger share of the multi-billion dollar enterprise systems market.
  2. Has Google decided that Salesforce.com’s multi tenant approach to applications has sufficient technical merit to complement Google’s own software and systems? For a period of time, I thought that Google perceived the multi tenant inventions of Salesforce.com as trivial. Maybe I was wrong and multi tenant technology is indeed a big deal for Google.
  3. Will Salesforce.com’s existing customer base be candidates for Google’s data management services? My reading suggests to me that Google has more data base horsepower than it talks about.

With data management looming as one of the major challenges for organizations, Salesforce.com might be a useful stalking horse. Read the interview and let me know your thoughts.

I tried to get Google to participate in my Search Wizards Speak series, but like Autonomy and Microsoft, my request fell on deaf ears. Could there be some reluctance to let me probe into such matters as data management? I can only formulate my opinions based on chunks of information such as this Federal Computer Week interview.

Stephen Arnold, June 24, 2008

Megaputer: An Emerging Force in Data and Text Analysis

June 23, 2008

Megaputer, based in Bloomington, Indiana, continues to expand the capabilities of its data and text analysis system. The next release, said Sergei Ananyan, one of the company’s founders, will include a 64-bit version, browser-based reporting, and support for text analysis in multiple languages.

Dr. Ananyan, a Ph.D. in nuclear physics, spoke to ArnoldIT.com and said:

Megaputer keeps developing PolyAnalyst as a powerful and flexible analytic platform, but our real strength derives from the ability to build push-button custom solutions for handling typical tasks in various application domains.

Megaputer was founded in 1994, which makes the company one of the more mature in the data and text analysis fields. The company has landed a number of blue-chip customers in law enforcement, pharmaceuticals, and financial services.

As organizations realize that individual users and work units require customized content processing systems, Megaputer’s approach has been attracting attention. Megaputer can deploy its range of analytic tools to meet the needs of different users without having to do the manual coding and hands-on rework that plague many of the firm’s competitors.

The company, however, is anchored in mathematics and quite advanced algorithms. Dr. Ananyan says:

We value math, and I suppose we share that technical foundation with Google. So, okay, we are good at math just like Google but with one difference. I think we are specialists in the type of math necessary to make Megaputer solve our clients’ problems.

The key to success, says Dr. Ananyan:

While providing users of PolyAnalyst with lots of functionality, we try to lower the learning curve for new users. We spend lots of thought and effort on keeping PolyAnalyst as simple in use as possible. We make every effort to simplify the user experience with the system. The user builds a data analysis scenario through an intuitive drag-and-drop interface. The developed scenario is represented as a graphical flow chart with editable nodes and can be shared for collaboration or scheduled as a task for future execution. The results of any analytical step can be saved in an easy-to-comprehend and visually appealing report the user generates on the fly.

Megaputer has several advantages compared to some vendors who provide a specific text processing function:

  • The company’s technology suite is broad and deep, supporting on-the-fly categorization, ease-of-use, and versions for single user and on premises enterprise installations
  • The strong foundation in mathematics does not get in the way of the users due to careful design of the system interfaces
  • The inclusion of data cleansing, federation, and visualization functions allows the system to meet a range of needs without forcing licensees to seek add-ins or third-party utilities.

You can learn more about Megaputer here. The full text of the interview with Dr. Ananyan appears on ArnoldIT.com here as part of the Search Wizards Speak series.

Stephen Arnold, June 23, 2008
