Microsoft’s SharePoint in a Post Chrome World

September 17, 2008

CNet ran an interesting story on September 9, 2008 with the fetching title “Microsoft’s Response to Chrome. SharePoint.” The author was Matt Asay, a fellow whose viewpoint I enjoy. For me, the key point of this article, which you can read here, was:

Microsoft, then, has not been sitting still, waiting to be run over by Google. It has been quietly spreading SharePoint throughout enterprises. SharePoint opens up enterprise data to Microsoft services, running in Microsoft’s browser. Unlike Google, however, Microsoft already has an impressive beachhead in the enterprise. It’s called Office, and most enterprises are addicted to it. In sum, if Google is aiming for Windows, it’s going to lose, because the table stakes are much higher. For Microsoft, the game is SharePoint. For the rest of the industry, including Google, the response needs to be content standardization.

The battle between Google and Microsoft pivots on content. SharePoint is Microsoft’s content standardization play. I think this argument is interesting, but a handful of modest issues nagged at me when I read the article:

  1. SharePoint is a complicated collection of “stuff”. You can check out the SharePoint placemat here. Complexity may be the major weakness of SharePoint.
  2. SharePoint search is a work in progress. Even if you have lots of standardized content, I find the native SharePoint search function pretty awful. I find it even more awful when I have to configure it, chase down aberrant security settings, and mud wrestle SQL Server performance. I think this is an iceberg issue for Microsoft. The marketing shows the top; the tech folks see what’s hidden. It’s not pretty.
  3. Google’s approach to content standardization is different from the SharePoint approach Mr. Asay describes. The GOOG wants software to transform and manipulate content. The organization can do what it wants to create information. Googzilla can handle it, make it searchable, and even repurpose it with one of its “publishing” inventions disclosed in patent documents.

I hear Mr. Asay. I just don’t think SharePoint is the “shields up” that Microsoft needs to deal with Google in the enterprise. Agree? Disagree? Help me learn, please.

Stephen Arnold, September 10, 2008

Google Solves One Asia Pacific Telco Problem

September 17, 2008

In early 2008, one of the firms with whom I work set up a series of Google telco briefings. These were quite interesting for me, but I think the telco executives were baffled by Google’s long history of telco-related inventions. The company nailed a quality of service invention a year after opening its doors. Yes, telco has been on Google’s very big brain for almost a decade, maybe longer.

A story, largely ignored by the trade journals, appeared on TelecomAsia.net that reported Google’s progress on what one telco executive told me was, and I am quoting from memory, “An almost impossible problem for the best minds in the telephone industry and almost certainly beyond Google’s capabilities.”

Well, the telco executive–not surprisingly–seems to have been incorrect if the TelecomAsia.net story is accurate. You can read “Google-Backed LEOsat IP Backhaul Project Is Go” by John C. Tanner by clicking this link. I verified this link at 10 pm Eastern on September 9, 2008, but some of these news sites roll off their content in order to protect their interests. (Your interests, dear reader, don’t count.)

The telco double talk is tough to penetrate. Let me simplify. Google is getting into the telco business in Asia. You can dig through the details, which Mr. Tanner does an excellent job of presenting.

Let me offer several comments:

  1. Google doesn’t seem to be particularly concerned about getting into the high speed connection business in the Asia Pacific region.
  2. The “problems” appear to be solved. Just as Google “owns” its own high resolution satellite for geospatial imagery, Google owns its own undersea cables.
  3. Telco assumptions about Google remain shallow.

My question is, “Who’s going to regulate Google outside the US and across the region empowered by the GOOG’s new backhaul initiative?” Any ideas? The World Court? The UN? There are two stellar outfits ideally positioned to understand the whys and wherefores of Googzilla.

Stephen Arnold, September 10, 2008

How Smart Is Google’s Software?

September 17, 2008

When you read this, I will have completed my “Meet the Guru” session in Utrecht for Eric Hartmann. More information is here. My “guru” talk is not worthy of its name. What I want to discuss is the relationship between two components of Google’s online infrastructure. This venue will mark the first public reference to a topic I have been tracking and researching for several years–computational intelligence. Some background information appears in the Ignorance Is Futile Web log here.

I am going to reference my analysis of Google’s innovation method. I described this in my 2007 study The Google Legacy, and I want to mention one Google patent document; specifically, US20070198481, which is about fact extraction. I chose this particular document because it references research that began a couple of years before the filing and the 2007 granting of the patent. It’s important in my opinion because it reveals some information about Google’s intelligent agents, which Google references as “janitors” in the patent application. Another reason I want to highlight it is that it includes a representation of a Google results list as a report or dossier.

Each time I show a screen shot of the dossier, any Googlers in the audience tell me that I have Photoshopped the Google image, revealing their ignorance of Google’s public patent documents and of the lousy graphical representations Google routinely places in its patent filings. The quality of the images and the cute language like “janitors” are intended to make it difficult to figure out what Google engineers are doing in the Google cubicles. Any Googlers curious about this image (reproduced below) should look at Google’s own public documents before accusing me of spoofing Googzilla. This now happens frequently enough to annoy me, so, Googlers, prove you are the world’s smartest people by reading your own patent documents. That’s what I do to find revealing glimpses such as this one, displayed for a search of the bound phrase “Michael Jackson”:

[Image: Google patent document figure showing a dossier-style result for “Michael Jackson”]

The highlight boxes and call outs are mine. What this diagram shows is a field (structured) report or dossier about Michael Jackson. The red vertical box identifies the field names of the data and the blue rectangle points your attention to the various names by which Michael Jackson is known; for example, Wacko Jacko.

Now this is a result that most people have never seen. Googlers react to this in shock and disbelief because only a handful of Google’s more than 19,000 employees have substantive data about what the firm’s top scientists are doing at their jobs. I’ve learned that 18,500 Googlers “run the game plan”, a Google phrase that means “Do what MOMA tells you”. Google patent documents are important because Google has hundreds of US patent applications and patents, not thousands like IBM and Microsoft. Consequently, there is intent behind funding research, paying attorneys, and dealing with the chaotic baloney that is the specialty of the USPTO.

Read more

STR: More and Better Self Service Business Intelligence

September 16, 2008

When I was at university, the advanced statistics course meant learning SAS. I remember my feeling when I finished the course. I had been beaten into a “SAS person.” Today, some university graduates don’t want to wrestle with statistics again. Most people, in my opinion, forget the chi squared test of homogeneity after the final exam.

Recognizing that organizations need access to crunched data in a meaningful form, Space Time Research has labored to create self service business intelligence. The company’s strategy seems to be working, and I know that the number herders recognize that a challenge to the SPSS and SAS approach is building.

Space-Time Research is one of the global leaders in self-service business intelligence for government. The company–based in Australia–has offices in the US and the UK. The STR SuperSTAR Platform is an end-to-end solution providing self-service analytics and business intelligence, interactive web publishing, privacy and confidentiality protection, mapping, and visualization. The company has released a new version of its SuperSTAR Platform. The release includes a Data Control Application Programming Interface (API) that provides a ‘plug and play’ approach to privacy and confidentiality mechanisms. The API allows use of STR integrated techniques, accepted protection and confidentiality products, or custom confidentiality rules. These techniques and rules are applied to ad-hoc queries on unit record and aggregate data when a request for information is processed. You can read ARnet.com’s take on the new version here. The MarketWatch write up is here.

You can download a two page brochure that provides more information about the self service interface. Click here. You may have to register to get the download to work. Take a look. Cloud-based business intelligence is going to gain importance. More information about STR is available at the company’s Web site here.

Stephen Arnold, September 16, 2008

Google’s Sky Darkens with Wings of Legal Eagles

September 16, 2008

The European Union has shifted its laser beams of investigation to Google. Microsoft must be chortling with the news. You have many ways to get the inside scoop on this inquiry. I liked Silicon.com’s summation. You can read “EU to Probe Yahoo!-Google Advertising Tie Up” here. For me the most important point in the story was this comment:

The Commission spokesman said there was no deadline for the investigation in Brussels.

I don’t know much about law in general or EU inquiries in particular, but when I saw the words “no deadline”, I thought, “Yikes, probers can poke around for months, even years.” When governmental agencies gear up for a “no deadline” inquiry, the likelihood of uncovering mountains of information that can be interpreted in many different ways becomes a certainty.

My thinking is that Google might conclude the Yahoo deal is too much hassle and walk away. Let’s assume this happens.

First, I think the EU will keep its lasers on Mr. Google. The group working to gather information won’t go gently into that good night. At this point, I think the EU will keep on probing and sifting no matter what Google does with regard to Yahoo. Committees can find many interesting issues to weigh and then measure against applicable guidelines, regulations, and laws. Therefore, it’s open season on Mr. Google for the foreseeable future.

Second, if Google leaves Yahoo at the altar, what will Yahoo do? Its bold play to get hackers to generate revenue appeals to my teenage self. But the 65 year old side of that self thinks, “Yahoo may be pushed off the cliff and into the clutches of gravity.” The “gravity” to which I refer is the pre-crash 2000 notion of “zero gravity” Web companies. I think Isaac Newton and his mythical apple remind me of what may happen to Yahoo unless a fairy godmother rescues the company. Yahoo costs are tough to control, and the loss of Google revenue may be too much for the Yahooligans to bear.

I see the EU investigation as a turning point for Google and possibly for Yahoo. What do you think? Mr. Google wows Brussels. Yahoo surges when cut free. Let me know because I see Google’s sunny day occluded by the wings of legal eagles.

Stephen Arnold, September 16, 2008

Chrome: Full Metal Jacket Ecosystem

September 16, 2008

Economic Times, September 12, 2008, reported that Google’s Sergey Brin sees Chrome as a challenger to Microsoft Windows. The story “It’s Not Just IE, Google Is Eyeing Windows’ Desktop Pie Too” by Stephen Wildstrom (Business Week) is here. For me the key part of the article is that it puffs up a beta browser into a big bazooka. The article chastises Google for a flawed initial effort. I agree. But the most important statement attributed to Sergey Brin, one of Google’s founders, was:

“What we want is a diverse and vibrant ecosystem…We want several browsers that are viable and substantial choices.”

Let’s take this at face value. Why will the existence of multiple browsers help Google achieve its objective?

  1. What’s the rush? Internet Explorer and Firefox have market share. Google is sufficiently realistic about the speed of migration from one browser to another. Google is taking a long view.
  2. The Google browser is not a browser. I know this is a different position from the millions of words written about Chrome. My research suggests that Chrome is a way for Google to bring control to certain applications and operations; namely, an icon on the desktop that launches a cloud based service. To the user, there’s no browser present.
  3. Google’s patent documents for the Programmable Search Engine suggest that Google will build its own data stores from bits and pieces of existing data. If this is an accurate reading of the PSE February 2007 patent applications, Google wants to become the semantic Web and probably “the Internet”. Chrome is a puzzle piece, not the solution to the puzzle.
  4. Chrome adds steroids to some 95 pound weakling issues with Google’s current enterprise offerings. Think “air lock” between the organization and the Google cloud.

Agree? Disagree? Send me your facts.

Stephen Arnold, September 16, 2008

Infobright: Sun Sees the Light

September 16, 2008

I wrote about Infobright in May 2008. Predictably, mainstream trade publications and most technical Web logs ignored the story. The idea of figuring out rough sets and applying their mathematics to data storage is less exciting than writing about Google and Microsoft. You can read about the Warsaw connection here.

Today news reached me in the Netherlands that Sun Microsystems has pumped $10 million into Infobright. I also learned from Network World’s Chris Kanaracus that

Infobright is adding the open-source Community Edition to its existing enterprise offering. The latter product provides features such as faster data-loading, support for text and binary loading, a product warranty and indemnification.

Data management is emerging as a top concern. Forget search. Unless you can corral the proliferating digital information, finding information is quite difficult. Why is this important? Sun seems to be reacting to Google’s increasing scope. I anticipate more Sun moves that will help the company respond to Google’s increasing appetite for the enterprise.

Stephen Arnold, September 16, 2008

Yahoo Open: Why the Odds Don’t Favor Yahoo

September 16, 2008

When we started The Point (Top 5% of the Internet) in 1993, our challenge was Yahoo. I recall my partner Chris Kitze telling me that the Yahoo vision was to provide a directory for the Internet. Yahoo did that. We sold The Point to Lycos and moved on. So did Yahoo. Yahoo became the first ad-supported version of America Online. The company also embarked on a series of acquisitions that permitted each unit to exist as a tiny fiefdom within the larger “directory” and emerging “ad-supported” AOL business. In the rush to portals and advertising, Yahoo ignored search and thus began its method of buying (Inktomi), licensing (InQuira), or acquiring with a buy out (Flickr) different search engines. Google was inspired by the Overture ad engine. Yahoo surveyed its heterogeneous collection of services, technologies, and systems and ended up the company it is today–an organization looking to throw a Hail Mary pass for the game winning touchdown. That strategy won’t work. Yahoo has to move beyond its Yahooligan approach to management, technology, and development.


The ArnoldIT.com and Beyond Search teams have had many conversations about Yahoo in the last year. Let me summarize the points that keep a lid on our enthusiasm for Yahoo and its present trajectory:

  1. Code fiddling. Yahoo squandered an opportunity to make the Delicious bookmarking service the dominant player in this segment because Yahoo’s engineers insisted on rewriting Delicious. Why fiddle? Our analysis suggests that Yahoo’s engineers don’t know how to take a hot property, scale it, and go for the jugular in the market. The approach is akin to recopying an accounting worksheet by hand because it is just better when the worksheet is perfect. Wrong.
  2. Street peddler pushcart. Yahoo never set up a method to integrate tightly each acquisition the company made. I recall a comment from a person involved in GeoCities years ago. The comment was, “Yahoo just let us do our own thing.” Again this is not a recipe for cost efficiency. Here’s why: The Overture system when acquired ran on Solaris with some home grown Linux security. Yahoo bought other properties that were anchored in MySQL. Then Yahoo engineers cooked up their own methods for tests like Mindset. When a problem arose, experts were in submarines and could not really help with other issues. Without a homogeneous engineering vision, staff were not interchangeable and costs remained tough to control. The situation is the same as when my mother bought a gizmo from the street peddler in Campinas, Brazil. She got a deal, but the peddler did not have a clue about what the gizmo did, how it worked, or how to fix it. That’s Yahoo’s challenge today.
  3. Cube warfare. Here’s the situation that, according to my research, forced Terry Semel to set up a sandwich management system. One piece of bread was the set of technical professionals at Yahoo. The other piece of bread was Yahoo top management. Top management did not understand what the technical professionals said, and when technical professionals groused about other silos at Yahoo, Mr. Semel put a layer of MBAs between engineers and top management to sort out the messages. It did not work, and Yahoo continues to suffer from spats across, within, and among the technical units of the company. It took Yahoo years to resolve owning both Flickr and Yahoo Photos. I still can’t figure out which email system is which. I can’t find some Yahoo services. Shopping search is broken for me. An engineer here bought a Yahoo Music subscription service for his MP3 player. Didn’t work from day one, and not a single person from Yahoo lifted a finger, not even the one tracked down via IRC. I created some bookmarks and now have zero idea what the service was or where the marks are located. It took me a year to cancel the billing for a Yahoo music service a client paid me to test. (I think it was Yahoo Launch. Or Yahoo Radio. Or Yahoo Broadcast. Hard to keep ’em straight.) Why? No one cooperates. Google and Microsoft aren’t perfect. But compared to Yahoo, both outfits get passing grades. Yahoo gets to repeat a semester.

When I read the cheerleading for Google in CNet here or on the LA Times’s Web log here, I ask, “What’s the problem with nailing Yahoo on its deeper challenges?” I think it’s time for Yahoo to skip the cosmetics and grandstanding. With the stock depressed, Yahoo could face a Waterloo if its Google deal goes south. Microsoft seems at this time to be indifferent to the plight of the Yahooligans. Google is cruising along with no significant challenge except a roadblock built of attorneys stacked like cord wood.

Yahoo is a consumer service. The quicker it thinks in terms of consumerizing its technology to get its costs in line with a consumer operation, the better. I’m not sure 300 developers can do much for the corrosive effects of bad management and a questionable technical strategy. Maybe I’m wrong? Maybe not? We sold The Point in 1995 and moved on with our lives. Yahoo, in my opinion, still smacks of the Internet circa 1995, not 2008 and beyond.

Stephen Arnold, September 16, 2008

Extending SharePoint Search

September 15, 2008

Microsoft SharePoint is a widely used content management and collaboration system that ships with a workable search system, which I’ll refer to as ESS, for Enterprise Search System. But for expanded scale and customization, you’ll want to look to third-party systems for help.

SharePoint has reduced the time and complexity of customizing result pages, handling content on Microsoft Exchange servers, and accessing most standard file types. In our tests of SharePoint, ESS does a good job and offers some bells and whistles, like identifying individuals whose content suggests they are knowledgeable about a specific topic. Managing crawls or standard index cycles is point and click, SharePoint is security aware, and customization is easy. But licensees will hit a “glass ceiling” when indexing upwards of 30 million documents. To provide a solution, Microsoft purchased Fast Search & Transfer. Microsoft has released a Fast Search Web part to make integration of the FAST Enterprise Search Platform, or ESP, easier. The SharePoint FAST ESP Web part is located on Microsoft’s CodePlex Web site, and the documentation can be obtained here.

But licensing Fast ESP can easily soar above $250,000, excluding customization and integration service fees, making it a major investment to deliver acceptable search-and-retrieval functionality for large, disparate document collections. So what can a SharePoint licensee do for less money?

The good news is that there are numerous solutions available. These range from open source options such as Lucene and FLAX to the industrial-strength Autonomy IDOL (intelligent data operating layer), which can cost $300,000 or more before support and maintenance fees are tacked on.

Third-party systems can reduce the time required to index new and changed documents. One of the major reasons for shifting from the ESS to a third-party system is a need to provide certain features for your users. Among the most-requested functions are deduplication of result sets, parametric searching/browsing, entity extraction and on-the-fly classification, and options for merging different types of content in the SharePoint environment. The good news is that there are more than 300 vendors with enterprise search systems that to a greater or lesser degree support SharePoint. The bad news is that you have to select a system.

Switching Methodology

Each IT professional with Microsoft certification knows how to set up, configure, and maintain SharePoint and other “core” Microsoft server systems. Let’s look at a methodology for replacing the native SharePoint search with ISYS Search Software’s ISYS:web. ISYS is one of a half-dozen vendors offering so-called “SharePoint Search” capabilities.

Here’s a rundown of a procedure that minimizes pitfalls:

  1. Set up a development server with SharePoint running. You don’t need to activate the search services. This can be on a computer running Windows Server 2003 or 2008. Microsoft recommends at a minimum a server with dual CPUs, each running at least 3 GHz, and 2 GB of memory. Also necessary for installation are Internet Information Services (IIS, along with its WWW, SMTP, and Common Files components), version 3.0 or greater of the .NET Framework, and ASP.NET 2.0. A more detailed look at these requirements can be found here.
  2. Populate the development server with several folders containing documents and content representative of what you will be indexing.
  3. Install ISYS:web 8 on the machine running SharePoint.
  4. Work through the configuration screens, noting the information required to add additional content repositories to index. An intuitive ISYS Utilities program will let you configure SharePoint indexes.
  5. Launch the ISYS indexing component. Note the time indexing begins and ends. You will need these data in order to determine the index build time when you bring the system up for production.
  6. Run test queries on the indexed content. If the results are not what you expect, make a return visit to the ISYS set up screens, verify your choices, delete the index, and reindex the content collection. Be sure to check that entities are appearing in the ISYS display.
  7. Open the ISYS results template so you can familiarize yourself with the style sheet and the behind-display controls.
  8. Once you are satisfied that the basics are working, verify that ISYS is using security flags from Active Directory.

At this point, you can install ISYS on the production server and begin the process of generating the master index. Image files for the ISYS installation are available from ISYS. These include screen shots illustrating how to set up the ISYS index.
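The start and end times noted in step 5 feed directly into planning the production build. A back-of-the-envelope calculation, using hypothetical figures of my own, might look like this:

```python
from datetime import datetime

def indexing_rate(start, end, doc_count):
    """Documents per hour, from the start and end times noted in step 5."""
    hours = (end - start).total_seconds() / 3600.0
    return doc_count / hours

def estimated_build_hours(rate_docs_per_hour, production_doc_count):
    """Rough time to build the master index on the production server."""
    return production_doc_count / rate_docs_per_hour

# Hypothetical test run: 50,000 documents indexed in two hours.
start = datetime(2008, 9, 15, 9, 0)
end = datetime(2008, 9, 15, 11, 0)
rate = indexing_rate(start, end, 50_000)       # 25,000 documents per hour
print(estimated_build_hours(rate, 1_000_000))  # 40.0 hours for a million documents
```

Scale the test collection so it is representative of the production content mix; OCR-heavy or image-laden repositories index far more slowly than clean Office files, so a rate measured on one does not transfer to the other.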

Some Gotchas to Avoid

First, when documents change, the search system must recognize that change, copy or crawl the document, and make the changed document available to the indexing subsystem. The new index entries must be added to the main index. When a slowdown occurs, check the resources available.

Second, keep in mind that new documents must be indexed and changed documents have to be reindexed. Setting the index update at too aggressive a level can slow down query processing. Clustering can speed up search systems, but you will need to allocate additional time to configure and optimize the systems.

Third, additional text processing features such as deduplication, entity extraction, clustering, and generating suggestions or See Also hints for users suck computing resources. Fancy extras can contribute to sluggish performance. Finally, trim the graphical bells and whistles. Eye candy can get in the way of users getting the information they require quickly.
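The second gotcha is really arithmetic: each incremental update must finish before the next one is due, or the update queue backs up and query processing suffers. A rough sanity check, with hypothetical figures of my own, could be sketched as:

```python
def update_cycle_ok(docs_changed_per_day, rate_docs_per_hour, update_interval_hours):
    """True if an incremental update can finish before the next one is scheduled.
    Plug in the indexing rate measured during your own test runs."""
    docs_per_cycle = docs_changed_per_day * update_interval_hours / 24.0
    cycle_hours = docs_per_cycle / rate_docs_per_hour
    return cycle_hours < update_interval_hours

# 240,000 changed documents a day with hourly updates:
print(update_cycle_ok(240_000, 25_000, 1))  # True: each cycle needs about 0.4 hours
print(update_cycle_ok(240_000, 2_000, 1))   # False: each cycle needs 5 hours
```

If the check fails, either lengthen the update interval or accept stale results; cranking the interval down further only makes the backlog worse.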

To sum up, SharePoint ships with a usable search-and-retrieval system. When you want to break through the current document barrier or add features quickly, you will want to consider a third-party solution. Regardless of the system you select, set up a development server and run shakedowns to make sure the system will deliver the results the users need.

Stephen Arnold, September 15, 2008

Google and ProQuest

September 15, 2008

The Library Journal story “ProQuest and Google Strike Newspaper Digitization Deal” puts a “chrome” finish on a David and Goliath story. Oh, maybe that is ProQuest and Googzilla? In the story my mother told me, David used a sling to foil the big, dumb Goliath. With some physics, Goliath ended up dead. You need to read Josh Hadro’s version of this tale here.

The angle is that Google will pay UMI–er, ProQuest–to digitize. For me the most important paragraph in the story was:

The deal leaves significant room for ProQuest to differentiate its Historical Newspapers offering, which contain such publications as the New York Times and Chicago Tribune, as a premium product in terms of added editorial effort and the human intervention required to make its selectively scanned materials more discoverable and useful to expert researchers. In contrast to scanning by Google, editors hired by ProQuest check headlines, first paragraphs, captions, and more to achieve their claim of “99.95 percent accuracy.” In addition, metadata is added along with tags describing whether the scanned content is an article, opinion piece, editorial cartoon, etc. Finally, ProQuest stresses that the agreement does not affect long-term preservation plans for the microfilm collection. “Microfilm will always be the preservation medium…”

Three thoughts:

  1. Commercial databases are starting to face rough water. Google, though not problem free, faces rough water with a nuclear powered stealth war craft. UMI–er, ProQuest–has a birch bark canoe.
  2. Once the data are in the maw of the GOOG, what’s the outlook for UMI–er, ProQuest? In my opinion, this is a short term play with the odds in the mid and long term favoring Google.
  3. Will the Cambridge Scientific financial wizards be able to float the Dialog Information Services boat, breathe life into library sales, and make the “microfilm will always be the preservation medium” a categorical affirmative? In my opinion, the GOOG has its snoot in the commercial database business and will disrupt it, sending incumbent leaders into a tizzy.

Yes, and the point about David and Goliath. I think Goliath wins this one. Agree? Disagree? Help me learn. Just bring facts to the party.

Stephen Arnold, September 15, 2008
