SharePoint SDK Updates

April 26, 2009

Andrew Connell informed me here that Microsoft has released “the April 2009 refresh (v1.5) of the downloadable version (CHM files) of the Windows SharePoint Services 3.0 & Office SharePoint Server 2007 SDKs.” You can read his write up here. A happy quack to Mr. Connell for the download links as well.

The WSS SDK includes:

  • Expanded documentation of backup and restore features   This release contains greatly expanded documentation of backup and restore features, including a new top-level node, “Backing Up and Restoring.” The node includes twelve articles, including “Overview of Backing Up and Restoring Data in Windows SharePoint Services,” and four new How To topics.
  • Complete documentation of Microsoft.SharePoint.Administration.Backup   Object model reference documentation in the Microsoft.SharePoint.Administration.Backup namespace is complete, and code samples are provided for all critical types and members.
  • New documentation of the administrative object model   A new section, “The Administrative Object Model of Windows SharePoint Services 3.0,” contains six new articles, and the “Administration” section has a new, extended code sample.
  • Revised Web Part documentation   The section that provides conceptual documentation of Web Parts has been completely restructured, and two walkthrough topics have been significantly revised and rewritten.
  • More migration support   A new section, “Selective Content Migration,” contains three articles to support selective migration strategies. Additionally, additions and revisions have been made to existing topics in the “Content Migration Overview” section, and a large number of API reference topics that support migration and deployment scenarios have been completed in the SharePoint.Deployment namespace.
  • Expanded and updated reference documentation   You can find enhanced documentation of types and members in the SharePoint.Workflow and SharePoint.WorkflowActions namespaces, the People Web service, and three ActiveX controls.

I did not write this golden prose. Credit MSFT here.

Stephen Arnold, April 27, 2009

Wolfram Alpha: Google in No Danger

April 26, 2009

ReadWriteWeb.com’s review of the Wolfram Alpha search system is here. The name Wolfram Alpha includes a pipe symbol, which I am not going to include in this write up. Indexing systems have enough trouble with words. Including a vertical bar is one of those marketing things that annoy me. Booz, Allen & Hamilton in the 1970s replaced the comma between “Booz” and “Allen” with a dot. What a headache. No vertical bar for me, thanks.

The headline on the write up was “Our First Impressions.” In sales and search, first impressions matter. The problem is that each search engine is different, and the methods required to get the results a user wants take time, experimentation, and often industrial strength testing.

ReadWriteWeb.com’s Frederic Lardinois participated in a Web demo and concluded:

It will definitely not be a Google killer.

He continued:

Alpha is built around a vast repository of curated data from public and licensed sources. Alpha then organizes and computes this knowledge with the help of sophisticated Natural Language Processing algorithms. Users can ask Alpha any kind of question…

Alpha taps into a buzzword that, like a summer tornado, has been gathering momentum before it hits the trailer park of content that makes up much of the information available on the public Internet. To duck into the storm cellar, Alpha focuses on sources that are “curated”. My impression, which is not even based on a demo, is that Alpha is in the “deep Web” business. The idea is that there are some useful sources which may be tough to index with a general purpose indexing system like that used by Microsoft Live.com or Yahoo.

When results appear, the system attempts to answer the implicit or explicit question. AskJeeves.com focused on this angle in the late 1990s, but quickly ran aground due to editorial costs and the reluctance users had to ask questions the template system could answer; for example, “What’s the temperature in Chicago?” worked. Questions like “What is the IBM patent for RDBMS?” did not work.

ReadWriteWeb.com pointed out that there will be a free and a for fee version, alerts, and a way to “embed” Alpha into other applications. According to ReadWriteWeb.com, the demo “mostly focused on math and engineering data, so we’ll still have to wait and see how Alpha copes with questions about historical events, for example.”

Let me make several observations:

  • There continues to be a hunger for a system that answers questions. Users don’t ask questions, but those in the research and investment business assume or know that users *should* ask questions. The ArnoldIT.com team worked on a couple of question answering systems in the past, and we learned first hand that users want the system to make life easy. The ideal search system just displays information the user needs to know at a particular point in time. Today the predictive mobile search systems that hook into a context clue like a geographical position seem to be closer to that ideal than a search box.
  • Questions pose big problems in ambiguity. There are many ways to serve this fish. Google has its PageRank core wrapped in add on methods to give you a rock star when you type “spears” or “idol”, not Macedonian weapons. The proof of the disambiguation will emerge not from a demo or from impressions but from subjective and objective tests. So the fish remain in the stream at this time.
  • The hope for a Google killer is now a catchphrase. The problem with killing Google is that it has morphed into more than search. The Google platform is going to be tough to break up because it is polymorphic, a characteristic I discuss along with fluxion in my new study Google: The Digital Gutenberg here. Google has some interesting data management tools that can deliver answers as well.

To wrap up, the ReadWriteWeb.com story is good. The Alpha system seems interesting. The desire to have a kinder and gentler Web search engine is intense. The challenge will be delivering users who understand an answer and who have the types of questions a system can answer without sucking too much money from its investors until the cash begins to flow.

In the meantime, “Cuil” is no longer the word. The word will be “Alpha” for the pundits until the next Google killer walks out of the computer lab into the world of the average Web user.

Stephen Arnold, April 26, 2009

As Print Fades, Online Readership Grows, Asserts Research Guru Nielsen

April 26, 2009

Leo LaPorte, a radio personality and one-man podcast network, popped into my crawler as the author of “Online Audience Grows for Newspapers” here. I did some clicking and discovered the attribution here on The Long Tail, a Web log by the author of the book by the same name. Another link aimed me at http://www.techfuga.com. I found a version of the story on MSNBC here, but to my dismay the source was the deeply litigious Associated Press. No Leo LaPorte in sight, but his name drew me into a festival of clicking.

And what was the story?

Oh, people are reading more news online. The ultimate source of the data is the stats giant Nielsen. You can read the data, but my thought was, “If newspapers kill off their print editions or make them too expensive, maybe online is the alternative.”

The problem, which the mathematics whizzes at Nielsen don’t address, is that the online business models in use by the dead tree crowd don’t pay the bills.

I wonder if Leo LaPorte knows he is the attributed author of what strikes me as a somewhat obvious study? Maybe he linked to the story? Pretty darned confusing provenance. At least the free Web news search systems worked. I could find the MSNBC story easily.

Stephen Arnold, April 26, 2009

Security: Not If There Is Money for Some Humans

April 26, 2009

ITBusiness.ca ran a story with the eye catching headline “One-Third of Employees Willing to Steal Company Data If the Price Is Right” here. Studies of this type require some mental prudence. I found the write up a useful reminder that humans are the weakest link in a security fence. For me the most interesting comment was:

Research by the security event organizer revealed that of those willing to steal sensitive data, 63 per cent would expect at least £1 million (Can$1.78 million) for their troubles, while 10 per cent want enough to pay off their mortgage.

Now, what about that confidential information secured with industry standard systems? Take out your checkbook?

Stephen Arnold, April 26, 2009

Twitter Power Search

April 26, 2009

I have scanned a flurry of Twitter news stories. A Twitter book is finally here. A San Francisco journalist asserts that there will be many Twitters. An outraged netizen wants no more Twitter.

I ignored these stories to look at a service to which one of my readers alerted me. Navigate to http://www.twitterpowersearch.com and explore Tweets organized by Twitter and other services. I spent some time watching the Tweets flow by. Seemed interesting to me.

Stephen Arnold, April 26, 2009

Rating a Search Engine

April 26, 2009

I am in a bit of a quandary. Martin White and I spent about 10 months writing down what we have learned in our combined 60 years of information, search, content processing, and information management. The result was a monograph that summarized in about 120 pages the method for reducing the likelihood of failure when implementing a search system. You can learn more about Successful Enterprise Search Management here.

I received a link to the article “How Not to Rate a Search Engine” here. I enjoy reading these types of how to’s. You may find some of the tips useful. The phrase that caught my attention was, “As one of my colleagues at Powerset always likes [sic] to remind me: this is rocket science.”

I agree.

Stephen Arnold, April 26, 2009

A Howl of Hyperbole Induced Pain

April 26, 2009

PCAdvisor.co.uk is a Web site that complements a consumer computer magazine in the UK. I recall buying a copy and getting a DVD stuffed full of “editor picks” in shareware, code snippets, and articles from back issues. I saw a reference to the article “10 Things We Hate about Technology Now”. You can read the story here. This is the second “we can’t take it any more” write up in the last 24 hours. Dan Sullivan pointed to the public relations blitz that accompanies a new search engine. My take on his article is here. Now the journalists at PCAdvisor.co.uk are showing signs of stress. I can’t recite the entire list of 10 pain points, but I can point to three items and offer a brief comment about each.

First, the magazine objects to the buzz about Twitter. I don’t agree. Twitter is the poster child for real time search. RTS is novel and not well understood. The outrage makes clear to me that no one on the PCAdvisor.co.uk team thinks about the Twitter messages from the point of view of a person involved in police or intelligence work. Twitter is important. A failure to understand is a problem of analytic intelligence, not Twitter. RTS is not likely to go away quickly.

Second, news releases. Last time I checked PCAdvisor.co.uk it seemed to have its share of recycled news releases in its “news” section. Companies generate news spam and publications gobble up the bits and bytes. Story ideas are often hard to get when a publisher pays low wages and rationalizes staff. Recycling is a big part of the profession of computer journalism. Remember. I worked for one of the big guns in the industry.

Third, Apple and Microsoft. I am combining two items because each illustrates a characteristic of news. High profile companies are high profile because people want their products, need information about the companies, or enjoy keeping pace with the buzz about something that has ubiquity. No company is perfect. Publications have to cover the high profile companies in order to attract eyeballs. A niche publication covers more specialized topics and attracts a smaller, more specialized audience. Writing positive or negative articles about high profile companies is a requirement.

I think the marketing bandwagon has been pimped by West Coast Customs. Marketing is now a Hummer, not a Mini Cooper. The change says more about the nature of unchecked capitalism, the desperation some publishers feel when trying to turn red ink into black ink, and journalists who are short on copy ideas.

Stephen Arnold, April 26, 2009

Oracle Sun: The MySQL Mystery

April 25, 2009

I enjoyed John C. Dvorak’s “Only Oracle Really Knows Why It Bought Sun” here. Mr. Dvorak touches on the many mysteries about the sudden, multi billion dollar deal for a company now setting up for a landing at DTA (the dead technology airport).

In New York yesterday (April 23, 2009), I had one of those famous side conversations after my Google and Publishing lecture. In that conversation, three different New Yorkers, each confident of his or her knowledge and superiority over creatures from any place west of Piscataway, offered their views about this deal.

Let me summarize each, and then offer a comment. I think these observations provide some additional context for Mr. Dvorak’s “second opinion” with which I agree.

First, Oracle wants to stop IBM. I am not sure what “stop” means because the person who offered this viewpoint was in the financial services business. I interpreted this to mean have an alternative to IBM’s offerings in consulting, hardware, database, etc.

Second, Oracle wants to have a lever over Google. The azure chip consultant who floated this idea drew stares from me and the financial wizard. The consultant suggested that Google uses its MapReduce, but the Googlers rely on MySQL for some database tasks. Google, therefore, is a big fan of MySQL. His source allegedly was a NY Googler. If Oracle has the inside track on MySQL, Oracle has one more thread of connection to Googzilla.

Third, Oracle wants to have a low cost, entry level database to respond to opportunities where price is a consideration. When the data management task “gets serious”, Oracle is ready to provide the industrial strength database solution. This idea came from a professor at one of the upscale colleges in New York City.

Let’s look at each of these ideas.

Stop IBM. Oracle and IBM have been at the game of enterprise data, applications, and consulting for a long time. I don’t think Sun makes a big difference to Oracle if this is the correct analysis. On the other hand, I don’t think that if IBM had purchased Sun that the deal would have been significant for IBM. Once the services deals are migrated, the value of Sun begins to erode in the enterprise market. Oracle gets some new contracts and maybe sells some hardware, but Sun’s business was sloping down, and I don’t think a change in ownership can address that issue. Neither IBM nor Oracle can make dramatic shifts in how each company deals with their enterprise customers. Oracle, like IBM, is not suited to the smaller organization.

Oracle One Ups Google. I don’t think so. Google can create its own baby database. MySQL could be marginalized without too much pain for the Googlers. Oracle may have a viper on its hands with the open source database, so MySQL could become a distraction and provide an enterprise open source vendor with an opportunity to marshal a community backlash against Oracle.

Oracle and Low Cost. This is a word pair that does not make sense at Oracle in my opinion. The company needs big revenue opportunities. The notion of a low cost anything at Oracle is a foreign one.

In my view, the New Yorkers who offered these ideas get an A for effort. I just think the views are incorrect. After reading Mr. Dvorak’s “second opinion” piece, I think the deal was a knee jerk reaction. I had heard that a Japanese firm was interested in the Sun property to get customers, technology, and patents. Oracle may find itself handling Sun the way it has approached PeopleSoft and other acquisitions. Business as usual. Make decisions when a decision is needed. In the meantime, operate in an opportunistic manner.

Stephen Arnold, April 24, 2009

Drunk Men’s Web Indexing Analysis

April 25, 2009

A happy quack to the reader who sent me a link to Drunk Men’s Web robot analysis. You can find the article here. The data come from 2005 and 2006 and may not be spot on for 2009. The main point of the write up is that the Google does more with its approach to Web crawling. The payoff from PageRank appears to be a way to get around the need to index certain sites as thoroughly as Yahoo. Microsoft’s Web robot does not appear to be on a par with either Google or Yahoo.

Stephen Arnold, April 25, 2009

Search and Hyperbole: Even the SEO Crowd Is Jaded

April 25, 2009

You must read Dan Sullivan’s “How to Overhype Your Search Engine” here. The title is not in line with the story as I interpreted it, but it includes two hot words: “overhype” and “search engine”. The author explains the basic public relations steps to get coverage of a Web search system. If you want a checklist of what you want Bryce or Buffy to do, follow Mr. Sullivan’s checklist. The second part of the essay tackles “over hyperbole” (is that a bound phrase?) and seems to get into more subjective aspects of search; for example, “stealth”. If a search engine is in stealth, no one should know it is there. Therefore, a “stealth search engine” by definition is a poorly kept secret in my addled goose view. The beef in the essay is broiled for the search engine developed by a real live math guy, Dr. Stephen Wolfram here. Dr. Wolfram fares slightly better than Microsoft, a company that is almost too easy to make a case study for unsuccessful search management.

My take on this essay is the following:

  1. Search hyperbole is now part of the landscape. The claim that a specific system will revolutionize search or “kill Google” is tiresome. In certain parts of the world, “killing Google” is going to be difficult. In Denmark, for example, more than 90 percent of referral traffic comes from Google based on my examination of the Web logs of a number of high traffic sites.
  2. Mr. Sullivan notes that Google is an exception. I am not sure that I line up on his side of the gymnasium. Google faces some challenges in China, Korea, and Russia. Each country has a dominant search engine and Google is working to gain traction. So, there are three or four examples of successful Web search systems, not one. Baidu.com, Naver.com, Yandex.com, and probably some systems about which I have no knowledge are indeed “out there” and doing reasonably well, and their business models and technology merit a thorough study. Google is an exception, and its approach to search is based on a combination of methods that work reasonably well, but Google’s secret sauce is its platform’s ability to scale at a relatively reasonable cost and handle petabyte flows of data. The search is a combination of what’s popular with some clever math added to season the pudding.
  3. Over the years, one of the principal venues for introducing Web search systems has been search engine optimization conferences. I may be mistaken, but Mr. Sullivan has been involved in the two highest profile SEO conferences, which are, in my opinion, platforms for incredible claims and marketing that reminded me of some consumer product trade shows.

Three search engines are doing quite well and keeping Google at bay:

  • Baidu: http://www.baidu.com
  • Naver: http://www.naver.com
  • Yandex: http://www.yandex.ru

My conclusion is that substantive discussion of search and content processing is now quite difficult. Everyone is an expert. Even search systems with clever technology must position themselves as software that does everything. When the SEO guru identifies too much hyperbole as a problem, I am convinced that not only does a problem exist but it is too late to make substantive improvement. In short, hyperbole is more important than what a search system actually does.

Stephen Arnold, April 25, 2009
