Ads Soften, Google Crumbles: Doomsday Approaches
August 10, 2008
Investor’s Business Daily’s Pete Barlas reports that online advertising may be crumbling. His story “Survey Indicates Economy Even Taking Toll On Search Ads” relies on data from a Covario survey that supports the assertion “search ad spending last quarter rose by the smallest percentage since at least the start of 2007.” You can read the Investor’s Business Daily article here.
In the best tradition of good news and bad news, Mr. Barlas reviews both sides of Covario’s research findings. I’m a bad news type, and my mind considered the impact of sharply lower ad spending on Google. Because Google has the lion’s share of the online ad market, any downturn kicks Googzilla in the shin. With a big enough downturn, Google could experience periostitis. The problem is not fatal, but it may slow the giant, giving Microsoft and other competitors an opportunity to make headway.
Stephen Arnold, August 10, 2008
Google: More Cause for Doubt
August 10, 2008
SFGate.com offers interesting views of business actions. The article “The AOL Flub Has Analysts Revisiting Google” delivers on two counts. First, Ryan Kim summarizes Google’s admission that its investment in America Online has lost value, lots of value. Second, the write up rekindles the argument that Google’s attempts at diversification have failed. You can read the August 9, 2008, story here. Mr. Kim revisits Google’s scattershot product development and reminds the reader that Google has been distracted by investments in companies such as YouTube.com, which has become a magnet for litigation and a challenge to monetize. Google may have overpaid for such properties as DoubleClick.com, and it has gobbled small companies and done nothing to make them grow. More troublesome is Google’s interest in technologies unrelated to its core business; for example, energy and space travel. For me the most important point in the article was this statement:
“Other than search, what has Google done right? They have 1,001 products in beta, but what’s been successful?” Chowdhry [an analyst quoted by Mr. Kim] asked. “There has been a sequence of missteps and failures, and this is not the end. They miscalculated the valuation of AOL, and this is the first time they’re admitting to it.”
Google has a dominant position in Web search and advertising. The company has a track record of success in online advertising. Is it now time to reassess Google as a company with a single business model and little else?
Stephen Arnold, August 10, 2008
Sinequa Inks OEM Deal with Oxaproc
August 9, 2008
Sinequa’s information retrieval solution will be integrated by Oxalys Technologies into Oxaproc, according to ITRNews.com. Oxaproc is an e-procurement system. Antoine Renard, director of development at Oxalys, said that ease of integration was a factor in the company’s decision to license Sinequa CS. OEM deals are highly prized among search and content processing vendors. A typical deal involves up front cash and a royalty. Once an information retrieval engine has been embedded in an enterprise application, ripping and replacing can give engineers and customers migraines. You can learn more about Oxalys here. Specific information about Oxaproc e-procurement is here. Details about Sinequa are here. You can read the interview with Sinequa’s top gun, Jean Ferré, here.
Stephen Arnold, August 9, 2008
Intel: What Business Is It In?
August 9, 2008
Intel’s push in cloud computing strikes me as a “me too” response to a customer rebellion that is brewing. Maintaining servers, struggling with heat and power consumption costs, and the mind-numbing wackiness of enterprise software fuel the shift. Intel, in a search for more revenue, is looking for a two-fer.
Intel wants to grow its revenue, particularly in its semiconductor business, and Intel wants a bigger piece of the action in cloud computing. Can Intel perform this trick? This is a difficult question to answer. Now Intel seems to be probing other markets as well.
On August 8, 2008, Intel surprised me with its release of its Summary Statistics Library. You can read the Web log post by Dmitry Kabaev here. You can download the library here. There is also an installation guide available from the download page. You can choose either the Linux or the Windows library. There are two low key requests for your email, but as far as I could tell, I was able to suck down the libraries without registering. If you want to participate in Intel forums, you will have to cough up some information, but I register my dog, who seems quite happy to ignore his email.
The stats pack is part of Intel’s Whatif.Intel.com initiative. Intel wants to be a good open source citizen, and it is an excellent way to allow developers to start mud wrestling with programming for massively parallel systems. Intel is upfront about this point, describing the library as “a set of algorithms for parallel processing of multi-dimensional datasets. It contains functions for initial analysis of raw data which allow investigating structure of datasets and get their basic characteristics, estimates, and internal dependencies.”
You can whack on data sets with:
- Basic statistics. Algebraic and central moments up to 4th order, skewness, kurtosis, variation coefficient, quantiles and order statistics.
- Estimation of Dependencies. Variance-covariance/correlation matrix, partial variance-covariance/correlation matrix, pooled/group variance-covariance/correlation matrix.
- Data with Outliers. The Intel® Summary Statistics Library contains a tool for detection of outliers in a dataset. Also the library allows computing robust estimates of the covariance matrix and mean in presence of outliers.
- Missing Values. Data which contains missing values can be effectively processed using modern algorithms implemented in the package.
- Out-of-Memory Datasets. Many algorithms of the library support data which cannot fit into the physical memory processing huge data arrays in portions. Specifically, variance-covariance matrix estimators, algebraic and central moments, skewness, kurtosis, and variation coefficient can process a dataset in portions.
- Various Data Storage Formats. The Intel Summary Statistics Library supports in-rows and in-columns storage formats for datasets, full and packed format for variance-covariance matrix.
The libraries support C and Fortran90/95.
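The "process a dataset in portions" idea in the list above can be illustrated without Intel's library. What follows is a minimal sketch in plain Python, not Intel's API: the names `Moments`, `summarize`, `merge`, and `describe` are hypothetical. It summarizes each portion separately and then combines the summaries with the well-known pairwise update formulas for central moments, so the full dataset never has to sit in memory at once.

```python
import math
from dataclasses import dataclass


@dataclass
class Moments:
    """Running summary of one variable: count, mean, and the sums of
    2nd, 3rd, and 4th powers of deviations from the mean."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0
    m3: float = 0.0
    m4: float = 0.0


def summarize(chunk):
    """Summarize one portion of the data that does fit in memory."""
    n = len(chunk)
    if n == 0:
        return Moments()
    mean = sum(chunk) / n
    dev = [x - mean for x in chunk]
    return Moments(n, mean,
                   sum(d ** 2 for d in dev),
                   sum(d ** 3 for d in dev),
                   sum(d ** 4 for d in dev))


def merge(a, b):
    """Combine two summaries into the summary of the combined data."""
    if a.n == 0:
        return b
    if b.n == 0:
        return a
    n = a.n + b.n
    delta = b.mean - a.mean
    mean = a.mean + delta * b.n / n
    m2 = a.m2 + b.m2 + delta ** 2 * a.n * b.n / n
    m3 = (a.m3 + b.m3
          + delta ** 3 * a.n * b.n * (a.n - b.n) / n ** 2
          + 3 * delta * (a.n * b.m2 - b.n * a.m2) / n)
    m4 = (a.m4 + b.m4
          + delta ** 4 * a.n * b.n * (a.n ** 2 - a.n * b.n + b.n ** 2) / n ** 3
          + 6 * delta ** 2 * (a.n ** 2 * b.m2 + b.n ** 2 * a.m2) / n ** 2
          + 4 * delta * (a.n * b.m3 - b.n * a.m3) / n)
    return Moments(n, mean, m2, m3, m4)


def describe(m):
    """Population statistics derived from a (non-constant) summary."""
    variance = m.m2 / m.n
    return {
        "mean": m.mean,
        "variance": variance,
        "skewness": math.sqrt(m.n) * m.m3 / m.m2 ** 1.5,
        "kurtosis": m.n * m.m4 / m.m2 ** 2 - 3.0,  # excess kurtosis
        "variation_coefficient": math.sqrt(variance) / m.mean,
    }


if __name__ == "__main__":
    # Three portions standing in for a file too large to load at once.
    portions = [[2, 4, 4], [4, 5], [5, 7, 9]]
    total = Moments()
    for portion in portions:
        total = merge(total, summarize(portion))
    print(describe(total))
```

Because `merge` only needs two small summaries, the portions could just as well be processed on separate cores and combined at the end, which is the kind of parallel workload the library is aimed at.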
Intel has invested in Endeca, and I don’t think this is a casual greenfield seeding. Endeca’s technology performs some interesting processes on structured and unstructured content. I see no overt evidence that Intel is moving into information retrieval. I am tracking announcements like this stats pack as part of my research effort to figure out how Endeca figures in Intel’s plans.
While I root around for information, download the statistics libraries. My quick look revealed some useful work by Intel’s engineers, who merit a happy quack.
Stephen Arnold, August 9, 2008
Google’s Universe Is the Schrodinger Cat’s Meow
August 8, 2008
After a delightful flight delay, I sat down and scanned my email. A helpful reader sent me a link to “Schrodinger-Like PageRank Equation and Localization in the WWW.” (I am getting weird results when I try to insert the correct character in Schrodinger. The spelling is a quick zig zag around this problem.) You can read the essay on arXiv.org here. The research involves a number of experts, including some nuclear physicists and a Yahoo researcher. The assertion is:
PageRank can be expressed in terms of a wave function obeying a Schrodinger-like equation.
If true, Google’s PageRank can be calculated using the type of math that a garden variety physics student uses to pass a third year physics course. You can read more about this assertion at:
- http://www.p2pnet.net/story/16653. This post includes a diagram that depicts state changes.
- http://arxivblog.com/?p=558. This post is a brief summary.
- http://online.redwoods.cc.ca.us/DEPTS/science/chem/storage/Schrod/ A refresher with an animation to explain the perturbation that occurs in the dynamic system.
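For readers who want a baseline to compare against, here is a minimal sketch of the textbook PageRank calculation done by power iteration. It is not the Schrodinger-like formulation from the paper, and it is not Google's production method. The function name `pagerank`, the damping value of 0.85, and the four-page link graph are assumptions made for illustration.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Textbook PageRank by power iteration.

    links maps each page to the list of pages it links to.
    Pages with no outgoing links spread their rank evenly over all pages.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(max_iter):
        # Rank held by dead-end pages is redistributed uniformly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for src, targets in links.items():
            if not targets:
                continue
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new[dst] += share
        change = sum(abs(new[p] - rank[p]) for p in pages)
        rank = new
        if change < tol:
            break
    return rank


if __name__ == "__main__":
    # Hypothetical four-page Web: three pages point at "c".
    toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
    for page, score in sorted(pagerank(toy_web).items(), key=lambda kv: -kv[1]):
        print(page, round(score, 4))
```

Each pass touches every link once, so the cost grows with the number of links times the number of iterations. Any shortcut that produces the same ranking with fewer passes would matter at Web scale, which is why the researchers' claim is worth checking.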
Several thoughts crossed my mind as I worked through these materials:
First, the assertion requires verification. The implication of the Web documents is that Google’s PageRank can be replicated with less computation. The implications of this for a company like Yahoo are significant. Adding a lightweight PageRank value to Yahoo’s index could–note the could–improve its query matching.
Second, if true, Google may have to become more open about the many factors it uses to make the PageRank method more useful than a method based on the maths referenced by the European researchers.
Third, Google’s image of the unassailable leader in Web search gets a scorch mark.
I will revisit this subject when I am not sated with the luxury of air travel. More later.
Stephen Arnold, August 9, 2008
Company Profiles Coming to Beyond Search
August 8, 2008
I talked with the team working on this Web log today at lunch. After I bought everyone super burritos, I was able to gather some ideas for making the Beyond Search Web site more useful to me, the team, and the two or three readers out there.
The Search Wizards Speak series on ArnoldIT.com has been well received. Several of the interviews have been recycled and turned up in Web logs in lands far from rural Kentucky and our lone “authentic” Mexican restaurant. One of the people working there has a non-hill folk accent, so the Cantina Kentucky must be muy authentico.
The idea that emerged between mouthfuls of “authentic” burritos was to post one or two page profiles of the companies mentioned in the stories in the Web log. I thought the idea was pretty awful, but the burrito-sated Beyond Search team thought it was wonderful.
Here’s the plan.
I have developed on a restaurant napkin a rough outline for what should be included in each of the company profiles. A team member or one of the writers who work on this Web log will write the profile. I have gigabytes of info about search, and I will let the lucky journalist grind through these data and then tap other sources.
Each profile will have a comments section. If you want to add information or correct an error, use the comments form. Once a year, we will roll the comments into the baseline profile. In this way, you can get some basic information about the companies mentioned in the Web log. You can also update or correct the basic entry.
I think we will be cutting and pasting from the company information on search vendors’ Web sites. I am thinking about adding my unique stamp to each write up with my personal “likes and dislikes” for each system. My attorney says he wants to think about this “likes and dislikes” stuff, so stay tuned on that point.
Keep in mind that I do really meaty analyses of companies in the search and content processing business. The profiles, like the interviews in Search Wizards Speak, will provide some useful information but the juicy stuff will not be included.
So what’s juicy?
Well, I just completed ripping through Endeca’s patent documents. I have identified some upsides and downsides to the inventions disclosed. I have then worked through the publicly available information about Endeca, made a couple of calls, and thought about what I have learned. That type of detail is not going to be in these free two-page profiles. Some lucky or silly outfit is going to have to pay me for the slog through the golden prose of lawyers and engineers. The prose makes Henry James’s novels look like the script to the new Batman movie.
I want to post a couple of test profiles and invite comments. I will go slowly at first, but if I can get the kinks worked out, my goal is to have one profile every week or two.
One of the burrito eaters suggested I sell profiles to companies who want the Beyond Search team to write about a specific firm. I am a greedy goose, but I want to put that idea on the back burner until I figure out if this is a feasible activity. There’s a lot of email and chasing required to get an interview completed. I’m not sure about search company profiles. The idea of money is easier to experience than the actual process of squeezing a beet for nectar.
Watch this Web log for a link to the first profile. I’m thinking next week. Comments? Suggestions? Let me know in the comments section below this article.
Stephen Arnold, August 8, 2008
Will Microsoft Bring Home the Gold in the SharePoint Olympics?
August 8, 2008
The Olympics are underway. If you have any questions, you will want to navigate to the Beijing Organizing Committee for the Olympic Games’ portal here. Ooops. That’s not the SharePoint site, and this MSDN article “SharePoint Server 2007 Powers Beijing 2008 Olympic Games” does not include a link to the SharePoint site. You can read this post, dated August 5, 2008, here. The screenshot featured on the site does not look like any of the pages on the “official” site at http://en.beijing2008.cn/.
Here’s the “official” site’s look and feel:
And here’s the screen shot of Microsoft SharePoint and its “official” site:
I think I have figured out what’s going on, but it would be nice if the MSDN post contained links to pages, not screenshots without a url or trackback link. You can navigate to a July 2008 case study here and learn more about this high profile opportunity for SharePoint. Here’s the architecture diagram for the Microsoft system:
Compared to the SharePoint placemat diagram here, it seems to me that this Olympics’ diagram is a simplified schematic.
One oddity is that the drop down box that one uses to specify the viewer’s country is tough to control. The video won’t play until you click on the country, but the scroll function is somewhat immature. The video is displayed on the NBColympics.com Web site, and I was puzzled by the design of that page.
A happy quack to the SharePoint team. Nothing but smooth sailing for the next couple of weeks.
Stephen Arnold, August 8, 2008
More on Search ROI
August 8, 2008
I usually agree with Deep Web Technologies’ commentaries. Sol Lederman has written an interesting essay “Measuring Return on Search Investment.” You will want to read his analysis here. The point of his write up is that Judy Luther, president of Informed Strategies, wrote a white paper about ROI for libraries. The good news in Ms. Luther’s analysis, if I read Mr. Lederman’s summary correctly, is that academic libraries can show a return on investment. As a long time library user, I agree that an investment can pay many dividends.
I do want to push back a bit on library ROI. The sticking point is cost analysis. As long as an institution can chop up costs and squirrel them away, it is very difficult to know what an information service of any type costs. Libraries develop a budget. A tiny fraction of that budget goes for books, electronic information, and journals. Most of the money is sucked up by fixed costs like salaries, maintenance, security, and other institutional overheads.
As a result, the “cost” of an information service is almost always the direct cost at a specific point in time for a specific service or product. Costs associated with figuring out what to buy, installing the product, the share of the infrastructure the product requires, and other costs are ignored. As a result, the calculation that shows a specific return is not too useful.
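A minimal sketch of the arithmetic shows how far the two views can diverge. Every figure below is invented for illustration, and the cost category names and the `roi` helper are hypothetical. None of this comes from Ms. Luther's white paper or Mr. Lederman's summary.

```python
def roi(benefit, costs):
    """Return on investment: (benefit - total cost) / total cost."""
    total = sum(costs.values())
    return (benefit - total) / total


# Made-up annual figures for one online information service.
benefit = 90_000

# The number that usually appears in the budget line.
direct_costs = {"license": 40_000}

# The same service with the costs that tend to be squirreled away elsewhere.
full_costs = {
    **direct_costs,
    "staff_time_to_select": 15_000,
    "installation": 10_000,
    "share_of_infrastructure": 20_000,
    "ongoing_support": 15_000,
}

if __name__ == "__main__":
    print(f"ROI on direct cost only: {roi(benefit, direct_costs):+.0%}")
    print(f"ROI on full cost:        {roi(benefit, full_costs):+.0%}")
```

With these invented numbers, the service looks like a winner when only the license fee is counted and a loser once the other costs are included.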
Without a knowledge of the direct and indirect costs, the basic budget analysis is incomplete. Ignoring the “going forward” costs means that when problems occur, the costs can break the back of the library’s budget. Wacky ROI calculations, particularly where digital information and search are concerned, push libraries deeper into the budget swamp. Here in Kentucky, budgets for online information are now being cut. The looming problem will be that chopping a direct cost allows the unmonitored and often unknown dependent costs to continue to chew away at the budget.
Libraries face some severe budget pressure from these long ignored costs. These burn like an underground mine fire, and like an underground mine fire, these costs are often very difficult to control.
Stephen Arnold, August 8, 2008
Microsoft BrowseRank Round Up
August 8, 2008
BrowseRank is a Microsoft-developed method of computing page importance for use in Internet search. Microsoft is looking to compete with Google’s PageRank.
The computations are based upon user behavior data and algorithms to “leverage hundreds of millions of users’ implicit voting on page importance.” (So says a Microsoft explanatory paper [http://research.microsoft.com/users/tyliu/files/fp032-Liu.pdf]). The whole point is to add “the human factor” to search to bring up more results people actually want to see.
On July 27 SEO Book posted a review/opinion [http://www.seobook.com/microsoft-search-browserank-research-reviewed] since Steve posted about BrowseRank here [http://arnoldit.com/wordpress/2008/07/26/microsofts-browser-rank/]. Summary: While it’s a good idea, there are drawbacks like false returns because of heavy social media traffic, link sites, etc. Sites like Facebook, MySpace, and YouTube are popping up high on the list – not because they have good, solid, popular information, but just because they’re high traffic. Microsoft will have to combine its BrowseRank user feedback information with other data to be really useful. On the other hand, if Microsoft can collect this user data over a longer term, the info would more likely pan out. For example, BrowseRank will measure time spent on a site to help determine importance and relevance.
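To make the "time spent on a site" idea concrete, here is a toy sketch loosely inspired by the approach described above. It is not Microsoft's algorithm. It assumes browsing logs arrive as sessions of (page, seconds) pairs, estimates how often a random surfer following the observed clicks lands on each page, and then weights that by the average time spent there. The function name `browse_scores`, the `reset` value, and the sample sessions are all hypothetical.

```python
from collections import defaultdict


def browse_scores(sessions, reset=0.15, iters=100):
    """Toy importance score from browsing logs.

    sessions is a list of visits in order: [(page, seconds_on_page), ...].
    Score = (how often observed click paths reach a page)
            x (mean time spent on it), normalized to sum to 1.
    """
    pages = sorted({page for session in sessions for page, _ in session})
    n = len(pages)
    dwell_total = defaultdict(float)
    visits = defaultdict(int)
    clicks = {p: defaultdict(int) for p in pages}

    for session in sessions:
        for i, (page, seconds) in enumerate(session):
            dwell_total[page] += seconds
            visits[page] += 1
            if i + 1 < len(session):
                clicks[page][session[i + 1][0]] += 1

    # Visit frequency: stationary distribution of the observed click chain,
    # with a small reset probability of jumping to a random page.
    freq = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        new = {p: 0.0 for p in pages}
        for src in pages:
            out = sum(clicks[src].values())
            if out == 0:
                # No observed clicks away from this page: jump anywhere.
                for p in pages:
                    new[p] += freq[src] / n
                continue
            for p in pages:
                new[p] += reset * freq[src] / n
            for dst, count in clicks[src].items():
                new[dst] += (1.0 - reset) * freq[src] * count / out
        freq = new

    weighted = {p: freq[p] * dwell_total[p] / visits[p] for p in pages}
    total = sum(weighted.values())
    return {p: w / total for p, w in weighted.items()}


if __name__ == "__main__":
    # Hypothetical logs: users bounce off "hub" quickly but linger on "article".
    logs = [
        [("hub", 2), ("article", 120)],
        [("hub", 3), ("article", 90), ("hub", 2)],
        [("hub", 1), ("video", 30)],
    ]
    for page, score in sorted(browse_scores(logs).items(), key=lambda kv: -kv[1]):
        print(page, round(score, 3))
```

In this toy, a page that many sessions pass through but nobody lingers on scores low. The same mechanism explains why high-traffic, long-dwell sites such as social networks can float to the top, the drawback noted in the SEO Book review.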
A blog post on WebProNews [http://www.webpronews.com/topnews/2008/07/28/browserank-the-next-pagerank-says-microsoft] on July 28 said flat out: “It shouldn’t be the links that come in, but the time spent browsing a relevant page, that should help determine where a page ranks for a given query.” So that idea lends some credence to BrowseRank’s plan. The next step is how Microsoft will acquire all that information – obviously through things like their Toolbar, but what else? (Let’s ignore, for now, screams about Internet browsing privacy.) If MSN’s counting on active participation from users, it won’t work. This blog post points out that “Google’s PageRank succeeds partially due to its invisibility.” And that’s what users expect.
Graphic from Microsoft Research Asia
For now, and granted there’s only this small bit of info out there, SEO Book says that, in their opinion, PageRank (Google’s product) has one up on Microsoft because it sorts informational links higher, connects them to Google’s advertising, and because Google has the ability to manipulate the information.
You can read this for more info on Microsoft vs. Google: CNET put out a pretty substantial article [http://news.cnet.com/8301-1023_3-9999038-93.html] on July 25 talking about PageRank vs. BrowseRank and what Microsoft hopes to accomplish.
Autonomy: Another Week, Another Award
August 7, 2008
If I were a search system vendor, I would start to think about myself as a loser. Autonomy continues to win accolades for its information platform. Autonomy may be the winningest information platform vendor in history. The company’s most recent award is the 2008 IP Contact Center Technology Pioneer Award. You can read here the full story “Autonomy etalk Receives the 2008 IP Contact Center Technology Pioneer Award: Customer Interaction Solutions Magazine Recognizes Qfiniti for Advanced IP Call Recording.”
According to the essay, an Autonomy executive said:
etalk is thrilled to be recognized for our advanced recording technology and the flexibility and scalability we provide to our clients. When you combine our next-generation IP recording technology with telephony solutions from the world’s top-rated vendors, our clients are rewarded with the most robust and reliable voice recording solutions on the market.
etalk is a system that can record voice conversations for enterprise contact centers and mission critical business environments. This solution offers full customer interaction recording for compliance, risk management, and quality.
More information is available at http://www.etalk.com/. The link in the news release returns a page not found error. I fixed this glitch for the two or three Beyond Search readers. I wonder if the author “spoke” the link and it was incorrectly parsed or if the mistake was one of those human flubs like the ones I make so often?
Stephen Arnold, August 7, 2008