A Google Amazon Balancing Act

June 10, 2008

CNet featured an essay by Charles Cooper. You can read it here. Click the link quickly. I continue to receive emails from people telling me my links are dead. Some sites move stories around; others just delete them. The title–“Google’s Right but Cloud Computing Timeline Isn’t So Clear”–is the type that catches my attention, but the core information really hooked me.

Mr. Cooper references a talk by Googler Rishi Chandra at the Enterprise 2.0 Conference in Boston. You can read the CNet write up of Mr. Chandra’s presentation here. The main point of the Googler’s remarks is that cloud computing is the future. That’s an old message, but in the spring and summer Google transparency offensive of 2008, it’s becoming clear that Google believes in network computing. Okay. Maybe this is old news. The implication is that Google is serious about the enterprise market. Okay. Also old news. Mr. Cooper describes Mr. Chandra’s revelations and insights in an objective manner. I don’t think I could have done that were I reporting on the Googler’s talk.

Mr. Cooper does an excellent job of summarizing the Google “game plan”, and I won’t attempt to summarize his clear, tight writing.

But the payload of this must-read article is that Mr. Chandra made a gentle reference to the reliability of certain cloud-based services. When I saw this, my radar lit up. After years of ignoring Amazon’s push into services and features that are easy for Google to deliver, Google seems to be jerking its Googzilla-sized self into action. Amazon is making an effort to out-Google Google at a fraction of the amount that Google spends on technology. Amazon, based on my research, is doing cloud computing on a very abstemious sum that is about one fifth of Google’s. My accountant father would be proud of Mr. Bezos’ penny pinching. I’m from a different generation, and I learned in the nuclear power work I did in the early 1970s that it pays to engineer certain functions without cutting corners. A flawed infrastructure is bad news in a BWR (boiling water reactor) and not-so-good news in a cloud-computing system.

For me, this single passing reference translates to increased pressure from the Google enterprise team. In the column I submitted to KMWorld, which will run in either July or August 2008, I describe how “Google physics” works. Imagine my delight when Mr. Cooper provided additional information to buttress my analysis. I’m not going to explain “Google physics” in this post. You will have to wait until the KMWorld publication becomes available, but you can deduce that when the pressure goes up, the competitive arena behaves differently. The GOOG may be cranking up the heat now.

Kudos, Mr. Cooper. I appreciate a thorough reporting job.

Stephen Arnold, June 10, 2008

Mobile Search

June 10, 2008

I try to steer clear of mobile search. The notion is broad and, like most terms used to describe information retrieval, the phrase mobile search is frequently undefined. The idea, I assume, is that everyone knows what mobile search is.

I asked my neighbor what mobile search was, and he said, “I just use my phone for calls.” Functions like sending a query to Yahoo’s mobile service aren’t used very often by me, not at all by him, and probably not by you, gentle reader, either.

But if you get text or graphic information on a mobile device, it’s mobile search. Most pundits feel that this definition is close enough for horseshoes. The problem is that it is the equivalent of cutting a cherry pie with a Husqvarna 455 Rancher chain saw, a popular model here in the hills of Kentucky.

[Photo: mobile search disappoints]

This is a photograph of a Beyond Search programmer expressing dissatisfaction with the mobile search function on an Apple iPhone and a Treo 650. “Both are terrible,” says ArnoldIT.com’s chief technical officer.

The USA Today business section ran this front page story on June 10, 2008: “Are Google, Yahoo the Next Dinosaurs?” I couldn’t find the story on USAToday’s Web site. If it does appear online, I think this is the link that will display it for you. If you can’t locate this story online, you may have to hunt for a tree-unfriendly printed version.

The story, written by Leslie Cauley, is that “many [vendors are] on the hunt for a way to cash in on wireless search.” The idea is that no one, not even Google, Microsoft, or Yahoo, has cracked the code for mobile search. The “dinosaur” part is a bit of color. The notion is that because neither Google nor Yahoo has cracked the code for mobile search, these two firms could be left in the dust by younger, more hip innovators. Ergo: Google and Yahoo become the brontosauri of online with regard to mobile search. Ms. Cauley mentions an up-and-coming company called Medio, careful to explain that this is just one interesting company among many. You can read more about Medio here. Could Medio be the next Google?

Because mobile devices are more plentiful than other types of computers, whoever cracks the code can make boat loads of cash selling ads to mobile phone device users running search. I’m not going to cite USAToday’s statistics. I have heard that Gannett takes a dim view of old researchers tapping into their high-value statistical data captured in bar charts without data tables.

I urge you to buy “America’s newspaper”; make Gannett’s accountants happy.

The challenges of mobile search are formidable. There are established business models ossified in the American telecommunications industry. There are device issues; namely, screens smaller than the 48 inches of flat panel I have in front of me at this moment, lousy keyboards, and users who aren’t too keen on taking time to paw through a laundry list of results.

Amazon and Cloud Computing

June 10, 2008

Intelligent Enterprise’s very good journalist Doug Henschen has an excellent interview with Amazon’s Adam Selipsky here. The interview is meaty, running to three sections. (Snag this now. Content can be tricky to find after a day or two.) Mr. Henschen has made an effort to capture the detail that many editors deem irrelevant. This interview is a keeper if you are a collector of Amazon business thought.

Two key points struck me as I was reading Amazon’s VP of product management’s answers to Mr. Henschen’s questions. Let me highlight these two points and then offer several observations.

The first point is this comment by Mr. Selipsky:

I’d also say that part of the whole point of computing in the cloud is that you don’t have to worry about where the resources are. If you decide to do enterprise backup in S3 — which is a great application that we’re seeing more and more of, by the way — do you really need to know exactly where and when those copies are replicated and how our replication works? That’s time you could be spending on something else. What you want to know is that we’re never going to lose the data, that your people are going to be able to access the data whenever they want, and that it’s secure.

As I understand this comment, Mr. Selipsky wants to shift a customer’s concern from the physical location of data to comfort that “we’re never going to lose the data.” I recall that one of my college professors explained the notion of categorical affirmatives and categorical negatives. In my aging Kentucky brain I have boiled down that learning to one precept: Never say never. I think Mr. Selipsky is saying “never”.

The second point is this exchange between Mr. Henschen and Mr. Selipsky. I have marked Mr. Henschen’s question with the label “question” and Mr. Selipsky’s response as “answer”:

Question: Would you say that Internet-based, customer-facing businesses like Amazon are at the center of the target and that enterprises business might be ceded, one day, to cloud vendors that specialize in enterprise services?

Answer: I would absolutely not say that. We had Fortune 500 companies using Amazon Web Services literally from the day we launched S3. It’s true that the majority of our early usage was from smaller companies and startups because they tend to be a little more risk tolerant. We thought it would take several years to get to the enterprise stage in a meaningful way, but we’ve actually been rather surprised at how quickly enterprise adoption and interest have accelerated.

As I understand Mr. Selipsky’s remark, Amazon is prepared to compete with firms yet to enter the cloud computing arena. For example, there’s the loose tie up between Google and Salesforce.com. There’s the rumor of interest in utility services from Verizon, AT&T, and other telecommunications companies. I’ve heard chatter that Cisco and Intel, as unlikely as this seems to me, are thinking about cloud computing. Most of these companies have the potential to make a monopoly play in cloud computing. Also, I think I see another categorical in Mr. Selipsky’s remark: “absolutely”. I find this an interesting word choice in a market sector that is in its infancy.

Observations

Let me wrap up with several unasked for remarks:

  1. I find categorical affirmatives and negatives jejune. Maybe the word choice is colloquial and unintentional, but I think it shows the shallowness of Amazon’s positioning of its Web services. It’s tough for me to accept absolutes and categoricals when so much is uncertain in this cloud computing space.
  2. My recollection is that Amazon’s core service suffered service outages on Friday, June 6, 2008, and then again on Monday, June 9, 2008. The information about the problem has been sparse. Regardless of the cause, these recent problems suggest that Amazon’s engineering has some details to which it must attend.
  3. Certain services naturally tend toward monopolies. For example, I don’t buy my electric power in rural Kentucky from the cheapest vendor. I use the only vendor, a non-US outfit with control of the local power company. I think cloud computing may share some DNA with the pre-break up AT&T; that is, it will make economic sense for one company to control the market. Because cloud computing is an engineering problem, I think the winner will be a company that can do great engineering economically. A system failure doesn’t translate in my mind to great engineering.

Agree? Disagree? Let me know. Use the comments section to push back or provide additional insight. This is a Web log, so weigh in.

Report about Amazon technical issue from Network World is here.

Stephen Arnold, June 10, 2008

Exalead: Enterprise Search Heats Up

June 9, 2008

Exalead, founded by former AltaVista.com guru François Bourdoncle, named a US president for the privately-held search and content processing company. Paul Doscher will work from Exalead’s San Francisco office. Mr. Doscher is an experienced executive. He will tackle Exalead’s OEM strategy, assist the online business market launch, and consolidate the firm’s activities in business-to-business markets.

Mr. Doscher joins Exalead from Jaspersoft, which under his management became the worldwide leader in the open source business intelligence market. This expertise, combined with nearly thirty years’ experience in the IT industry at companies including VMware and Oracle, adds muscle to the fast-growing Exalead.

Exalead has been growing at double-digit rates for the last two years. Its platform, exalead one:search, is software that scales from desktop to data center entry points. With it, Exalead enables industrial information access that incorporates structured and unstructured data, as well as internal and external content, so that individuals and organizations alike can retrieve and use information effectively.

The firm has a platform that shares many of the characteristics of low-cost scaling and high performance with Google. You can find more information about Exalead here. In my April 2008 study for the Gilbane Group, Exalead was named a “company to watch”.

Stephen Arnold, June 10, 2008

Deep Web Tech’s Abe Lederman Interviewed

June 9, 2008

Abe Lederman, one of the founders of Verity, created Deep Web Technologies to provide “one-stop access to multiple research resources.” By 1999, Deep Web Technologies offered a system that performed “federated search.” Mr. Lederman defines “federated search” as a system that “allows users to search multiple information sources in parallel.” He added in his interview with ArnoldIT.com:

Results are retrieved, aggregated, ranked and deduped. This doesn’t seem too difficult, but trust me it’s much harder than one might think. Deep Web started out building federated search solutions for the Federal government. We run some highly visible public sites such as Science.gov, WorldWideScience.org and Scitopia.org. We have expanded our market in the last few years and sell to corporate libraries as well as academic libraries.

Mr. Lederman believes that Google’s “forms” technology to index the content of dynamic Web sites is flawed.

Mr. Lederman said:

Deep Web goes out and in real-time sends out search requests to information sources. Each such request is equivalent to a user going to the search form of an information source and filling the form out. Google is attempting to do something different. Using automated tools Google is filling out forms that when executed will retrieve search results which can then be downloaded and indexed by Google. This effort has a number of flaws, including automated tools that fill out forms with search terms and retrieve results will only work on a small subset of forms. Google will not be able to download every document in a database as it is only going to be issuing random or semi-random queries.
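The federated search steps Mr. Lederman describes (query several sources in parallel, then aggregate, dedupe, and rank the results) can be sketched roughly as follows. This is a minimal illustration, not Deep Web Technologies’ implementation: the in-memory `SOURCES` data, the `search_source` and `federated_search` functions, the example.org URLs, and the relevance scores are all hypothetical stand-ins for real search forms and a real ranking method.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "information sources". A real system would submit
# the query to each source's own search form over the network.
SOURCES = {
    "source_a": [
        {"url": "http://example.org/fusion", "title": "Fusion energy overview", "score": 0.9},
        {"url": "http://example.org/solar", "title": "Solar energy storage", "score": 0.4},
    ],
    "source_b": [
        {"url": "http://example.org/fusion", "title": "Fusion energy overview", "score": 0.7},
        {"url": "http://example.org/wind", "title": "Wind energy forecasts", "score": 0.8},
    ],
}


def search_source(name, query):
    """Stand-in for filling out one source's search form with the query."""
    terms = query.lower().split()
    return [
        dict(hit, source=name)
        for hit in SOURCES[name]
        if all(term in hit["title"].lower() for term in terms)
    ]


def federated_search(query):
    """Query every source in parallel, then aggregate, dedupe, and rank."""
    # 1. Send the same query to all sources at the same time.
    with ThreadPoolExecutor(max_workers=len(SOURCES)) as pool:
        batches = list(pool.map(lambda name: search_source(name, query), SOURCES))

    # 2. Aggregate the per-source result lists into one list.
    hits = [hit for batch in batches for hit in batch]

    # 3. Dedupe by URL, keeping the highest-scoring copy of each document.
    best = {}
    for hit in hits:
        current = best.get(hit["url"])
        if current is None or hit["score"] > current["score"]:
            best[hit["url"]] = hit

    # 4. Rank the surviving hits by score, best first.
    return sorted(best.values(), key=lambda hit: hit["score"], reverse=True)


if __name__ == "__main__":
    for hit in federated_search("energy"):
        print(hit["score"], hit["source"], hit["url"])
```

Even in this toy version, the dedupe and rank steps hide real decisions, such as what counts as the same document and how scores from different sources compare, which is consistent with the remark that the job is harder than it looks.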

In the exclusive interview, Mr. Lederman reveals a new feature. He calls it “smart clustering.” Search results within a cluster are displayed in rank order.

You can read the full text of the interview on the ArnoldIT.com Web site in its Search Wizards Speak series. The interview with Mr. Lederman is the 17th interview with individuals who have had an impact on search and content processing. Search Wizards Speak provides an oral history in transcript form of the origin, functions, and positioning of commercial search and text processing systems.

The interview with Mr. Lederman is here. The index of previous interviews is here.

Stephen Arnold, June 9, 2008

Chicago Tribune Online: Why Old Print Subscribers Will Hate the Online Edition

June 8, 2008

I don’t spend much time writing about user interface or usability. My 86-year-old father, however, forced me to confront the interface for the Chicago Tribune Online. This essay has a search angle, but the majority of my comments apply to the interface for the Chicago Tribune Online. Now if you search Google for “Chicago Tribune Online”, the first hit is the Chicago Tribune’s main Web site. There is no direct link to the electronic edition for subscribers. You can find this service, which requires a user name and password, here. An 86-year-old person doesn’t file email like his 64-year-old son or the 12-year-old who lives in the neighborhood.

My father prints out important email. This makes it tricky for him to type in the URL and enter his user name and password (a helpful eight letters and digits all in upper case, so it’s impossible for him to discern whether the zero is a number or an “oh” for the letter).

Why does this matter?

I set up yesterday (June 6, 2008) an icon that contained sufficient pixie dust to send him to the electronic edition and log him in automatically. This morning he called to tell me that he had nuked his icon. I dutifully explained in an email, which he would print out, how to navigate to the page, enter the user name, enter the eight digit password (remember there are two possibilities for the zero), click the “save user name and password” option, and access the Sunday newspaper.

Essentially these steps are beyond his computing ability, visual acuity, and keyboarding skills.

Does the Chicago Tribune care? My view is that whoever designed the access Web page gave little thought to the needs of my father. Why should these 20-somethings? Their world is one in which twitching icons and subtle interfaces with designer colors are irrelevant.

There’s one other weirdness about the log in page for the electronic edition of the Chicago Tribune. My father has a big flat screen, and I set it for 800 by 600 pixels so he can read the text. The problem with this size is that most Web pages, including the ones for this Beyond Search Web log, are designed for larger displays. I use three displays–two for the Windows machine and one big one for the Mac. Linux machines get cast off monitors which we often unplug once the machine is running because no one “uses” the Linux machines perched in front of the boxes.

Not my father; he gets up close and personal. The failure to design for my father is understandable. Life would be easier if people were perpetually 21. Here’s the full text of the help tips in the email the Chicago Tribune sent my father:

Getting started with your Chicago Tribune electronic subscription:

  1. To view a story, photo, or advertisement click the item on the full-page image (left side of your screen). It will enlarge on the right side of your screen for easier reading.
  2. Use the pull-down lists located in the top center to navigate through which section and page you would like to view.
  3. Use “Advanced Search” on the top center area of the window to find a specific article.
  4. Use the buttons on the right to email or print each page. Use the buttons on the left to set up email alerts through e-notify and download articles or the entire paper as a PDF.
  5. For more help on all the features, just click on the “Help” button found near the top left under the Chicago Tribune logo.

So, here’s what my father sees when he clicks on the electronic edition link on the 800 x 600 display in his browser:

[Screenshot: Chicago Tribune electronic edition at 800 x 600]

I had trouble figuring out what button and what option was described in the “help” with the registration email. Know why? The log in information requires my father to scroll to the left and then down. There is no visible clue about the log in.

The Semantic Chimera

June 8, 2008

GigaOM has a very good essay about semantic search. What I liked was the inclusion of screen shots of results of natural language queries–that is, queries without Boolean operators. Two systems indexing Wikipedia are available in semantic garb: Cognition here and Powerset here. (Note: there is another advanced text processing company called Cognition Technologies whose url is www.cognitiontech.com. Don’t confuse these two firms’ technologies.) GigaOM does a good job of making posts findable, but I recommend navigating to the Web log immediately.

Nitin Karandikar reviews both Cognition’s and Powerset’s approach, so I don’t need to rehash that material. For me the most important statement in the essay is this one:

There are still queries (especially when semantic parsing is not involved) in which Google results are much better than [sic] either Powerset or Cognition.

Let me offer several observations about semantic technology applied to constrained domains of content like the Wikipedia:

  1. Semantic technology is extremely important in text processing. By itself, it is not a silver bullet. A search engine vendor can say, “We use semantic technology”. The payoff, as the GigaOM essay makes clear, may not be immediately evident. Hence, the “Google is better” type statement.
  2. Semantic technology is in many search systems, just not given center stage. Like Bayesian maths, semantic technology is part of the search engine vendors’ toolkits. Semantic technology delivers very real benefits in functions from disambiguation to entity extraction. As this statement implies, there are many different types of semantics in the semantic technology spectrum. Picking the proper chunk of semantic technology for a particular process is complicated stuff, and most search engine vendors don’t provide much information about what they do, where they get the technology, or how the engineers determined which semantic widget to use in the first place. In my experience, the engineers arrive at their job with academic and work experience. Those factors often play a more important part than rigorous testing.
  3. Google has semantic technology in its gun sights. In February 2007, information became available in patent applications about Google’s programmable search engine, which has semantics in its plumbing. These patent applications state that Google can discern context from various semantic operations. Google–despite its sudden willingness to talk in fora about its universal search and openness–doesn’t say much about semantics and for good reason. It’s plumbing, not a service. Google has pretty good plumbing, and its results are relevant to many users. Google doesn’t dwell on the nitty gritty of its system. It’s a secret ingredient and no user really cares. Users want answers or relevant information, not a lab demo of a single text processing discipline.
  4. Most users don’t want to type more than 2.2 words in a query. Forget typing well formed queries in natural language. Users expect the system to understand what is needed and the situation into which the information fits. Semantic technology, therefore, is an essential component of figuring out meaning and intention. Properly functioning semantic processes produce an answer. The GigaOM essay makes it clear that when the answers are not comprehensive, on point, or what the user wanted, semantic technology is just another buzz word. Semantic technology is incredibly important, just not as an explicit function for the user to access.

I talk about semantic technology, linguistic technologies, and statistical technologies in this Web log and in my new study for the Gilbane Group. The bottom line is that search doesn’t pivot on one approach. Marketers have a tough time explaining how their systems work, and these folks often fall back on simplifications that blur quite different things. Mash ups are good in some contexts, but in understanding how a Powerset integrates a licensed technology from Xerox PARC and how that differs from Cognition’s approach, simplifications are of modest value.

In my experience, a company which starts out as statistics only quickly expands the system to handle semantics and linguistics. The reason–there’s no magic formula that makes search work better. Search systems are dynamic, and the engineers bolt new functions on in the hope of finding something that will convert a demo into a Google killer. That has not happened yet, but it will. When a better Google emerges, describing it as a semantic search system will not tell the entire story. Plumbing that runs compute intensive processes to crunch log data and smart software are important too.

A demo is not a scalable commercial system. By definition a service like Google’s incorporates many systems and methods. Search requires more than one buzz word. You may also find the New York Times’s Web log post by Miguel Helft about Powerset helpful. It is here.

Stephen Arnold, June 8, 2008

Mobile Projection: Truly Stunning

June 7, 2008

Information Week reporter K.C. Jones reported an iSuppli estimate that stunned me. The title of the story is “Wireless Social Networking To Generate $2.5 Trillion By 2020.” You can read it here, but hurry. News has a peculiar way of becoming hard to find a day or two after the story appears on a Web site.

iSuppli–a company in the business of providing applied market intelligence–projected that wireless social networking products, services, applications, components, and advertising will generate more than $2.5 trillion in revenue by 2020. I think that’s 12 zeros.

I’m not sure I know what wireless social networking is, but if iSuppli is correct–consultants and research firms are rarely off base–it’s a great opportunity for entrepreneurs who catch the wave. Wow, $2.5 trillion in 12 short years. I thought I had seen some robust estimates from Forrester, Gartner, 451, and ComScore, but the iSuppli projection is a keeper.

Stephen Arnold, June 7, 2008

Search Wizard Starts New Venture

June 6, 2008

Years ago I examined search technology developed by a teenage whiz named Judd Bowman. You can read about his background here. Mr. Bowman and an equally talented Taylor Brockman had devised a way around memory access bottlenecks that hobbled other search companies’ performance. Mr. Bowman founded Pinpoint, which became Motricity. With a keen interest in search, Motricity provided a range of technology to a number of high profile clients, including Motorola.

Mr. Bowman’s new venture is PocketGear, formerly a unit of Motricity. There’s not much information available at this time. Based on my knowledge of Mr. Bowman’s interest in search, the company will offer mobile search and content services. The venture warrants close observation, particularly with regard to mobile on-device applications and cloud-based search and retrieval.

Note: the Charlotte News Observer article quotes me and cites my for-fee work for investors in Messrs. Bowman and Brockman’s first company, Pinpoint.

Stephen Arnold, June 6, 2008

Google: Tighter Time Controls

June 6, 2008

Valleywag, one of the Web logs I enjoy immensely, reports that Google’s policy of 20 percent free time for personal projects may be changing. You can read the original news story here.

The key point for me was this observation:

What we hear from Googlers is that supervisors are cracking down on use of 20 percent time when employees’ main projects are behind schedule. A sensible management move, but against the spirit of 20 percent time, which was meant to liberate creative employees from meddling middle management.

Google is now a decade old and rocketing forward in many business sectors. The implication is that Google needs more productivity. The flip side is that the idea of having the equivalent of one day each week to work on projects that interest a Googler is now public relations.

My sources tell me that this is not a change in policy, just a reaction to work load. If I learn more, I will let you know. I calculated that the 20 percent rule, if applied to 12,000 engineers with an average salary of $130,000 per year including benefits, added hundreds of millions of dollars in research costs to the company. This cost does not appear as part of Google’s “regular” R&D activity, but the approach has produced some interesting innovations for the company.
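The back-of-the-envelope arithmetic behind that estimate can be sketched as follows. This is a rough illustration that assumes the cost of 20 percent time is simply one fifth of each engineer’s annual cost; the headcount and salary are the figures quoted above, and the variable names are illustrative only.

```python
# Figures quoted in the post.
engineers = 12_000            # number of engineers
cost_per_engineer = 130_000   # average annual salary including benefits, in dollars
free_time_percent = 20        # share of working time given to personal projects

# Assumption: the cost of 20 percent time is that same share of annual cost.
# Integer arithmetic keeps the result exact.
annual_cost = engineers * cost_per_engineer * free_time_percent // 100

print(f"${annual_cost:,} per year")
```

Under these assumptions the figure works out to $312,000,000 per year, which squares with “hundreds of millions.”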

Stephen Arnold, June 6, 2008
