Amazon and Cloud Computing

June 10, 2008

Intelligent Enterprise’s very good journalist Doug Henschen has an excellent interview with Amazon’s Adam Selipsky here. The interview is meaty, running to three sections. (Snag this now. Content can be tricky to find after a day or two.) Mr. Henschen has made an effort to capture the detail that many editors deem irrelevant. This interview is a keeper if you are a collector of Amazon business thought.

Two key points struck me as I was reading Amazon’s VP of product management’s answers to Mr. Henschen’s questions. Let me highlight these two points and then offer several observations.

The first point is this comment by Mr. Selipsky:

I’d also say that part of the whole point of computing in the cloud is that you don’t have to worry about where the resources are. If you decide to do enterprise backup in S3 — which is a great application that we’re seeing more and more of, by the way — do you really need to know exactly where and when those copies are replicated and how our replication works? That’s time you could be spending on something else. What you want to know is that we’re never going to lose the data, that your people are going to be able to access the data whenever they want, and that it’s secure.

As I understand this comment, Mr. Selipsky wants to shift a customer’s concern from the physical location of data to comfort that “we’re never going to lose the data.” I recall that one of my college professors explained the notion of categorical affirmatives and categorical negatives. In my aging Kentucky brain I have boiled down that learning to one precept: Never say never. I think Mr. Selipsky is saying “never”.

The second point is this exchange between Mr. Henschen and Mr. Selipsky. I have marked Mr. Henschen’s question with the label “question” and Mr. Selipsky’s response as “answer”:

Question: Would you say that Internet-based, customer-facing businesses like Amazon are at the center of the target and that enterprises business might be ceded, one day, to cloud vendors that specialize in enterprise services?

Answer: I would absolutely not say that. We had Fortune 500 companies using Amazon Web Services literally from the day we launched S3. It’s true that the majority of our early usage was from smaller companies and startups because they tend to be a little more risk tolerant. We thought it would take several years to get to the enterprise stage in a meaningful way, but we’ve actually been rather surprised at how quickly enterprise adoption and interest have accelerated.

As I understand Mr. Selipsky’s remark, Amazon is prepared to compete with firms yet to enter the cloud computing arena. For example, there’s the loose tie-up between Google and Salesforce.com. There’s the rumor of interest in utility services from Verizon, AT&T, and other telecommunications companies. I’ve heard chatter that Cisco and Intel, as unlikely as this seems to me, are thinking about cloud computing. Most of these companies have the potential to make a monopoly play in cloud computing. Also, I think I see another categorical in Mr. Selipsky’s remark: “absolutely”. I find this an interesting word choice in a market sector that is in its infancy.

Observations

Let me wrap up with several unasked-for remarks:

  1. I find categorical affirmatives and negatives jejune. Maybe the word choice is colloquial and unintentional, but I think it shows the shallowness of Amazon’s positioning of its Web services. It’s tough for me to accept absolutes and categoricals when so much is uncertain in this cloud computing space.
  2. My recollection is that Amazon’s core service suffered service outages on Friday, June 6, 2008, and then again on Monday, June 9, 2008. The information about the problem has been sparse. Regardless of the cause, these recent problems suggest that Amazon’s engineering has some details to which it must attend.
  3. Certain services naturally tend toward monopolies. For example, I don’t buy my electric power in rural Kentucky from the cheapest vendor. I use the only vendor, a non-US outfit with control of the local power company. I think cloud computing may share some DNA with the pre-break up AT&T; that is, it will make economic sense for one company to control the market. Because cloud computing is an engineering problem, I think the winner will be a company that can do great engineering economically. A system failure doesn’t translate in my mind to great engineering.

Agree? Disagree? Let me know. Use the comments section to push back or provide additional insight. This is a Web log, so weigh in.

Report about Amazon technical issue from Network World is here.

Stephen Arnold, June 10, 2008

Exalead: Enterprise Search Heats Up

June 9, 2008

Exalead, founded by former AltaVista.com guru François Bourdoncle, named a US president for the privately-held search and content processing company. Paul Doscher will work from Exalead’s San Francisco office. Mr. Doscher is an experienced executive. He will tackle Exalead’s OEM strategy, assist the online business market launch, and consolidate the firm’s activities in business-to-business markets.

Mr. Doscher joins Exalead from Jaspersoft, which under his management became the worldwide leader in the open source business intelligence market. This expertise, combined with nearly thirty years’ experience in the IT industry at companies including VMware and Oracle, adds muscle to the fast-growing Exalead.

Exalead has been growing at double-digit rates for the last two years. Its platform, exalead one:search, is software that scales from desktop to data center entry points. With it, Exalead enables industrial information access that incorporates structured and unstructured data as well as internal and external content, so that individuals and organizations alike can retrieve and use it effectively.

The firm has a platform that shares many of the characteristics of low-cost scaling and high performance with Google. You can find more information about Exalead here. In my April 2008 study for the Gilbane Group, Exalead was named a “company to watch”.

Stephen Arnold, June 10, 2008

Deep Web Tech’s Abe Lederman Interviewed

June 9, 2008

Abe Lederman, one of the founders of Verity, created Deep Web Technologies to provide “one-stop access to multiple research resources.” By 1999, Deep Web Technologies offered a system that performed “federated search.” Mr. Lederman defines “federated search” as a system that “allows users to search multiple information sources in parallel.” He added in his interview with ArnoldIT.com:

Results are retrieved, aggregated, ranked and deduped. This doesn’t seem too difficult, but trust me it’s much harder than one might think. Deep Web started out building federated search solutions for the Federal government. We run some highly visible public sites such as Science.gov, WorldWideScience.org and Scitopia.org. We have expanded our market in the last few years and sell to corporate libraries as well as academic libraries.
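The four steps Mr. Lederman lists (query sources in parallel, aggregate, dedupe, rank) can be sketched in a few lines. This is a toy illustration, not Deep Web Technologies’ implementation: the two in-memory sources, the scores, and the function names are all hypothetical, and a real system would be sending requests to remote search forms.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "sources"; a real federated system would query
# remote search forms instead. Each entry is (title, relevance score).
SOURCES = {
    "source_a": [("Solar cell efficiency", 0.9), ("Wind turbine design", 0.4)],
    "source_b": [("solar cell efficiency", 0.7), ("Battery storage", 0.8)],
}


def search_source(name, query):
    """Return (title, score, source) hits whose title contains the query."""
    q = query.lower()
    return [(title, score, name)
            for title, score in SOURCES[name]
            if q in title.lower()]


def federated_search(query):
    # 1. Send the same query to every source in parallel.
    with ThreadPoolExecutor() as pool:
        batches = list(pool.map(lambda n: search_source(n, query), SOURCES))

    # 2. Aggregate and dedupe on a normalized title, keeping the best score.
    best = {}
    for batch in batches:
        for hit in batch:
            key = hit[0].lower()
            if key not in best or hit[1] > best[key][1]:
                best[key] = hit

    # 3. Rank the surviving hits by score, highest first.
    return sorted(best.values(), key=lambda h: h[1], reverse=True)


if __name__ == "__main__":
    for title, score, source in federated_search("e"):
        print(f"{score:.1f}  {title}  ({source})")
```

Even in this toy, the hard parts Mr. Lederman alludes to are visible: the two sources score on different scales in real life, and deciding that two records are “the same” takes more than lowercasing a title.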

Mr. Lederman believes that Google’s “forms” technology to index the content of dynamic Web sites is flawed.

Mr. Lederman said:

Deep Web goes out and in real-time sends out search requests to information sources. Each such request is equivalent to a user going to the search form of an information source and filling the form out. Google is attempting to do something different. Using automated tools Google is filling out forms that when executed will retrieve search results which can then be downloaded and indexed by Google. This effort has a number of flaws, including automated tools that fill out forms with search terms and retrieve results will only work on a small subset of forms. Google will not be able to download every document in a database as it is only going to be issuing random or semi-random queries.

In the exclusive interview, Mr. Lederman reveals a new feature. He calls it “smart clustering.” Search results within a cluster are displayed in rank order.

You can read the full text of the interview on the ArnoldIT.com Web site in its Search Wizards Speak series. The interview with Mr. Lederman is the 17th interview with individuals who have had an impact on search and content processing. Search Wizards Speak provides an oral history in transcript form of the origin, functions, and positioning of commercial search and text processing systems.

The interview with Mr. Lederman is here. The index of previous interviews is here.

Stephen Arnold, June 9, 2008

Chicago Tribune Online: Why Old Print Subscribers Will Hate the Online Edition

June 8, 2008

I don’t spend much time writing about user interface or usability. My 86-year-old father, however, forced me to confront the interface for the Chicago Tribune Online. This essay has a search angle, but the majority of my comments apply to the interface for the Chicago Tribune Online. Now if you search Google for “Chicago Tribune Online”, the first hit is the Chicago Tribune’s main Web site. There is no direct link to the electronic edition for subscribers. You can find this service, which requires a user name and password, here. An 86-year-old person doesn’t file email like his 64-year-old son or the 12-year-old who lives in the neighborhood.

My father prints out important email. This makes it tricky for him to type in the url and enter his user name and password (a helpful eight letters and digits, all in upper case, so it’s impossible for him to discern whether the zero is a number or an “oh” for the letter).

Why does this matter?

I set up yesterday (June 6, 2008) an icon that contained sufficient pixie dust to send him to the electronic edition and log him in automatically. This morning he called to tell me that he had nuked his icon. I dutifully explained in an email, which he would print out, how to navigate to the page, enter the user name, enter the eight-character password (remember there are two possibilities for the zero), click the “save user name and password” option, and access the Sunday newspaper.

Essentially these steps are beyond his computing ability, visual acuity, and keyboarding skills.

Does the Chicago Tribune care? My view is that whoever designed the access Web page gave little thought to the needs of my father. Why should these 20-somethings? Their world is one in which twitching icons and subtle interfaces with designer colors are irrelevant.

There’s one other weirdness about the log in page for the electronic edition of the Chicago Tribune. My father has a big flat screen, and I set it for 800 by 600 pixels so he can read the text. The problem with this size is that most Web pages, including the ones for this Beyond Search Web log, are designed for larger displays. I use three displays–two for the Windows machine and one big one for the Mac. Linux machines get cast-off monitors, which we often unplug once the machine is running because no one “uses” the Linux machines perched in front of the boxes.

Not my father, he gets up close and personal. The failure to design for my father is understandable. Life would be easier if people were perpetually 21. Here’s the full text of the help tips in the email the Chicago Tribune sent my father:

Getting started with your Chicago Tribune electronic subscription: 1. To view a story, photo, or advertisement click the item on the full-page image (left side of your screen). It will enlarge on the right side of your screen for easier reading. 2. Use the pull-down lists located in the top center to navigate through which section and page you would like to view. 3. Use “Advanced Search” on the top center area of the window to find a specific article. 4. Use the buttons on the right to email or print each page. Use the buttons on the left to set up email alerts through e-notify and download articles or the entire paper as a PDF. 5. For more help on all the features, just click on the “Help” button found near the top left under the Chicago Tribune logo.

So, here’s what my father sees when he clicks on the electronic edition link on the 800 x 600 display in his browser:

[Image: Chicago Tribune electronic edition at 800 x 600]

I had trouble figuring out what button and what option was described in the “help” with the registration email. Know why? The log in information requires my father to scroll to the left and then down. There is no visible clue about the log in.


The Semantic Chimera

June 8, 2008

GigaOM has a very good essay about semantic search. What I liked was the inclusion of screen shots of results of natural language queries–that is, queries without Boolean operators. Two systems indexing Wikipedia are available in semantic garb: Cognition here and Powerset here. (Note: there is another advanced text processing company called Cognition Technologies whose url is www.cognitiontech.com. Don’t confuse these two firms’ technologies.) GigaOM does a good job of making posts findable, but I recommend navigating to the Web log immediately.

Nitin Karandikar reviews both Cognition’s and Powerset’s approach, so I don’t need to rehash that material. For me the most important statement in the essay is this one:

There are still queries (especially when semantic parsing is not involved) in which Google results are much better than [sic] either Powerset or Cognition.

Let me offer several observations about semantic technology applied to constrained domains of content like the Wikipedia:

  1. Semantic technology is extremely important in text processing. By itself, it is not a silver bullet. A search engine vendor can say, “We use semantic technology”. The payoff, as the GigaOM essay makes clear, may not be immediately evident. Hence, the “Google is better” type statement.
  2. Semantic technology is in many search systems, just not given center stage. Like Bayesian maths, semantic technology is part of the search engine vendors’ toolkits. Semantic technology delivers very real benefits in functions from disambiguation to entity extraction. As this statement implies, there are many different types of semantics in the semantic technology spectrum. Picking the proper chunk of semantic technology for a particular process is complicated stuff, and most search engine vendors don’t provide much information about what they do, where they get the technology, or how the engineers determined which semantic widget to use in the first place. In my experience, the engineers arrive at their job with academic and work experience. Those factors often play a more important part than rigorous testing.
  3. Google has semantic technology in its gun sights. In February 2007, information became available about Google’s programmable search engine, which has semantics in its plumbing. The patent applications state that Google can discern context from various semantic operations. Google–despite its sudden willingness to talk in fora about its universal search and openness–doesn’t say much about semantics and for good reason. It’s plumbing, not a service. Google has pretty good plumbing, and its results are relevant to many users. Google doesn’t dwell on the nitty gritty of its system. It’s a secret ingredient and no user really cares. Users want answers or relevant information, not a lab demo of a single text processing discipline.
  4. Most users don’t want to type more than 2.2 words in a query. Forget typing well formed queries in natural language. Users expect the system to understand what is needed and the situation into which the information fits. Semantic technology, therefore, is an essential component of figuring out meaning and intention. Properly functioning semantic processes produce an answer. The GigaOM essay makes it clear that when the answers are not comprehensive, on point, or what the user wanted, semantic technology is just another buzz word. Semantic technology is incredibly important, just not as an explicit function for the user to access.

I talk about semantic technology, linguistic technologies, and statistical technologies in this Web log and in my new study for the Gilbane Group. The bottom line is that search doesn’t pivot on one approach. Marketers have a tough time explaining how their systems work, and these folks often fall back on simplifications that blur quite different things. Mash ups are good in some contexts, but in understanding how a Powerset integrates a licensed technology from Xerox PARC and how that differs from Cognition’s approach, simplifications are of modest value.

In my experience, a company which starts out as statistics only quickly expands the system to handle semantics and linguistics. The reason–there’s no magic formula that makes search work better. Search systems are dynamic, and the engineers bolt new functions on in the hope of finding something that will convert a demo into a Google killer. That has not happened yet, but it will. When a better Google emerges, describing it as a semantic search system will not tell the entire story. Plumbing that runs compute intensive processes to crunch log data and smart software are important too.

A demo is not a scalable commercial system. By definition a service like Google’s incorporates many systems and methods. Search requires more than one buzz word. You may also find the New York Times’s Web log post by Miguel Helft about Powerset helpful. It is here.

Stephen Arnold, June 8, 2008

Mobile Projection: Truly Stunning

June 7, 2008

Information Week reporter K.C. Jones reported an iSuppli estimate that stunned me. The title of the story is “Wireless Social Networking To Generate $2.5 Trillion By 2020.” You can read it here, but hurry; news has a peculiar way of becoming hard to find a day or two after the story appears on a Web site.

iSuppli–a company in the business of providing applied market intelligence–projected that wireless social networking products, services, applications, components, and advertising will generate more than $2.5 trillion in revenue by 2020. I think that’s 12 zeros.

I’m not sure I know what wireless social networking is but if iSuppli is correct–consultants and research firms are rarely off base–it’s a great opportunity for entrepreneurs who catch the wave. Wow, $2.5 trillion in 12 short years. I thought I had seen some robust estimates from Forrester, Gartner, 451, and ComScore, but the iSuppli projection is a keeper.

Stephen Arnold, June 7, 2008

Search Wizard Starts New Venture

June 6, 2008

Years ago I examined search technology developed by a teenage whiz named Judd Bowman. You can read about his background here. Mr. Bowman and an equally talented Taylor Brockman had devised a way around memory access bottlenecks that hobbled other search companies’ performance. Mr. Bowman founded Pinpoint, which became Motricity. With a keen interest in search, Motricity provided a range of technology to a number of high profile clients, including Motorola.

Mr. Bowman’s new venture is PocketGear, formerly a unit of Motricity. There’s not much information available at this time. Based on my knowledge of Mr. Bowman’s interest in search, the company will offer mobile search and content services. The venture warrants close observation, particularly with regard to mobile on device applications and cloud based search and retrieval.

Note: the Charlotte News Observer article quotes me and cites my for-fee work for investors in Messrs. Bowman and Brockman’s first company, Pinpoint.

Stephen Arnold, June 6, 2008

Google: Tighter Time Controls

June 6, 2008

Valley Wag, one of the Web logs I enjoy immensely, reports that Google’s 20 percent free time for personal projects policy may be changing. You can read the original news story here.

The key point for me was this observation:

What we hear from Googlers is that supervisors are cracking down on use of 20 percent time when employees’ main projects are behind schedule. A sensible management move, but against the spirit of 20 percent time, which was meant to liberate creative employees from meddling middle management.

Google is now a decade old and rocketing forward in many business sectors. The implication is that Google needs more productivity. The flip side is that the idea of having the equivalent of one day each week to work on projects that interest a Googler is now public relations.

My sources tell me that this is not a change in policy, just a reaction to work load. If I learn more, I will let you know. I calculated that the 20 percent rule, if applied to 12,000 engineers with an average salary of $130,000 per year including benefits, added hundreds of millions of dollars in research costs to the company. This cost does not appear as part of Google’s “regular” R&D activity, but the approach has produced some interesting innovations for the company.
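For readers who want to check the arithmetic, here is a back-of-the-envelope version. It uses only the figures in the paragraph above (12,000 engineers, $130,000 per year, 20 percent of time); the variable names are mine, and the result is an estimate, not a number Google has reported.

```python
# Back-of-the-envelope cost of "20 percent time", using the post's figures.
engineers = 12_000          # head count assumed in the post
cost_per_engineer = 130_000  # dollars per year, salary including benefits
percent_free_time = 20       # share of working time spent on personal projects

# Integer arithmetic keeps the result exact.
annual_cost = engineers * cost_per_engineer * percent_free_time // 100

print(f"${annual_cost:,} per year")  # $312,000,000 per year
```

That works out to roughly $312 million a year, which is where “hundreds of millions” comes from.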

Stephen Arnold, June 6, 2008

Microsoft: Role-Based Approach to Enterprise Apps

June 6, 2008

Colin Barker, ZDNet UK, wrote an interesting article “Microsoft Launches Connected, Role-Based CRM.” You will want to read the full story here. The key idea is that Dynamics AX 2009 (one of the different flavors of customer relationship management software Microsoft sells) supports roles. The idea is that a user, once assigned a role, interacts with the system from the point of view of the role. The article quotes Microsoft’s Gary Turner, who makes this point:

This is different from the way in which ERP systems have worked in the past, where everyone has one ‘vanilla’ front end… A chief executive will look at the information differently from someone in marketing or whatever. Your needs and requirements will be different.

The system also supports direct connections to eBay (the troubled online retailer) and PayPal. The system, if I understand Mr. Turner correctly, supports smartphone access. The default Dynamics user-facing interface is a dense, detailed beastie. Presumably, the smartphone interface will be stripped down to fit the smartphone screen real estate. Support for Microsoft’s business intelligence tools is included.

Why’s this important in search?

My research indicates that role-based interfaces may be one of Microsoft’s weapons as it tries to expand the market for its different enterprise systems. Applied to search, each user would “see” an interface and search results tailored to his or her role. This personalization of the system allows Microsoft to shift from a one-size-fits-all interface to a more specialized approach to a complex system.
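To make the idea concrete, here is a minimal sketch of role-tailored search results. It is not Microsoft’s design: the documents, the role names, and the `search` function are all hypothetical, and a production system would also tailor ranking and layout, not just filter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    title: str
    roles: frozenset  # hypothetical: roles for which this document is relevant


# Toy corpus; titles and role tags are invented for illustration.
DOCS = [
    Doc("Q2 revenue forecast", frozenset({"executive", "finance"})),
    Doc("Campaign revenue report", frozenset({"marketing"})),
    Doc("Revenue recognition policy", frozenset({"finance"})),
]


def search(query, role):
    """Return titles matching the query, restricted to the user's role."""
    q = query.lower()
    return [d.title for d in DOCS if q in d.title.lower() and role in d.roles]


if __name__ == "__main__":
    # The same query produces a different result list for each role.
    for role in ("executive", "finance", "marketing"):
        print(role, "->", search("revenue", role))
```

The same query, “revenue”, returns three different lists depending on who is asking, which is the shift away from one-size-fits-all described above.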

Despite announcements about the integration of Fast Search & Transfer with Microsoft’s own search technology, there is little hard information about role-based interfaces available. In my opinion, competitors can offer similar functionality if the feature gets traction with customers.

Oh, the other products in the Dynamics line up are Dynamics NAV, Dynamics GP and Dynamics SL. I have difficulty keeping each straight in my mind. Microsoft’s preference for multiple versions of products like five flavors of Vista, SharePoint’s ESS and MOSS, and four ERP systems sends me to Google’s Microsoft search here to keep track of the differences. I rely on Google to locate Microsoft information. Response seems quicker and the index appears to be refreshed more frequently.

Stephen Arnold, June 6, 2008

SolveIT: Fancy Math

June 5, 2008

Several years ago I found myself in a meeting. I was paid to attend a session in North Carolina; otherwise, I wouldn’t go to Charlotte. The city is too sophisticated for this Kentuckian.

In the meeting, a soft-spoken mathematician, his son, a couple of cousins, and maybe an uncle explained sparse sets, assigning probabilities to boundaries, and ant algorithms. As I struggled to dredge definitions about these concepts from my admittedly poor memory, the soft-spoken mathematician asked me a word problem. A waiter had 12 customers and ended up with an extra dollar. Why? I just sat there and looked my normal stupid self.

Later, he explained that his inspiration was a mathematician named Stanisław Leśniewski. Okay, early 20th century wizard. That was the end of my knowledge. Puzzles are the key to learning math, he told me. In his spare time, this fellow has set up a Web site to make this concept more widely known. You can see it here.

I had no clue who these fellows were, but I was getting paid to listen so I listened.

A Super Guru: Who Says He’s Just a Regular Guy

The super guru is a fellow named Zbigniew Michalewicz, a highly regarded mathematician everywhere except in Harrod’s Creek. The relatives were also mathematicians. The crowd could finish one another’s sentences and equations. Math, it turns out, is something that runs in the Michalewicz family and has for decades.

Dr. Michalewicz is an expert in genetic algorithms and data structures. When added together, the mathematical recipe yields evolution programs. You can read more about this approach to some tough data problems in Genetic Algorithms + Data Structures = Evolution Programs, published by Springer-Verlag, ISBN: 3-540-60676-9. No, your local book store won’t stock it. Amazon does.

The group sold its US enterprise and Dr. Michalewicz and a family member or two moved to Australia.

After losing track of these fellows, I learned that Dr. Michalewicz, his son, and a handful of mathematical gurus set up shop as SolveIT Software. Click here to navigate to the company’s Web site.

The new company uses new math to solve old problems. The company is in the business of delivering solutions that deliver “adaptive business intelligence”. The company’s range of technology is remarkable and it may be meaningless to you unless you took a couple of advanced math classes; for example:

  • Agent-based systems
  • Ant systems (my favorite)
  • Evolutionary strategies
  • Evolutionary programming
  • Fuzzy systems
  • Genetic algorithms
  • Neural networks
  • Rough sets (great stuff!)
  • Swarm intelligence
  • Simulated annealing (does with math to data what oil quenching does to low-grade steel)
  • Tabu search (I have no clue what this numerical method yields).

You can figure out most of these notions by dipping into Peter Norvig’s Artificial Intelligence or E. J. Borowski’s and J. M. Borwein’s Web-Linked Dictionary of Mathematics. (Note: there is a subtle difference between the Norvig approach and the Michalewicz method. Google uses humans. Humans play an optional role in the Michalewicz recipes. No big deal, but you can explore the differences yourself by reading each guru’s text book.)

A Case Example

Equations are not likely to raise my Google ranking. Let me describe an outcome of Dr. Michalewicz’s skills.

Here’s the set up. You are Ford, Honda, or Toyota. Each week you get a couple of thousand lease cars back. You want to sell the cars quickly. You want to minimize how much you have to spend to truck these white elephants to a location where a particular model will sell. Pink convertibles don’t fly in Nome, Alaska, but are hot items in Scottsdale, Arizona. Your resale team would rather go to a bowling convention than work Excel models.

You want to maximize return, minimize expenses, and get the decisions out of your resale team’s “instinct” and into something fungible like a SolveIT solution.

SolveIT’s analysts beaver their way through the data, the work flow, and the exogenous factors that you and your resale team did not consider. The company builds, from its mathematical Lego blocks, a computerized system that prints out a map and report telling your sales team where to ship which car.

You use the SolveIT system for a couple of months, and you notice that your expenses go down and your net goes up. SolveIT removed the guess work and let the “fancy math” do the heavy lifting. When I spoke with the company several years ago, one beta client was generating cash positives in six figures within six weeks.
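Stripped to its bones, the problem is: send each car to the market where sale price minus trucking cost is highest, subject to constraints. The sketch below is my own toy, not SolveIT’s method. It assumes three cars, three markets, one car per market, and invented prices and shipping costs, and it simply tries every assignment. With thousands of cars a week, brute force stops being an option, which is where the fancy math earns its keep.

```python
from itertools import permutations

CARS = ["pink convertible", "pickup truck", "sedan"]
MARKETS = ["Scottsdale", "Nome", "Chicago"]

# Hypothetical expected sale price (dollars) of each car in each market.
PRICE = {
    "pink convertible": {"Scottsdale": 18_000, "Nome": 9_000, "Chicago": 14_000},
    "pickup truck": {"Scottsdale": 15_000, "Nome": 17_000, "Chicago": 15_500},
    "sedan": {"Scottsdale": 12_000, "Nome": 11_000, "Chicago": 12_500},
}

# Hypothetical cost (dollars) to truck one car to each market.
SHIPPING = {"Scottsdale": 600, "Nome": 2_500, "Chicago": 400}


def net(car, market):
    """Expected sale price minus the cost of trucking the car there."""
    return PRICE[car][market] - SHIPPING[market]


def best_assignment():
    """Try every one-car-per-market assignment; return the most profitable."""
    best_plan, best_total = None, float("-inf")
    for markets in permutations(MARKETS):
        total = sum(net(car, market) for car, market in zip(CARS, markets))
        if total > best_total:
            best_plan, best_total = dict(zip(CARS, markets)), total
    return best_plan, best_total


if __name__ == "__main__":
    plan, total = best_assignment()
    for car, market in plan.items():
        print(f"Send the {car} to {market}")
    print(f"Expected net: ${total:,}")
```

With these made-up numbers the convertible goes to Scottsdale, the truck to Nome, and the sedan to Chicago, for an expected net of $44,000.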

Like most sophisticated companies run by serious math geeks, there’s not much information available on the company’s Web site. I did dig through my files, and I found an example of the company’s outputs. Keep in mind that this diagram is probably out of date, but it will give you a flavor of what the SolveIT operation does.

The system “shows” the resale team where certain cars will sell. Then the system prints out a report that says, “Send the pink convertible to Chicago and the truck to Paducah.” The math does the heavy lifting. The resale team looks at simple diagrams. The math remains safely hidden away.

[Image: SolveIT optimizer]

Observations

SolveIT is one of a handful of companies pushing the envelope in analytics. If you want to tap into some serious math, contact this company. I have one tip. Don’t ask, “How does this work?” The explanation requires a solid foundation in traditional mathematics and post-doctoral work in set theory. How complicated is the math? I found in my files one example which I had to scan and convert to an image. I kept it as a reminder of how little I know about the next big things in mathematics; for example, in my notes I had this pair of statements:

If these statements speak to you, then you can dig more deeply into the SolveIT systems and methods.

Based on my personal experience with Dr. Michalewicz, he’s a capable mathematical thinker. For more about his company’s approach to problem solving, you will find useful How to Solve It: Modern Heuristics, also by Springer Verlag. You can get a copy here.

Stephen Arnold, June 6, 2008
