Amazon’s Implicit Metadata
February 18, 2009
Amazon is interesting to me because the company does what Google wants to do but more quickly. The other facet of Amazon that is somewhat mysterious is how the company can roll out cloud services with a smaller research and development budget than Google’s. I have not thought much about the A9 search engine since Udi Manber left to go to Google. The system is not too good from my point of view. It returns word matches but it does not offer the Endeca-style guided navigation that some eCommerce sites find useful for stimulating sales.
Intranet Insights disclosed here that Amazon uses “implicit metadata” to index Amazon content. I can’t repeat the full list assembled by Intranet Insights, and I urge you to visit that posting and read the article. I can highlight three examples of Amazon’s “implicit metadata” and offer a couple of observations.
Implicit metadata means automatic indexing. Authors or subject matter experts can manually assign index terms or interact with a semi-automated system such as that available from Access Innovations in Albuquerque, New Mexico. But humans get tired or fall into a habit of using a handful of common terms instead of consulting a controlled term list. Software does not get tired and can hit 90 percent accuracy once properly configured and resourced. Out of the box, automated systems hit 70 to 75 percent accuracy. I am not going to describe the methods for establishing these scores in this article.
Amazon uses, according to Intranet Insights:
- Links to and links from, which is what I call the Kleinberg approach, made famous by Google’s PageRank method
- Author’s context; that is, the “page” or “process” the author was in when the document or information object was created. Think of this as a variation of the landing page for an inbound link or an exit page when a visitor leaves a Web site or a process
- Automated indexing; that is, words and phrases.
The idea is that Amazon gathers these data and uses them as metadata. Intranet Insights hints that Amazon uses other information as well; for example, comments in reviews and traffic analysis.
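To make the three signals above concrete, here is a minimal sketch of how a system might assemble them into one metadata record. This is my illustration, not Amazon's method: the names `ImplicitMetadata`, `extract_terms`, and `build_metadata`, the tiny stop-word list, and the frequency-based term picker are all assumptions made for the example.

```python
import re
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Set

# Hypothetical stop-word list, kept tiny for illustration only.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "on"}


@dataclass
class ImplicitMetadata:
    links_to: List[str]      # documents this one points at
    links_from: List[str]    # documents that point at this one
    author_context: str      # page or process the author was in
    index_terms: List[str]   # automatically extracted words


def extract_terms(text: str, top_n: int = 5) -> List[str]:
    """Pick the most frequent non-stop words as index terms."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return [term for term, _ in counts.most_common(top_n)]


def build_metadata(
    doc_id: str,
    text: str,
    link_graph: Dict[str, Set[str]],
    author_context: str,
    top_n: int = 5,
) -> ImplicitMetadata:
    """Gather the three signals without any manual tagging."""
    links_to = sorted(link_graph.get(doc_id, set()))
    links_from = sorted(
        src for src, targets in link_graph.items() if doc_id in targets
    )
    return ImplicitMetadata(
        links_to=links_to,
        links_from=links_from,
        author_context=author_context,
        index_terms=extract_terms(text, top_n),
    )


# Made-up link graph: keys are documents, values are outbound links.
graph = {"a": {"b", "c"}, "b": {"a"}, "c": set()}
meta = build_metadata(
    "a", "Search engines index search terms", graph, "review-form", top_n=2
)
```

A production system would use phrase extraction and a controlled term list rather than raw word counts, but the shape of the record is the point: none of these fields requires a human indexer.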
The Amazon system puzzles me when I run certain queries. Let me give some examples of problems I encounter:
- How can one search lists of books created by users? These exist, but for me the unfamiliar list is often difficult to locate and I cannot figure out how to find a particular book title on lists in that Amazon function. Why would I want to do this? I have a title but I want to see other books on lists on which the title appears. If you know how to run this query, shoot me the search string so I can test it.
- How can I filter a results list to separate books that are not yet published from books that are available? This situation arises when looking for Kindle titles using the drop downs and search function for titles in the Kindle collection. The function is available because the “recommendations” segment forthcoming titles from available titles, but the feature eludes me in the Kindle subsite.
- How can I run a query and see only the reviews by specific reviewers? I know that a prolific reviewer is Robert Steele. So, I want to see the books he has reviewed and I want to see the names of other reviewers who have reviewed a specific title that Mr. Steele has reviewed.
Amazon’s search system, like the one Google provides for Apple.com, is a frustrating experience for me. Amazon has lost sight of some of the basic principles of search; namely, if you have metadata tags, a user should be able to use these to locate the information in the public index.
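The queries I describe above are simple once the metadata are exposed as searchable fields. The sketch below is hypothetical throughout: the `Record` fields, the `search` and `co_reviewers` functions, and the sample titles and reviewer names are invented for illustration and say nothing about how Amazon stores its data.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Record:
    title: str
    reviewers: Tuple[str, ...]  # names attached to the title's reviews
    published: bool             # False means forthcoming


def search(
    records: Iterable[Record],
    reviewer: Optional[str] = None,
    published: Optional[bool] = None,
) -> List[Record]:
    """Filter on metadata fields; a field left as None is ignored."""
    results = []
    for record in records:
        if reviewer is not None and reviewer not in record.reviewers:
            continue
        if published is not None and record.published != published:
            continue
        results.append(record)
    return results


def co_reviewers(records: Iterable[Record], reviewer: str) -> List[str]:
    """Other reviewers of any title that the given reviewer has reviewed."""
    names = set()
    for record in search(records, reviewer=reviewer):
        names.update(record.reviewers)
    names.discard(reviewer)
    return sorted(names)


# Invented catalog entries, not real Amazon data.
catalog = [
    Record("Title One", ("reviewer_a", "reviewer_b"), True),
    Record("Title Two", ("reviewer_a",), False),
    Record("Title Three", ("reviewer_c",), True),
]

available_by_a = search(catalog, reviewer="reviewer_a", published=True)
others = co_reviewers(catalog, "reviewer_a")
```

The design choice is that every metadata tag becomes an optional filter. If the system already holds the tag, exposing it to the user costs little more than a parameter.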
This is not a question of implicit or explicit metadata. Amazon, like Apple, is indifferent to user needs. The focus is on delivering a minimally acceptable search service to satisfy the needs of the average user for the express purpose of moving the sale along. The balance, I believe, is between system burden and generating revenue. Amazon search deserves more attention, in my opinion.
Stephen Arnold, February 18, 2009