Taxonomy: Silver Bullet or Shallow Puddle

September 27, 2008

Taxonomy is hot. One of my few readers sent me a link to Fumsi, a Web log that contains a two part discussion of taxonomy. I urge you to read this post by James Kelway, whom I don’t know. You can find the article here. The write up is far better than most of the Webby discussions of taxonomies. After a quick pass at nodes and navigation, he jumps into information architecture requiring fewer than 125 words. The often unreliable Wikipedia discussion of taxonomy here chews up more than 6,000. Brevity is the soul of wit, and whoever contributed to the Wikipedia article must be SWD; that is, severely wit deprived.

Take a look at the Google Trends’ chart I generated at 8 pm on Friday, September 26, 2008. Not only is taxonomy generating more Google traffic than the now mud crawler enterprise search. Taxonomy is not as popular as “CMS”, the shorthand for content management system. But “taxonomy” is a specialist concept that seems to be moving into the mainstream. At the just concluded Information Today trifecta conference featuring search, knowledge management (whatever that is), and streaming media, taxonomy was a hot topic. At the Wednesday roof top cocktail, where I worked on my tan in the 90 degree ambient air temperature, I was asked four times about taxonomies. I know I worked on commercial taxonomies and controlled vocabularies for database, but I learned from those years of experience that taxonomies are really tough, demanding, time consuming intellectual undertakings. I thought I was pretty good at making logical, coherent lists. Then I met the late Betty Eddison and the very active Marje Hlava. These two pros taught me a thing or 50.

google trends taxnonomy

In the dumper is the red line which maps “enterprise search” popularity. The blue line is the up and coming taxonomy popularity. The top line is the really popular, yet hugely disappointing, content management term traffic.

I heard people who have been responsible for failed search systems and non functional content management systems asking, “Will a taxonomy improve our content processing?” The answer is, “Sure, if you get an appropriate taxonomy?” I then excuse myself and head to the bar man for a Diet 7 Up. The kicker, of course, is “appropriate”. Figuring out what’s appropriate and then creating a taxonomy that users will actually exploit directly or indirectly is tough work. But today, you can learn how to do a taxonomy in a 40 minute presentation or if you are really studious a full eight hour seminar.

I remember talking with Betty Eddison and Marje Hlava about their learning how to craft appropriate taxonomies. Marje just laughed and turned to her business partner who also burst out laughing. Betty smiled and in her deep, pleasant voice said, “A life time, kiddo.” She called me “kiddo”, and I don’t think anyone else ever did. Marje Hlava chimed in and added, “Well, Jay [her business partner] and I have been at it for two life times.” I figured out pretty quickly that building “appropriate” taxonomies required more than persistence and blissfully ignorant confidence.

Why are taxonomies perceived as the silver bullet that will kill the vampire search or CMS system. A vampire system is one that will suck those working on it into endless nights and weekends and then gobble available budget dollars. In my opinion, here are the top five reasons:

  1. The notion of a taxonomy as a quick fix is easy to understand. Most people think of a taxonomy as the equivalent of the Dewey Decimal system or the Library of Congress subject headings and think, “How tough can this taxonomy stuff be?” After a couple of runs at the problem, the notion of a quick fix withers and dies.
  2. Vendors of lousy enterpriser search systems wriggle off the hook by asserting, “You just need a taxonomy and then our indexing system will be able to generate an assisted navigation interface.” This is the search equivalent of “The check is in the mail.”
  3. CMS vendors, mired in sluggish performance, lost information, and users who can’t find their writings, can suggest, “A taxonomy and classification module makes it much easier to pinpoint the marketing collateral. If you search for a common term, our system displays those documents with that common term. Yes, a taxonomy will do the trick.” This is the same as “Let’s do lunch” repeated every week to a person whom you know but with whom you don’t want to talk for more than 30 seconds on a street corner in mid town Manhattan.
  4. A shill at a user group meeting–now called a “summit”–praises the usefulness of the taxonomy in making it easier for users to find information. Vendors work hard to get a system that works and win over the project manager. Put on center stage and pampered by the vendor’s PR crafts people, the star customer presents a Kodachrome version of the value of taxonomies. Those in the audience often swallow the tale the way my dog Tess goes after a hot dog that falls from the grill. There’s not much thinking in Tess’s actions either.
  5. Vendors of “automated” taxonomy systems demonstrate how their software chops a tough problem down to size in a matter of hours or days. Stuff in some sample content and the smart algorithms do the work of Betty Eddison and Marje Hlava in a nonce. Not on your life, kiddo. The automated systems really are 100 percent automatic. The training corpus is tough to build. The tuning is a manual task. The smart software needs dummies like me to fiddle. Even more startling to licensees of automatic taxonomy systems is that you may have to buy a third party tool from Access Innovations, Marje Hlava’s company, to get the job done. That old phrase “If ignorance is bliss, hello, happy” comes to mind when I hear vendors pitch the “automated taxonomy” tale.

I assume that some readers may violently disagree with my view of 21st century taxonomy work. That’s okay. Use the comments section to teach this 65 year old dog some new tricks. I promise I will try to learn from those who bring hard data. If  you make assertions, you won’t get too far with me.

Stephen Arnold, September 27, 2008

Comments

5 Responses to “Taxonomy: Silver Bullet or Shallow Puddle”

  1. George Everitt on September 27th, 2008 1:27 am

    I’ve been wading into the taxonomy puddle lately. Having dealt with “automatic” taxonomy generation for 10 years from Verity and Autonomy, i can tell you that there is no such thing. Not unless you think that categories like “automobile+jackrabbit” are valuable.
    I spent two months of my life that I’ll never get back in Mississauga, Ontario building taxonomy rules to discern the difference between a “clinical trial” and a “pre-clinical trial”. Neither man nor machine is suited to that madness.
    That being said, a nice rule-based taxonomy with a few hundred rules is not too hard for a mortal librarian to manage, as long as she doesn’t mind overflowing some nodes.

  2. David Eddy on September 27th, 2008 4:21 pm

    Taxonomies are so yesterday… hasn’t word reached you that ONTOLOGIES are THE answer?

  3. CJ on February 7th, 2009 4:44 am

    David Eddy, I second that.

  4. Fran on February 21st, 2009 12:36 pm

    >>ONTOLOGIES are THE answer? Maybe, but what is an ontology built on? Taxonomies.

    If you can’t sort yourself decent taxonomies, forget about creating a useful ontology.

  5. Sheri’s Links for April 1st | Sheri Larsen’s Flying Cloud on April 1st, 2009 2:31 pm

    […] Taxonomy: Silver Bullet or Shallow Puddle – From Beyond Search. […]