Lawyers and Metadata

January 8, 2009

Now the indexing world gets something to gnaw on. Automated indexing systems beat out humans when measured by cost per item indexed, speed, and consistency. Automated indexing systems can be as good as a human for some types of content. But humans are variably bad at indexing. Software hits a sweet spot and doesn’t get significantly better or worse unless the content throws in a wrench. Now the issue of not providing metadata arises. We can automate the creation of metadata, but it is early days in the world of automatic metadata scrubbing. I quacked happily when I thought, “I wonder who knows where their metadata are?”

Jim Calloway’s “Metadata–What Is It and What Are My Ethical Duties” here breathes new life into human indexing. What I find interesting is that lawyers charge by the hour. Human indexers are paid on piecework schedules or given a flat yearly fee and maybe some benefit crumbs. The economics of human indexing is based on keeping the per record cost as low as possible whilst one maintains the “quality” of the indexing. “Quality” in the commercial database world is often defined as a metric such as “four to six index terms per bibliographic record” or “16 records per hour with required fields completed”. You may have a more academic definition, but my examples come from the soon-to-be-marginalized world of human commercial database production.

The article defines metadata in legal eagle terms, of course. But the story gets interesting when Mr. Calloway cites a situation in which metadata became a legal issue. Where there is a legal issue, there is the risk of a fine, jail, or losing pride of place among the brood of legal eagles. Forget the compensation. Ego may be a bigger force in the legal eagle world. Mr. Calloway nicely hooks metadata with risk.

For me, the most important comment in this useful write up was:

In this writer’s view, the key is to avoid sending out documents with metadata that could disclose confidential information. Comparing metadata to a wrongly sent fax or e-mail is questionable and the idea that lawyers will be prohibited from examining metadata while parties, law enforcement officers and private detectives will be free to do so seems artificial at best. The Colorado rule that one must disclose receiving confidential information via metadata before acting on it seems to strike a rational balance. The best rule is for law firms to develop best practices internally to keep metadata from “escaping” in the first place.

I quite like “keep metadata from escaping in the first place”. To close, let me ask several questions:

  • Do you know why metadata are in the documents available for indexing on your Web site?
  • Do you know how value added indexing in a dataspace can expand the access to a document in an often unrelated context?
  • Do you know where metadata are in a document, in a Web page or other container housing the document, or in the dataspace created for the information objects?

If not, you will want to dig up this information yourself. Asking your attorney will result in a very large legal bill. One final question: Do you think Mr. Madoff knows about his metadata?

Stephen Arnold, January 8, 2009

