News Flash: Data Mining May Not Be an Information Cure All

May 7, 2010

Technology can work wonders. Technology is supposed to make it easier for downsized organizations to perform with agility and alacrity. I am “into” technology, but I understand that the minimum-wage workers at airline counters and financial institutions operate within systems assumed to work as intended. These systems, in my opinion, work neither at the level of answering a simple question like “Is the flight on time?” nor at the more sophisticated level of “Where did this wire transfer come from?”

Why is it a surprise that technology cannot handle less familiar tasks without glitches or outright breakdowns? I was surprised to read “NY Plot Highlights Limitations of Data Mining.” There were three reasons:

  1. The writer for Network World expresses gentle surprise that predictive systems don’t work too well when applied to the actions of one person. Network World documents lots of system glitches, and the gentle surprise is not warranted.
  2. The story plants the seed that we have no choice but to rely on fancy content processing systems. Are there other options? None if you rely on this article’s analysis. In my experience there are indeed options, but these are conveniently nudged to the margins.
  3. The dancing around with data mining is specious. Text processing is one of those Rube Goldberg machines, only built with software. Get the assumptions, the inputs, or the algorithms wrong to even a slight degree and guess what? The outputs are likely to be wrong.

Here’s the passage I found interesting:

That fact is likely to provide more fodder for those who question the effectiveness of using data mining approaches to uncover and forecast terror plots. Since the terror attacks of Sept. 11, the federal government has spent tens of millions of dollars on data mining programs and behavioral surveillance technologies that are being used by several agencies to identify potential terrorists. The tools typically work by searching through mountains of data in large databases for unusual patterns of activity, which are then used to predict future behavior. The data is often culled from dozens of sources including commercial and government databases and meshed together to see what kind of patterns emerge.

In my experience, humans and text processing must work in an integrated way. Depend only on technology, and the likelihood of getting actionable, immediately useful information goes down. Even Google asks humans to improve on its machine translation outputs. Smart software may not be so smart.

Stephen E Arnold, May 7, 2010

Unsponsored post.
