IBM Generates Text Mining Work Flow Diagram

January 4, 2016

I read “Deriving Insight Text Mining and Machine Learning.” This is an article with a specific IBM Web address. The diagram is interesting because it does not explain which steps are automated, which require humans, and which are one of those expensive man-machine processes. When I read about any text-related function available from IBM, I think about Watson. You know, IBM’s smart software.

Here’s the diagram:

[Image: IBM text mining work flow diagram]

If you find this hard to read, you are not in step with modern design elements. Millennials, I presume, love these faded colors.

Here’s the passage I noted about the important step of “attribute selection.” I interpret attribute selection to mean indexing, entity extraction, and related operations. Because neither human subject matter specialists nor smart software performs this function particularly well, I highlighted the passage in red ink in recognition of IBM’s 14 consecutive quarters of financial underperformance:

Machine learning is closely related to and often overlaps with computational statistics—a discipline that also specializes in prediction-making. It has strong ties to mathematical optimization, which delivers methods, theory and application domains to the field. It is employed in a range of computing tasks where designing and programming explicit algorithms is infeasible. Example applications include spam filtering, optical character recognition (OCR), search engines and computer vision. Text mining takes advantage of machine learning specifically in determining features, reducing dimensionality and removing irrelevant attributes. For example, text mining uses machine learning on sentiment analysis, which is widely applied to reviews and social media for a variety of applications ranging from marketing to customer service. It aims to determine the attitude of a speaker or a writer with respect to some topic or the overall contextual polarity of a document. The attitude may be his or her judgment or evaluation, affective state or the intended emotional communication. Machine learning algorithms in text mining include decision tree learning, association rule learning, artificial neural learning, inductive logic programming, support vector machines, Bayesian networks, genetic algorithms and sparse dictionary learning.
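
The passage names “determining features” and “removing irrelevant attributes” without showing either. Here is a minimal Python sketch of what that step might look like for sentiment analysis. It is not IBM’s method: the toy reviews, the function names, and the scoring rule (the difference in document frequency between the two classes) are all assumptions made for illustration, and the classifier is a plain naive Bayes model chosen for brevity.

```python
import math
from collections import Counter

# Hypothetical toy reviews, invented for illustration only.
TRAIN = [
    ("great product and great service", "pos"),
    ("i love this phone", "pos"),
    ("excellent support and fast delivery", "pos"),
    ("terrible product and awful service", "neg"),
    ("i hate this phone", "neg"),
    ("awful support and slow delivery", "neg"),
]


def tokenize(text):
    """Lowercase and split on whitespace (no stemming, no stop list)."""
    return text.lower().split()


def select_attributes(docs, k):
    """Keep the k words whose document frequency differs most between classes.

    Words that appear equally often in positive and negative reviews
    ("and", "phone") score zero and are the first to be dropped.
    """
    df = {"pos": Counter(), "neg": Counter()}
    for text, label in docs:
        df[label].update(set(tokenize(text)))
    vocab = set(df["pos"]) | set(df["neg"])

    def score(word):
        return abs(df["pos"][word] - df["neg"][word])

    # Sort by descending score, then alphabetically so ties are deterministic.
    ranked = sorted(vocab, key=lambda w: (-score(w), w))
    return set(ranked[:k])


def train_naive_bayes(docs, attributes):
    """Count selected words per class and documents per class."""
    counts = {"pos": Counter(), "neg": Counter()}
    n_docs = Counter()
    for text, label in docs:
        n_docs[label] += 1
        counts[label].update(t for t in tokenize(text) if t in attributes)
    return counts, n_docs


def classify(text, model, attributes):
    """Return the class with the highest log-probability (Laplace smoothing)."""
    counts, n_docs = model
    total_docs = sum(n_docs.values())
    best_label, best_score = None, -math.inf
    for label in sorted(counts):
        total = sum(counts[label].values())
        score = math.log(n_docs[label] / total_docs)
        for token in tokenize(text):
            if token in attributes:
                score += math.log(
                    (counts[label][token] + 1) / (total + len(attributes))
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


ATTRIBUTES = select_attributes(TRAIN, k=8)
MODEL = train_naive_bayes(TRAIN, ATTRIBUTES)

if __name__ == "__main__":
    print(sorted(ATTRIBUTES))
    print(classify("awful phone and slow service", MODEL, ATTRIBUTES))
```

On this toy data the selection step discards words such as “and” and “phone” that appear equally in both classes, which is the “reducing dimensionality” the passage mentions. Whether a rule this crude survives contact with real reviews is another matter; production systems typically use chi-square, mutual information, or learned embeddings instead.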

Interesting, but how does this IBM stuff actually work? Who uses it? What’s the payoff from these use cases?

More questions than answers to explain the hard-to-read diagram, which looks quite a bit like a 1998 Autonomy graphic. I recall being able to read the Autonomy image, however.

Stephen E Arnold, December 30, 2015
