IBM Generates Text Mining Work Flow Diagram
January 4, 2016
I read “Deriving Insight Text Mining and Machine Learning.” This is an article with a specific IBM Web address. The diagram is interesting because it does not explain which steps are automated, which require humans, and which are one of those expensive man-machine processes. When I read about any text related function available from IBM, I think about Watson. You know, IBM’s smart software.
Here’s the diagram:
If you find this hard to read, you are not in step with modern design elements. Millennials, I presume, love these faded colors.
Here’s the passage I noted about the important step of “attribute selection.” I interpret attribute selection to mean indexing, entity extraction, and related operations. Because neither human subject matter specialists nor smart software performs this function particularly well, I highlighted the passage in red ink in recognition of IBM’s 14 consecutive quarters of financial underperformance:
Machine learning is closely related to and often overlaps with computational statistics—a discipline that also specializes in prediction-making. It has strong ties to mathematical optimization, which delivers methods, theory and application domains to the field. It is employed in a range of computing tasks where designing and programming explicit algorithms is infeasible. Example applications include spam filtering, optical character recognition (OCR), search engines and computer vision. Text mining takes advantage of machine learning specifically in determining features, reducing dimensionality and removing irrelevant attributes. For example, text mining uses machine learning on sentiment analysis, which is widely applied to reviews and social media for a variety of applications ranging from marketing to customer service. It aims to determine the attitude of a speaker or a writer with respect to some topic or the overall contextual polarity of a document. The attitude may be his or her judgment or evaluation, affective state or the intended emotional communication. Machine learning algorithms in text mining include decision tree learning, association rule learning, artificial neural learning, inductive logic programming, support vector machines, Bayesian networks, genetic algorithms and sparse dictionary learning.
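The “removing irrelevant attributes” idea in the passage can be sketched in a few lines of Python. This is a toy illustration, not IBM’s method: the function names, the scoring rule (the gap in document frequency between two labels), and the sample reviews are all assumptions made up for the example.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase a string and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def select_attributes(docs, labels, top_k=5):
    """Rank terms by how unevenly they are spread across two labels.

    docs   -- list of strings
    labels -- list of "pos" / "neg" strings, same length as docs
    Returns the top_k terms with the largest gap in document frequency.
    """
    pos_df, neg_df = Counter(), Counter()
    for text, label in zip(docs, labels):
        terms = set(tokenize(text))  # count each term once per document
        (pos_df if label == "pos" else neg_df).update(terms)

    n_pos = labels.count("pos") or 1
    n_neg = labels.count("neg") or 1
    vocab = set(pos_df) | set(neg_df)

    # Score = absolute difference in the share of documents containing the term.
    scores = {t: abs(pos_df[t] / n_pos - neg_df[t] / n_neg) for t in vocab}
    # Sort by score (descending), then alphabetically for a stable result.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:top_k]


# Made-up sample reviews, for illustration only.
docs = [
    "the service was great and the staff was friendly",
    "great product and the price was fair",
    "the service was awful and the staff was rude",
    "awful product and the price was unfair",
]
labels = ["pos", "pos", "neg", "neg"]
selected = select_attributes(docs, labels, top_k=2)
print(selected)
```

Terms such as “the” and “was” appear equally often under both labels, so they score zero and drop out. A production system would more likely use a statistical test or a learned model, and a human would still need to review what survives.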
Interesting, but how does this IBM stuff actually work? Who uses it? What’s the payoff from these use cases?
More questions than answers to explain the hard to read diagram, which looks quite a bit like a 1998 Autonomy graphic. I recall being able to read the Autonomy image, however.
Stephen E Arnold, December 30, 2015
Weekly Watson: In the Real World
January 2, 2016
I want to start off the New Year with a look at Watson in the real world. My real world is circumscribed by abandoned coal mines and hollows in rural Kentucky. I am pretty sure this real world is not the real world assumed in “IBM Watson: AI for the Real World.” IBM has tapped Bob Dylan, a TV game show, and odd duck quasi chemical symbols to communicate the importance of search and content processing.
The write up takes a different approach. In fact, the article begins with an interesting comment:
Computers are stupid.
There you go. A snazzy one liner.
The reminder that a man-made device is not quite the same as one’s faithful boxer dog or next door neighbor’s teen is startling.
The article summarizes an interview with a Watson wizard, Steven Abrams, director of technology for the Watson Ecosystem. This is one of those PR inspired outputs which I quite enjoy.
The write up quotes Abrams as saying:
“You debug Watson’s system by asking, ‘Did we give it the right data?’” Abrams said. “Is the data and experience complete enough?”
Okay, but isn’t this Dr. Mike Lynch’s approach? Lynch, as you may recall, was the Cambridge University wizard who was among the first to commercialize “learning” systems in the 1990s.
According to the write up:
Developers will have data sets they can “feed” Watson through one of over 30 APIs. Some of them are based on XML or JSON. Developers familiar with those formats will know how to interact with Watson, he [Abrams] explained.
As those who have used the 25-year-old Autonomy IDOL system know, preparing the training data takes a bit of effort. Then, as current content is fed into the Autonomy IDOL system, the humans have to keep an eye on the indexing. Ignore the system too long, and the indexing “drifts”; that is, the learned content is not in tune with the current content processed by the system. Sure, algorithms attempt to keep the calibrations precise, but there is that annoying and inevitable “drift.”
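One rough way to picture “drift” is to compare the vocabulary of the training set with the vocabulary of the content now flowing in. The sketch below is a toy illustration only. It is not how Autonomy IDOL or Watson measures anything; the function names, the cosine similarity measure, the 0.5 threshold, and the sample strings are all assumptions chosen for the example.

```python
from collections import Counter
import math
import re


def term_counts(docs):
    """Count alphabetic tokens across a list of strings."""
    counts = Counter()
    for text in docs:
        counts.update(re.findall(r"[a-z]+", text.lower()))
    return counts


def cosine_similarity(a, b):
    """Cosine similarity between two term-count mappings (0.0 to 1.0)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def has_drifted(training_docs, current_docs, threshold=0.5):
    """Flag drift when current content no longer resembles the training set."""
    sim = cosine_similarity(term_counts(training_docs), term_counts(current_docs))
    return sim < threshold


# Made-up sample content, for illustration only.
training = ["mainframe batch processing", "mainframe transaction processing"]
similar = ["mainframe transaction batch"]
different = ["cloud sensors and connected devices"]

print(has_drifted(training, similar))
print(has_drifted(training, different))
```

A check like this only tells a human that retraining may be due. Someone still has to assemble fresh training content, which is where the effort and expense come in.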
IBM’s system, which strikes me as a modification of the Autonomy IDOL approach with a touch of Palantir analytics stirred in, is likely to be one expensive puppy to groom for the dog show ring.
The article profiles the efforts of a couple of IBM “partners” to make Watson useful for the “real” world. But the snip I circled in IBM red-ink red was this one:
But Watson should not be mistaken for HAL. “Watson will not initiate conduct on its own,” IBM’s Abrams pointed out. “Watson does not have ambition. It has no objective to respond outside a query.” “With no individual initiative, it has no way of going out of control,” he continued. “Watson has a plug,” he quipped. It can be disconnected. “Watson is not going to be applied without individual judgment … The final decision in any Watson solution … will always be [made by] a human, being based on information they got from Watson.”
My hunch is that Watson will require considerable human attention. But it may perform best on a TV show or in a motion picture where post production can smooth out the rough edges.
Maybe entertainment is “real”, not the world of a Harrod’s Creek hollow.
Stephen E Arnold, January 2, 2016
IBM: There Are Doubters
December 31, 2015
Watson has its work cut out for it in 2016. I read “IBM Set to Drop 13% in 2015.” When one is tossing around a $100 billion outfit, the thought of a share drop is disconcerting. Will Alibaba or Jeff Bezos step in? Fixing up the Washington Post may be trivial compared with an IBM scale challenge.
According to the write up:
Much of the disappointment in the tech company is because it has been unable to replace its hardware and software legacy products with new cloud-based and AI products — at least not at a rate that would pull IBM’s revenue up. Its major branded product in new age technology is Watson. While Watson has been the source of press releases and small customer alliances, outsiders have trouble seeing what it does to sharply increase IBM’s sales. Granted, Watson may be one of the most impressive product advances among large companies in the sector recently, but what it does for IBM may be very modest.
Somewhat of a downer, I perceive.
The smart software thing is not new. In the last 18 months, awareness of the use of various numerical recipes has increased. Faster chips, memories, and interconnections have worked their magic.
The challenge for IBM is to make money, not just marketing hyperbole. The crunch is that expectations for certain technologies are often more robust than possible in a market.
Watson, when one keeps one’s eye on the ball, is a search and content processing system. The wrappers make it possible to call assorted functions. Unlike Palantir, which has its own revenue fish to catch, IBM is a publicly traded company. Palantir does its magic as a privately held company, ingesting money at rates which would make a beluga whale’s diet look modest.
But IBM has exposed itself. The Watson marketing push is dragged into the reality of IBM’s overall company performance. In 2016, IBM Watson will have to deliver the bacon, or some of the millennialesque PR and marketing folks will have an opportunity to work elsewhere. Talk about smart software is not generating sustainable revenue from smart software.
Stephen E Arnold, December 31, 2015
Another Good Reason for Diversity in Tech
December 29, 2015
Just who decides what we see when we search? If we’re using Google, it’s a group of Google employees, of course. The Independent reports, “Google’s Search Results Aren’t as Unbiased as You Think—and a Lack of Diversity Could Be the Cause.” Writer Doug Bolton points to a TEDx talk by Swedish journalist Andreas Ekström, in which Ekström describes times Google has, and has not, counteracted campaigns to deliberately bump certain content. For example, the company did act to decouple racist imagery from searches for “Michelle Obama,” but did nothing to counter the association between a certain Norwegian murderer and dog poop. Bolton writes:
“Although different in motivation, the two campaigns worked in exactly the same way – but in the second, Google didn’t step in, and the inaccurate Breivik images stayed at the top of the search results for much longer. Few would argue that Google was wrong to end the Obama campaign or let the Breivik one run its course, but the two incidents shed light on the fact that behind such a large and faceless multi-billion dollar tech company as Google, there’s people deciding what we see when we search. And in a time when Google has such a poor record for gender and ethnic diversity and other companies struggle to address this imbalance (as IBM did when they attempted to get women into tech by encouraging them to ‘Hack a Hairdryer’), this fact becomes more pressing.”
The article notes that only 18 percent of Google’s tech staff worldwide are women, and that it is just two percent Hispanic and one percent black. Ekström’s talk has many asking what unperceived biases lurk in Google’s algorithms, and some are calling on the company anew to expand its hiring diversity. Naturally, though, any tech company can only do so much until more girls and minorities are encouraged to explore the sciences.
Cynthia Murrell, December 29, 2015
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
Watson Weekly: What? No Meth?
December 28, 2015
I read “What Can I eat in Pregnancy? App Aims to Answer with Help from IBM’s Watson?” Consider that folks with smartphones constitute a modest percentage of the world population. Health, as I understand the fine outputs of my health care providers, depends on socio-economic background. Also, a person with access to the Google can find out what foods are okay to eat when pregnant. Sure, there are ads, but Google presents reasonably useful information when one queries, “pregnant mother diet.” No app is needed for this type of research. Heck, one can just ask around.
Nevertheless, the potent cognitive computing outfit powered by the question answering Watson is delivering pregnant person diet advice via a smartphone app. “There’s an app for that” remains an accurate statement.
The write up points out:
The Nutrino app, powered by IBM’s supercomputer Watson, claims to guide women through pregnancy. For $15 (£10) for the duration of pregnancy, the app gives personalized meal recommendations and nutritional support by combining Nutrino’s nutrition database and Watson’s natural language capabilities.
If one is pregnant and the owner of a smartphone, the price for the app is no problem. The pace of IBM innovation never slows. Now about the pregnant folks in Umgababa, South Africa? That’s in the KwaZulu-Natal province.
The write up points out:
Nutrino is likely to appeal to women who already track their diet and exercise. The fitbit and Apple Watch generation may prefer to get their information about pregnancy by talking to their wrist rather than chatting to their mums. But even Watson may struggle to provide the common sense and personal experience that complements scientific knowledge.
“Struggle.” Yep, I would say that is a good word.
Stephen E Arnold, December 28, 2015
IBM Watson Competes for the Artificial Intelligence Crown
December 21, 2015
The Datamation article titled “IBM Watson Vs. Amazon: Machine Learning Systems Presage the Future” pits IBM’s famous supercomputer against the Amazon Web Services platform. Both are at the forefront of the industry, but which is best? Unsurprisingly, the article offers no definitive answer beyond: it depends what you are using them for. The article states,
“Amazon offers a simplified platform for developers who want to start working with machine learning without a lot of stress or specialized tools or investment… What IBM is trying to establish with the Watson analytics engine is not just storing and acquiring data, but taking all that information and doing something meaningful with it as an AI service or Intelligence as a Service.”
Jack Gold, Principal Analyst for J.Gold Associates, emphasizes that the larger point is that the AI technologies these two companies are competing to lead will shortly be much more widespread due to the ever increasing amounts of data. The article also discusses some of the more exciting uses of Watson and Amazon. The former, through a company called Fluid, is being put to use in the retail industry, relying on Watson’s ability to “read” customer personalities (with its handy personality matrix). Amazon Machine Learning, meanwhile, has recently been used for predictive modeling of job-cost estimates for insurance companies and builders.
Chelsea Kerwin, December 21, 2015
Watson Is Laying Startup Eggs
December 21, 2015
Incubators are warming stations for eggs. Without having to rely on an organism’s DNA donor, an incubator provides a warm, safe environment for the organism to develop, hatch, and eventually be ready to face the world. Watson has decided it is time for itself to propagate, but instead of knitting tiny computer cases Watson will invest its digital DNA in startups. The Chicago Tribune discusses Watson’s reproduction efforts and progeny in “Watson, IBM’s Big-Data Program Is Also A Startup Incubator.”
While IBM sells Watson’s ability to scan and understand terabytes of data, the company also welcomes developers to use Watson for new ideas. What is even more amazing is that IBM gives developers the ability to use Watson for free for a limited time.
“In Ecosystem, everyone is invited to play with Watson for free (for a limited time); some 77,000 developers have accepted. If your Watson-powered startup shows promise, it becomes a “partner,” often via a quasi-incubator model, and enjoys access to IBM business and technology advisers–and a shot at a capital infusion from the $100 million IBM is making available to Watson startups…”
Ecosystem has been used for startups that feature lifestyle coaching, personal shopping, infrastructure guards, veterinarian advice, fantasy sports calculator, 311 information, and even a hotel butler.
To quote the biblical justification for propagation: “Go forth and multiply the [Watson startups].”
Whitney Grace, December 21, 2015
Weekly Watson: The Internet of Things
December 17, 2015
Yep, there is not a buzzword, trend, or wave which IBM’s public relations professionals ignore. I read “IBM Is Bringing Its Watson Supercomputer to IoT.” The headline puzzled me. I thought that Watson was:
- Open source software like Lucene
- Home brew scripts
- Acquired technology.
The hardware part is moving to the cloud. IBM is reveling in a US government supercomputing contract which may involve quantum computing.
But Watson runs on hardware. If Watson is a supercomputer, I see some parallels with the Google and Maxxcat search appliances.
The write up reports:
IBM has announced today it is bringing the power of its Watson supercomputer to the Internet of Things, in a bid to extend the power of cognitive computing to the billions of connected devices, sensors and systems that comprise the IoT.
Will the Watson Internet of Things be located in Manhattan? Nope. I learned:
the company announced that the new initiative, the Watson Internet of Things, will be headquartered in Munich, Germany. The facility will serve as the first European Watson innovation super centre, built to drive collaboration between IBM experts and clients. This will be complemented by eight Watson IoT Client Experience Centers spread across Europe, Asia and the Americas.
Why Germany? IBM has a partner, Siemens.
Will the IoT venture use the shared desk approach? According to an EndicottAlliance.org comment dated 12/10/15, this approach to work has some consequences:
I wouldn’t get too excited about the new “Agile Workspace” in RTP. Basically it is management forcing workers back to the office and into a tense, continuously monitored environment with no privacy. It will be loud, you’ll have no space of your own, and it will be difficult to think. Mood marbles? Better be sure you always choose the light-colored ones! And make sure your discussion card is always flipped to the green side. What humiliation! The environment will be great for loud-mouthed managers, terrible for workers who do all the work. Worse than cubicles.
From cookbooks to cancer, IBM Watson seems to be where the buzzwords are. I wonder if the Watson revenues will reverse the revenue downturns IBM has experienced for 14 consecutive quarters.
Stephen E Arnold, December 17, 2015
Old School Mainframes Still Key to Big Data
December 17, 2015
According to ZDNet, “The Ultimate Answer to the Handling of Big Data: The Mainframe.” Believe it or not, a recent Syncsort survey of 187 IT pros found the mainframe to be important to their big data strategy. IBM has even created a Hadoop-capable mainframe. Reporter Ken Hess lists some of the survey’s findings:
* More than two-thirds of respondents (69 percent) ranked the use of the mainframe for performing large-scale transaction processing as very important
* More than two-thirds (67.4 percent) of respondents also pointed to integration with other standalone computing platforms such as Linux, UNIX, or Windows as a key strength of mainframe
* While the majority (79 percent) analyze real-time transactional data from the mainframe with a tool that resides directly on the mainframe, respondents are also turning to platforms such as Splunk (11.8 percent), Hadoop (8.6 percent), and Spark (1.6 percent) to supplement their real-time data analysis […]
* 82.9 percent and 83.4 percent of respondents cited security and availability as key strengths of the mainframe, respectively
* In a weighted calculation, respondents ranked security and compliance as their top areas to improve over the next 12 months, followed by CPU usage and related costs and meeting Service Level Agreements (SLAs)
* A separate weighted calculation showed that respondents felt their CIOs would rank all of the same areas in their top three to improve
Hess goes on to note that most of us probably utilize mainframes without thinking about it; whenever we pull cash out of an ATM, for example. The mainframe’s security and scalability remain unequaled, he writes, by any other platform or platform cluster yet devised. He links to a couple of resources besides the Syncsort survey that support this position: a white paper from IBM’s Big Data & Analytics Hub and a report from research firm Forrester.
Cynthia Murrell, December 17, 2015
Weekly Watson: Quotes about Trend App
December 11, 2015
I know. I know. I keep writing about IBM Watson. The product, service, marketing behemoth is the future of IBM. I think it is, even though another IBM Watson top dog headed for the greener grass at GE. (Read about the jump in “GE Poaches Key IBM Watson Exec.” I like the “key” word.)
My highlight of the IBM Watson week is a story in Customer Think. I don’t pay much attention to the customer think arena, but I saw a link to Watson and gave the mouse a poke.
Bull’s eye. The write up is an interview under the SEO friendly title “IBM Chief Strategist for Watson Trend App Answers 4 Questions for Marketing Innovators.”
Yikes. Am I allowed to read this interview? I am not a marketing innovator and my questions usually go unanswered; for example, what happened to the goal of $10 billion in Watson revenue in five years? See what I mean?
Here are three highlights from the interview with Justin Norwood, who carries a significant burden with the Watson as a consumer app play.
Mr. Norwood is an innovator. Here’s his response to a question about why the Watson Trend App is important:
I believe that cognitive computing – of which IBM Watson is the leading example – is the missing link to making mass personalization a reality.
Belief, like hope and faith, is a useful characteristic. I personally put a bit more emphasis on revenue. But that’s what makes horse races.
A second quote I circled in Big Blue blue is:
The app has already improved my personal gift giving experience.
Yes, hands on personal testimony. I wonder what the recipients of the gifts have to say. Did the app really hit the gift on the head, or were the recipients (Mr. Norwood’s daughters) telling anyone who would listen what they wanted from Amazon, an outfit with some recommendation technology that works okay? It produces revenue by the way.
The third and final gem I circled in red ink red (my Big Blue blue marker gave out on me):
I am also very motivated to see an end to hunger and malnutrition in Africa in my lifetime, so I recently partnered with an organization called Seeds of Action to work towards that.
Good idea. Imagine how much money can be routed to help folks in Africa if IBM Watson generates the much needed billions IBM management presaged a couple of years ago. By the way, there are hungry folks in the USA as well.
Stephen E Arnold, December 11, 2015