The IBM Watson Hype Machine Shouts Again

October 28, 2016

The IBM Watson semi-news keeps on flowing. The PR firms working with IBM and the Watson team may bring back the go-go days of Madison Avenue. Note, please. I wrote “may.” IBM’s approach, in my opinion, is based on the Jack Benny LSMFT formula. Say the same thing again and again, and pretty soon folks will use the product. The problem is that IBM has not yet found its Jack Benny. Bob Dylan, the elusive Nobel laureate, is not exactly the magnetic figure that Mr. Benny was.

For a recent example of the IBM Watson buzz-o-rama, navigate to “IBM Watson: Not So Elementary.” I know the story is important. Here’s the splash page for the write up:

[Image: the Fortune write up’s splash page, featuring a Watson wizard]

I will definitely be able to spot this wizard if I bump into him in Harrod’s Creek, Kentucky. I wonder what the Watson expert is looking at or for. Could it be competitors like Facebook or outfits in the same game in China and Russia?

The write up begins with an old chestnut: IBM’s victory on Jeopardy. No more games. I learned:

IBM’s cognitive computing system is through playing games. It’s now a hired gun for thousands of companies in at least 20 industries.

I like the “hired” because it implies that IBM is raking in the dough from 20 different industry sectors. IBM, it seems, is back in the saddle. That is a nifty idea but for the fact that IBM reported its 18th consecutive quarter of revenue declines. The “what if” question I have is, “If Watson were generating truly big bucks, wouldn’t that quarterly report reflect a tilt toward positive revenue growth?” Bad question, obviously. The Fortune real journalist did not bring it up.

The write up is an interview. I did highlight three gems, and I invite—nay, I implore—you to read and memorize every delicious word about IBM Watson. Let’s look at the three comments I circled with my big blue marker.

Augmented Intelligence

at IBM, we tend to say, in many cases, that it’s not artificial as much as it’s augmented. So it’s a system between machine computing and humans interpreting, and we call those machine-human interactions cognitive systems. That’s kind of how it layers up…. It’s beginning to learn on its own—that is moving more in the direction of what some consider true artificial intelligence, or even AGI: artificial general intelligence.

Yikes, Skynet on a mainframe, think I.

Training Watson

there isn’t a single Watson. There’s Watson for oncology. There’s Watson for radiology. There’s Watson for endocrinology…for law…for tax code…for customer service.

I say to myself, “Wow, the costs of making each independent Watson smart must be high. What if I need to ask a question and want to get answers from each individual Watson? How does that work? How long does it take to receive a consolidated answer? What if the customer service Watson gets a question about weather germane to an insurance claim in South Carolina?”
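
Here is a back-of-the-envelope sketch of the fan-out problem. The domain list and the ask() stub are my own inventions for illustration; IBM has not published a cross-Watson federation API. The point is that a consolidated answer is only as fast as the slowest domain and only as good as the consolidation rule:

```python
# A back-of-the-envelope sketch, not an IBM API. The domain list and
# the ask() stub are invented; the point is that the consolidated
# answer is only as fast as the slowest domain Watson.
from concurrent.futures import ThreadPoolExecutor

DOMAINS = ["oncology", "radiology", "endocrinology", "law", "tax",
           "customer service"]

def ask(domain: str, question: str) -> dict:
    # Stand-in for a per-domain call; a real system would return an
    # answer plus a confidence score for the consolidation step.
    return {"domain": domain, "answer": f"[{domain} says...]",
            "confidence": 0.5}

def consolidated_answer(question: str) -> dict:
    # Fan the question out to every domain Watson in parallel.
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda d: ask(d, question), DOMAINS))
    # Naive consolidation: keep the most confident domain's answer.
    # Picking a winner across oncology and tax code is the hard part.
    return max(results, key=lambda r: r["confidence"])

print(consolidated_answer("Is this storm damage claim covered?"))
```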

The Competition

The distinctness of the Watson approach has been to create software that you can embed in other people’s applications, and these are especially used by the companies that don’t feel comfortable putting their data into a single learning system—particularly one that’s connected to a search engine—because in effect that commoditizes their intellectual property and their cumulative knowledge. So our approach has been to create AI for private or sensitive data that is best reserved for the entities that own it and isn’t necessarily ever going to be published on the public Internet.

I ponder this question, “Will IBM become the background system for the competition?” My hunch is that Facebook, Google, Microsoft, Amazon, and a handful of outfits in backwaters like Beijing and Moscow will think about non-IBM options. Odd that the international competition did not come up in the Fortune interview with the IBM wizard.

End Game

these systems will predict disease progression in time to actually take preventive action, which I think is better for everybody.

“Amazing, Watson will intervene in a person’s life,” blurts my Skynet-sensitive self.

Please, keep in mind that this is an IBM Watson cheerleading piece about 4,000 words in length. As you work through the original Fortune article, keep in mind:

  • The time and cost of tuning a Watson may add up to far more than the price of a McDonald’s fish sandwich
  • “Augmented intelligence” is a buzzword embraced by a number of outfits, including Palantir Technologies, a competitor to IBM in the law enforcement and intelligence community. Some of IBM’s tools are ones which the critics of the Distributed Common Ground System suggest are difficult to learn, maintain, and use. User-friendly is not the term which comes to mind when I think of IBM. Did you ever configure a mainframe or try to get a device driver for OS/2 to work? There you go.
  • The head of IBM Watson is not an IBM direct hire who rose through the ranks. Watson is being guided by a person from the Weather Channel acquisition.

How does Watson integrate that weather data into queries? How can a smart system schedule surgeries when a snowstorm has caused traffic jams? Some folks may use an iPhone or Pixel or use common sense.

Stephen E Arnold, October 28, 2016

Picking Away at Predictive Programs

October 21, 2016

I read “Predicting Terrorism From Big Data Challenges U.S. Intelligence.” I assume that Bloomberg knows that Thomson Reuters licenses the Palantir Technologies Metropolitan suite to provide certain information to Thomson Reuters’ customers. Nevertheless, I was surprised at some of the information presented in this “real” journalism write up.

The main point is that numerical recipes cannot predict what, when, where, why, and how bad actors will do bad things. Excluding financial fraud, which seems to be a fertile field for wrongdoing, the article chases the terrorist angle.

I learned:

  • “Connect the dots” is a popular phrase, but connecting the dots to create a meaningful picture of bad actors’ future actions is tough
  • Big data is a “fundamental fuel”
  • Intel, PredPol, and the Global Intellectual Property Enforcement Center are working in the field of “predictive policing”
  • The buzzword “total information awareness” is once again okay to use in public

I highlighted this passage attributed to a big thinker at the Brennan Center for Justice at NYU School of Law:

Computer algorithms also fail to understand the context of data, such as whether someone commenting on social media is joking or serious,

Several observations:

  • Not a single peep about Google DeepMind and Recorded Future, outfits which I consider the leaders in the predictive ball game
  • Not a hint that Bloomberg was itself late to the party because Thomson Reuters, not exactly an innovation speed demon, saw value in Palantir’s methods
  • Not much about what “predictive technology” does.

In short, the write up delivers a modest payload in my opinion. I predict that more work will be needed to explain the interaction of math, data, and law enforcement. I don’t think a five-minute segment with talking heads on Bloomberg TV will do it.

Stephen E Arnold, October 21, 2016

Online and without Oomph: Social Content

October 15, 2016

I am surprised when Scientific American Magazine runs a story somewhat related to online information access. Navigate to read “The Bright Side of Internet Shaming.” The main point is that shaming has “become so common that it might soon begin to lose its impact.” Careful wording, of course. It is Scientific American, and the write up has few facts of the scientific ilk.

I highlighted this passage:

…these days public shamings are increasingly frequent. They’ve become a new kind of grisly entertainment, like a national reality show.

Yep, another opinion from Scientific American.

I then circled, in Hawthorne scarlet-A red:

there’s a certain kind of hope in the increasing regularity of shamings. As they become commonplace, maybe they’ll lose their ability to shock. The same kinds of ugly tweets have been repeated so many times, they’re starting to become boilerplate.

I don’t pay much attention to social media unless the data are part of a project. I have a tough time distinguishing misinformation, disinformation, and run-of-the-mill information.

What’s the relationship to search? Locating “shaming” type messages is difficult. Social media search engines don’t work particularly well. The half-hearted attempts at indexing are not consistent. No surprise in that, because user generated input is often uninformed input, particularly when it comes to indexing.
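
A toy illustration of the indexing problem, with invented posts and tags: exact matching against user-generated tags retrieves nothing, while even crude normalization recovers the whole set:

```python
# A toy illustration, with invented posts: exact matching against
# user-generated tags finds nothing, while crude normalization
# recovers all three relevant posts.
posts = [
    {"id": 1, "tags": ["#PublicShaming"]},
    {"id": 2, "tags": ["shaming!"]},
    {"id": 3, "tags": ["Shamed"]},
]

def exact_search(tag: str) -> list:
    return [p["id"] for p in posts if tag in p["tags"]]

def normalized_search(stem: str) -> list:
    # Lowercase and strip punctuation before comparing.
    norm = lambda t: "".join(ch for ch in t.lower() if ch.isalpha())
    return [p["id"] for p in posts
            if any(stem in norm(t) for t in p["tags"])]

print(exact_search("shaming"))    # [] -- inconsistent tags, no hits
print(normalized_search("sham"))  # [1, 2, 3]
```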

My thought is that Scientific American reflects shaming. The write up is not scientific. I would have found the article more interesting if it had included (a rough sketch of the first two items follows the list):

  • Data from analyses of tweets or Facebook posts containing negative or “shaming” words
  • Facts about the increase or decrease in “shaming” language for some “boilerplate” phrases
  • A Palantir-type link analysis illustrating the centroids for one solid shaming example.
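
Here is that sketch, assuming a hypothetical “shaming” lexicon and a handful of invented posts. Real work would need a vetted word list and a large sample, but the mechanics are simple counting:

```python
# A rough sketch of the first two bullets. The lexicon and the posts
# are invented; real work needs a vetted word list and a big sample.
from collections import Counter
import re

SHAMING_LEXICON = {"disgrace", "pathetic", "shame", "loser"}

posts = [
    ("2016-09", "What a disgrace. Total loser."),
    ("2016-10", "Shame on them. Pathetic. A disgrace."),
]

monthly = Counter()
for month, text in posts:
    tokens = re.findall(r"[a-z']+", text.lower())
    monthly[month] += sum(1 for t in tokens if t in SHAMING_LEXICON)

# A crude month-over-month trend of "shaming" vocabulary.
for month in sorted(monthly):
    print(month, monthly[month])
```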

Scientific American has redefined science, it seems. Thus, a search for science might return a false drop for the magazine. I will skip the logic of the write up because the argument strikes me as subjective American thought.

Stephen E Arnold, October 15, 2016

HonkinNews for October 4, 2016 Available

October 4, 2016

This week’s HonkinNews is available at this link. The feature story explores Palantir Technologies’ love-less love relationship with the US Army. Palantir’s approach to keeping its government customers happy is innovative. We also comment on Google’s blurring of cow faces in Street View. Learn why SearchBlox is giving vendors of expensive, proprietary enterprise search systems cramps in their calves. Microsoft continues to pay users to access the Internet via Edge and use Bing to search for information. How much does the US government spend on operations and maintenance of its systems? The figure is surprising, if not shocking. This and more in HonkinNews for October 4, 2016.

Kenny Toth, October 4, 2016

Crimping: Is the Method Used for Text Processing?

October 4, 2016

I read an article I found quite thought provoking. “Why Companies Make Their Products Worse” explains that reducing costs allows a manufacturer to expand the market for a product. The idea is that more people will buy a product if it is less expensive than a more sophisticated version of the product. The example which I highlighted in eyeshade green explained that IBM introduced an expensive printer in the 1980s. Then IBM manufactured a different version of the printer using cheaper labor. The folks from Big Blue added electronic components to make the cheaper printer slower. The result was a lower cost printer that was “worse” than the original.

[Image: illustration of a hybrid creature]

Perhaps enterprise search and content processing is a hybrid of two or more creatures?

The write up explained that this approach to degrading a product to make more money has a name: crimping. The practice amounts to “product sabotage”; that is, intentionally degrading a product for business reasons.

The comments to the article offer additional examples and one helpful person with the handle Dadpolice stated:

The examples you give are accurate, but these aren’t relics of the past. They are incredibly common strategies that chip makers still use today.

I understand the hardware or tangible product application of this idea. I began to think about the use of the tactic by text processing vendors.

The Google Search Appliance may have been a product subject to crimping. As I recall, the most economical GSA was less than $2000, a price which was relatively easy to justify in many organizations. Over the years, the low cost option disappeared and the prices for the Google Search Appliances soared to Autonomy- and Fast Search-levels.

Other vendors introduced search and content processing systems, but the prices remained lofty. Search and content processing in an organization never seemed to get less expensive when one considered the resources required, the license fees, the “customer” support, the upgrades, and the engineering for customization and optimization.

My hypothesis is that enterprise content processing does not yield compelling examples like the IBM printer example.

Perhaps the adoption rate for open source content processing reflects a pent-up demand for “crimping”? Perhaps some clever graduate student will take the initiative to examine content processing product prices? Licensees spend for sophisticated solution systems like those available from outfits like IBM and Palantir Technologies. The money comes from the engineering and what I call “soft” charges; that is, training, customer support, and engineering and consulting services.

At the other end of the content processing spectrum are open source solutions. The middle ground between free or low cost systems and high end solutions does not have too many examples. I am confident there are some, but I could identify only Funnelback, dtSearch, and a handful of other outfits.

Perhaps “crimping” is not a universal principle? On the other hand, perhaps content processing is an example of technical software which has its own idiosyncrasies.

Content processing products, I believe, become worse over time. The reason is not “crimping.” The trajectory of lousiness comes from:

  • Layering features on keyword retrieval in hopes of finding a way to generate keen buyer interest
  • Adding features to help justify price increases
  • Increasing the complexity of the system, which makes the licensee less able to fiddle with it
  • Refusing to admit that content processing is a core component of many other types of software, so “finding information” has become a standard component of other applications.

If content processing is idiosyncratic, that might explain why investors pour money into content processing companies which have little chance to generate sufficient revenue to pay off investors, generate a profit, and build a sustainable business. Enterprise search and content processing vendors seem to be in a state of reinventing or reimagining themselves. Guitar makers just pursue cost cutting and expand their market. It is not so easy for content processing companies.

Stephen E Arnold, October 4, 2016

US Government: Computer Infrastructure

September 26, 2016

Curious about the cost of maintaining a computer infrastructure? Companies may know how much is spent to maintain the plumbing, but those numbers are usually buried in accounting-speak within the company. Details rarely emerge.

Here’s a useful chart about how much spending for information technology goes to maintain the old stuff and the status quo versus how much goes to the nifty new technology:

[Chart: share of US federal IT spending devoted to operations and maintenance versus new development, 2010–2017]

The important line is the solid blue one. Notice that the US Federal government spent $0.68 of every IT dollar on operations and maintenance in 2010. Jump to the 2017 estimate. Notice that the status quo is likely to consume $0.77 of every IT dollar.

Progress? If you want to dig into the information behind this chart, you can find the report GAO 677454 by running queries on the Alphabet Google system. The title of the report is “Information Technology: Federal Agencies Need to Address Aging Legacy Systems.” Don’t bother trying the search box on the GAO.gov Web site. The document is not in the index.

If you are not too keen on running old school mobile queries or talking to your nifty voice enabled search system, you can find the document at this link.

I want to point out that Palantir Technologies may see these types of data as proof that the US government needs to approach information technology in a different manner.

Stephen E Arnold, September 26, 2016

Alphabet Google Faces a Secret Foe

September 21, 2016

I thought indexing the world’s information made it possible to put together disparate items of information. Once assembled, these clues from the world of real time online content would allow a person with access to answer a business question.

Apparently Alphabet Google faces a secret foe. I learned this by reading “Secretive Foe Attacks Google over Government Influence.” I learned:

Google has come under attack by a mysterious group that keeps mum about its sponsors while issuing scathing reports about the Mountain View search giant’s influence on government.

The blockbuster write up reported:

So far, only Redwood Shores-based Oracle has admitted to funding the Transparency Project, telling Fortune it wanted the public to know about its support for the initiative.

Yikes, a neighbor based at the now long-gone Sea World.

The outfit going after the lovable Alphabet Google thing is called the Transparency Project. The excited syntax of the write up told me:

The Transparency Project commenced hostilities against Google in April, gaining national media attention with a report tracking the number of Googlers taking jobs in the White House and federal agencies, and the number of federal officials traveling in the other direction, into Google. Project researchers reported 113 “revolving door” moves between Google — plus its associated companies, law firms and lobbyists — and the White House and federal agencies.

Okay, but back to my original point. With the world’s information at one’s metaphorical fingertips, is it not possible to process email, Google Plus, user search histories, and similar data-laden troves for clues about the Transparency Project?

Perhaps the Alphabet Google entity lacks the staff and software to perform this type of analysis? May I suggest a quick telephone call to Palantir Technologies? From what I understand by reading open source information about the Gotham product, Palantir can knit together disparate and fragmented data and put the members of the Transparency Project on the map, in a manner of speaking.
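
For the curious, here is a toy version of that kind of link analysis: build a graph of entity co-occurrences and rank the central players. This is generic graph analysis with the open source networkx library, not Palantir Gotham’s actual method, and the documents are invented:

```python
# A toy version of the link analysis idea: build a graph of entity
# co-occurrences and rank the central players. Generic networkx graph
# analysis, not Palantir Gotham's method; the documents are invented.
import itertools
import networkx as nx

documents = [
    {"entities": ["Funder A", "Law Firm B", "Transparency Project"]},
    {"entities": ["Funder A", "Lobbyist C"]},
    {"entities": ["Lobbyist C", "Transparency Project"]},
]

G = nx.Graph()
for doc in documents:
    for a, b in itertools.combinations(doc["entities"], 2):
        weight = G.get_edge_data(a, b, {"weight": 0})["weight"]
        G.add_edge(a, b, weight=weight + 1)

# Degree centrality as a crude stand-in for "who sits at the center."
for name, score in sorted(nx.degree_centrality(G).items(),
                          key=lambda kv: -kv[1]):
    print(f"{score:.2f}  {name}")
```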

I understand the concept of finding fault with a near perfect company. But the inability of a search giant to find out who, what, when, where, how, and why baffles me.

It does not, as an old school engineer with a pocket protector might say, compute.

Stephen E Arnold, September 21, 2016

OpenText: Content Marketing or Real News?

September 18, 2016

When I knew people at the original Malcolm Forbes-era Forbes, I learned that stories were meticulously researched and edited. I read “Advanced Analytics: Insights Produce New Wealth.” I was baffled, but, I thought, that’s the point.

The main point of the write up pivots on the assertion that an “insight” converts directly to “wealth.” I am not sure about the difference between old and new wealth. Wealth is wealth in my book.

The write up tells me:

Data is the foundation that allows transformative, digital change to happen.

The logic escapes me. The squirrels in Harrod’s Creek come and go. Change seems to be baked into squirreldom. The focus is “the capitalist tool,” and I accept that the notion of changing one’s business can be difficult. The examples are easy to spot: IBM is trying to change into a Watson powered sustainable revenue machine. HP is trying to change from a conglomeration of disparate parts into a smaller conglomeration of disparate parts. OpenText is trying to change from a roll up of old school search systems into a Big Data wealth creator. Tough jobs indeed.

I learned that visualization is important for business intelligence. News flash: visualization has been important since people first scratched animals on cave walls. But again I understand. Predictive analytics from outfits like Spotfire (now part of Tibco) provided a wake up call to some folks.

The write up informs me:

While devices attached to the Internet of Things will continue to throw out growing levels of structured data (which can be stored in files and databases), the amount of unstructured data being produced will also rise. So the next wave of analytics tools will inevitably be geared to dealing with both forms of information seamlessly, while also enabling you to embed the insights gleaned into the applications of your choosing. Now that’s innovation.

Let’s recap. Outfits need data to change. (Squirrels excepted.) Companies have to make sense of their data. The data come in structured and unstructured forms. The future will be software able to handle structured and unstructured data. Charts and graphs help. Once an insight is located, found, and presented by algorithms which may or may not be biased, the “insights” will be easy to put into a PowerPoint.

BAE Systems’ “Detica” was poking around in this insight in the 1990s. There were antecedents, but BAE is a good example for my purpose. Palantir Technologies provided an application demo in 2004 which kicked the embedded analytics notion down the road. A click would display a wonky circular pop up, and the user could perform feats of analytic magic with a mouse click.

Now Forbes’ editors have either discovered something that has been around for decades or been paid to create a “news” article that reports information almost as timely as how Lucy died eons ago.

Back to the premise: Where exactly is the connection between insight and wealth? How does a roll up of unusual search vendors like Information Dimensions, BRS, Nstein, Recommind, and my favorite old-timer Fulcrum Technologies produce evidence for the insight-to-wealth argument? If I NOT out these search vendors and focus on the Tim Bray SGML search engine, I still don’t see the connection. Delete Dr. Bray’s invention. What do we have? We have a content management company which sells content management as governance, compliance, and other consulting friendly disciplines.

Consultants can indeed amass wealth. But the insight comes not from Big Data. The wealth comes from selling time to organizations unable to separate the giblets from the goose feathers. Do you know the difference? The answer you provide may allow another to create wealth from that situation.

One doesn’t need Big Data to market complex and often interesting software to folks who require a bus load of consultants to make it work. For Forbes, the distinction between giblets and goose feathers may be too difficult to discern.

My hunch is that others, not trained in high end Manhattan journalism, may be able to figure out which one can be consumed and which one can ornament an outfit at an after party following a Fashion Week showing.

Stephen E Arnold, September 18, 2016

In-Q-Tel Wants Less Latency, Fewer Humans, and Smarter Dashboards

September 15, 2016

I read “The CIA Just Invested in a Hot Startup That Makes Sense of Big Data.” I love the “just.” In-Q-Tel investments are not like bumping into a friend in Penn Station. Zoomdata, founded in 2012, has been making calls, raising venture funding (more than $45 million in four rounds from 21 investors), and staffing up to about 100 full time equivalents. With its headquarters in Reston, Virginia, the company is not exactly operating from a log cabin west of Paducah, Kentucky.

The write up explains:

Zoom Data uses something called Data Sharpening technology to deliver visual analytics from real-time or historical data. Instead of a user searching through an Excel file or creating a pivot table, Zoom Data puts what’s important into a custom dashboard so users can see what they need to know immediately.

What Zoomdata does is offer its customers hope for less human fiddling with data and faster outputs of actionable intelligence. If you recall how IBM i2 and Palantir Gotham work, humans are needed. IBM even snagged Palantir’s jargon, redefining AI as “augmented intelligence.”
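
Zoomdata’s Data Sharpening is proprietary, so here is only a generic sketch of the underlying pattern as I understand it: show an approximate answer immediately and refine it as more data streams in, instead of making a human wait for the full pivot table. The numbers are invented:

```python
# A generic sketch of progressive refinement, not Zoomdata's
# proprietary Data Sharpening: surface a rough estimate right away,
# then sharpen it as more of the stream arrives. Data are invented.
import math
import random

random.seed(0)
stream = (random.gauss(100.0, 15.0) for _ in range(10_000))

n, mean, m2 = 0, 0.0, 0.0  # Welford's online mean/variance
for x in stream:
    n += 1
    delta = x - mean
    mean += delta / n
    m2 += delta * (x - mean)
    if n in (100, 1_000, 10_000):  # dashboard refresh points
        stderr = math.sqrt(m2 / (n - 1)) / math.sqrt(n)
        print(f"n={n:>6}  estimate={mean:7.2f} ± {1.96 * stderr:.2f}")
```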

In-Q-Tel wants more smart software with less dependence on expensive, hard to train, and often careless humans. When incoming rounds hit near a mobile operations center, it is possible to lose one’s train of thought.

Zoomdata has some Booz Allen DNA, some MIT RNA, and protein from other essential chemicals.

The write up mentions Palantir, but does not make explicit the need to reduce, to some degree, the human-centric approaches which are part of the major systems’ core architecture. You have nifty cloud stuff, but you have less nifty humans in most mission-critical work processes.

To speed up the outputs, software should be the answer. An investment in Zoomdata delivers three messages to me here in rural Kentucky:

  1. In-Q-Tel continues to look for ways to move along the “less wait and less weight” requirement of those involved in operations. “Weight” refers to heavy, old-fashioned systems. “Wait” refers to the latency imposed by manual processes.
  2. Zoomdata and other investments are whips to the flanks of the BAE Systems, IBMs, and Palantirs chasing government contracts. The investment focuses attention not on scope changes but on figuring out how to deal with the unacceptable complexity and latency of many existing systems.
  3. In-Q-Tel has upped the value of Zoomdata. With consolidation in the commercial intelligence business rolling along at NASCAR speeds, it won’t take long before Zoomdata finds itself going to big company meetings to learn what the true costs of being acquired are.

For more information about Zoomdata, check out the paid-for reports at this link.

Stephen E Arnold, September 15, 2016

HonkinNews, September 13, 2016 Now Available

September 13, 2016

Interested in having your polynomials probed? The Beyond Search weekly news explains this preventive action. In this week’s program you will learn about Google’s new enterprise search solution. Palantir is taking legal action against an investor in the company. IBM Watson helps out at the US Open. Catch up on the search, online, and content processing news that makes enterprise procurement teams squirm. Dive in with Springboard and Pool Party. To view the video, click this link.

Kenny Toth, September 13, 2016
