October 22, 2016
I followed a series of links to three articles about IBM Watson. Here are the stories I read:
- Watson’s the Name, Data’s the Game, dated October 10, 2016
- Milestones Along the Way in Watson’s Colorful History, October 10, 2016
- It’s (Not) Elementary: How Watson Works, October 10, 2016
The publication running these three write ups is Computerworld, which I translated as “ComputerWatson.”
Intrigued by the notion of “news,” I learned:
Watson uses some 50 technologies today, tapping artificial-intelligence techniques such as machine learning, deep learning, neural networks, natural language processing, computer vision, speech recognition and sentiment analysis.
But IBM does not like the idea of artificial intelligence even though I have spotted such synonyms in other “real” news write ups; for example, “augmented intelligence.”
There are factoids like “Watson can read more than 800 pages a second.” Figure 125 words per “page” and that works out to 100,000 words per second, which is a nice round number. Does Watson perform this magic on a basic laptop? Probably not. What are the bandwidth and storage requirements? Oh, not a peep.
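For anyone who wants to check the arithmetic, here is a minimal sketch. The 800 pages and 125 words per page come from the paragraph above; the six bytes per word is my own assumption (about five letters plus a space in plain ASCII), not a figure from IBM or Computerworld.

```python
# Back-of-envelope check of the "800 pages a second" factoid.
PAGES_PER_SECOND = 800   # figure quoted in the write up
WORDS_PER_PAGE = 125     # the per-page guess used in the text above
BYTES_PER_WORD = 6       # ASSUMPTION: ~5 letters plus a space, plain ASCII


def words_per_second(pages: int = PAGES_PER_SECOND, words: int = WORDS_PER_PAGE) -> int:
    """Pages per second times words per page."""
    return pages * words


def megabytes_per_second(bytes_per_word: int = BYTES_PER_WORD) -> float:
    """Raw text volume implied by the word rate, in decimal megabytes."""
    return words_per_second() * bytes_per_word / 1_000_000


if __name__ == "__main__":
    print(words_per_second())       # 100000
    print(megabytes_per_second())   # 0.6
```

Under that assumption the raw text is well under a megabyte per second. The number says nothing about the compute, memory, or storage needed to do anything useful with the words, which is the part the write ups skip.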
Computerworld—I mean ComputerWatson—provides a complete timeline of the technology too. The future begins in 1997. Imagine that. Boom. Watson wins at chess.
The “history” of Watson is embellished with a fanciful account of how IBM trained via humans assembling information. How much does corpus assembly cost? ComputerWatson—oh, I meant “Computerworld”—does not dive into investment.
To make Watson’s inner workings clear, the “real” news write up provides a link to an IBM video. Here’s an example of the cartoonish presentation:
These three write ups strike me as a public relations exercise. If IBM paid Computerworld to write and run these stories, the three articles are advertising. Who wrote these “news stories”? The byline is Katherine Noyes, who describes herself as “an ardent geek.” Her beat? Enterprise software, cloud computing, big data, analytics, and artificial intelligence.
Remarkable stuff but I had several thoughts:
- Not much “news” is included in the articles. It seems to me that the information has appeared in other writings.
- IBM Watson is working overtime to be recognized as the leader in the smart software game. That’s okay, but IBM seems to be pushing big markets with no easy way to monetize its efforts; for example, education, cancer, and game show participation.
- The Computerworld IBM Watson content party strikes me as eroding the credibility of both outfits.
Oh, I remember. Dave Schubmehl, the fellow who tried to sell on Amazon reports containing my research without my consent, was hooked up with IDG. I have lost track of the wizard, but I do recall the connection. More information is here.
Yep, credibility for possible content marketing and possible presentation of “news” as marketing collateral. Fascinating. Perhaps I should ask Watson: “What’s up?”
Stephen E Arnold, October 22, 2016
October 11, 2016
I read “IBM Watson’s CMO Predicts the Future of Data and AI.” I thought that the article would report what IBM Watson had to say about this question: “What is the future of data and AI, Watson?” Wrong. The article presents IBM’s current thinking about what its humanoids desperately want IBM Watson to become.
There was one startling omission in the article, but I will save that until the final paragraph of this mini report.
I noted several points of interest to me in the write up which is essentially an IBM wizard answering some slightly worn questions about the sprawling brand known as Watson. (Keep in mind that I know Watson as Lucene, home brew code, and acquired technologies.)
Point One: Watson understands human language. So Watson is like the film “2001” and HAL? No, here’s what the write up says Watson is:
It’s not speech recognition like Siri, not speech synthesis like Alexa, but actually understanding human languages…
Point Two: Why use IBM Watson and not some other smart system? Answer:
We’ve invested $6 billion in our idea, a third of that is dedicated to cognitive.
Point Three: IBM made a big deal about Twitter in 2014. IBM’s position:
Twitter specifically, is interesting.
You get the idea. Superficial generalizations about how capable IBM Watson is.
What’s the big omission? Revenue. Not a peep about how IBM Watson is going to generate sustainable revenue this quarter. What’s frightening to me is that the humanoid answers about Watson are sketchy. Since Watson did not answer the questions or address the topics in the title of the source article, I conclude that IBM Watson’s answers are even more sketchy.
I love that multi billion investment, however. Now about the financial payoff. Watson, any answers?
Stephen E Arnold, October 11, 2016
October 11, 2016
The lawless domain just got murkier. Apart from illegal firearms, passports, drugs and hitmen, you now can procure a verifiable college degree or diploma on the Dark Web.
Cyber criminals have created a digital marketplace where unscrupulous students can purchase or gain information necessary to provide them with unfair and illegal academic credentials and advantages.
The certificates for these academic credentials are near perfect. But what makes this cybercrime more dangerous is the fact that hackers also manipulate the institution records to make the fake credential genuine.
The article adds:
A flourishing market for hackers who would target universities in order to change grades and remove academic admonishments
This means that underperforming and completely non-performing students undertaking an educational course need not worry about low grades or absenteeism. Just pay the hackers and you have a perfectly legal degree that you can show the world. And the cost of all this? Just $500-$1000.
What makes this particular aspect of the Dark Web interesting is the fact that anyone who procures such an illegitimate degree can enter the mainstream job market with perfect ease and no student debt.
October 2, 2016
When one needs to understand Big Data, what’s the go to resource? A listicle of Big Data generalities. Navigate to “6 Illusions Execs Have About Big Data.” The article points out that Big Data is a buzzword. Shocker. And the chimera identified? Here you go:
- All data is Big Data. Yep, a categorical affirmative. Love those “all’s”.
- Big Data solves every problem. Another categorical affirmative. Whether it is the Zucks’s curing “all” disease or Big Data dealing with “every” problem, the generalization is rock solid silliness.
- Big Data is meaningless. The statement leads to this parental observation: “To make big data less meaningless, you need to be able to process and use it.” I am curious about the cost, method, and accuracy of the outputs in the real world.
- Big Data is easy. The enumeration of the attributes of a pair of women’s shoes resonates with me. I like flats.
- Imperfect Big Data is useless. Nope. Many data sets have imperfections. The hard work is normalizing and cleaning the information.
- Only big companies need big data. I like the balanced sentence structure and repetition. The reality, however, is that small outfits often struggle with little data. A data set can easily overwhelm the small outfit’s resources and act like a stuck parking brake when closing deals and generating revenue are Jobs 1 and 2.
Amazing stuff. When I encounter information similar to that contained in the source document, I understand how many vendors close deals, pocket dough, and leave the lucky buyers wondering what happened to their payoff from Big Data.
Stephen E Arnold, October 2, 2016
September 30, 2016
I believe Google is working on the solution to death. Microsoft, aced out of the death challenge, has turned its attention to cancer. I read “Microsoft Will ‘Solve’ Cancer within 10 Years by ‘Reprogramming’ Diseased Cells.” I learned that Microsoft
has assembled a “small army” of the world’s best biologists, programmers and engineers who are tackling cancer as if it were a bug in a computer system.
The write up added:
The biological computation group at Microsoft are developing molecular computers built from DNA which act like a doctor to spot cancer cells and destroy them.
First, I wonder if Microsoft might want to get Kindles and Web cams working with Windows 10. That is a less lofty goal than solving cancer, but some Windows 10 users might find the fixes helpful.
Second, will Microsoft improve upon its software development so that Tay type errors do not inadvertently cause cancer cells to become more robust? Microsoft’s artificial intelligence has performed in amusing ways, but solving cancer seems a bit more difficult than chatting. Microsoft Tay did not impress.
Third, if Google indeed does solve death, does that not suggest that Google has also solved cancer?
No answers, but the publicity machine is working quite well.
Stephen E Arnold, September 30, 2016
September 27, 2016
I read “This Is Everything Edward Snowden Revealed in Just One Year of Unprecedented Top-Secret Leaks.” I love “everything” articles. If you follow the Snowden documents, you know that these are scattered across different sites. Most of the write ups referencing the documents point to mini versions of the slides. I had high hopes that this write up would create a list of direct links to downloadable PDFs. No such luck. My conclusion about the article is that it does little to make the Snowden documents more readily available. Nevertheless, I love write ups with the word “everything” in their title. Easy to say. Either too difficult, too time consuming, or too risky to do.
Stephen E Arnold, September 27, 2016
September 19, 2016
If you are struggling to fill the sales pipeline, you will feel some pressure. If you really need to make sales, marketing collateral may be an easy weapon to seize.
I read “Examples of False Claims about Self-Service Analytics.” The write up singles out interesting sales assertions and offers them up in a listicle. I loved the write up. I lack the energy to sift through the slices of baloney in my enterprise search files. Therefore, let’s highlight the work of the brave person who singled out eight vendors’ marketing statements as containing what the author called “false claims.” Personally I think each of these claims is probably rock solid when viewed from the point of view of the vendors’ legal advisers.
Here are three examples of false claims about self service analytics. (For the other five, consult the article cited in the preceding paragraph.) Keep in mind that I find these statements as good as the gold for sale in the local grocery in Harrod’s Creek. Come to think of it the gold is foil wrapped around a lousy piece of ersatz chocolate. But gold is gold sort of.
Example 1 from Information Builders. Information Builders loves New York. The company delivers “integrated solutions for data management.” Here’s the item from the article which contains a “false claim.”
Self-service BI and analytics isn’t just about giving tools to analysts; it’s about empowering every user with actionable and relevant information for confident decision-making. (link). Self-service Analytics for Everyone…Who’s Everyone? Your entire universe of employees, customers, and partners. Our WebFOCUS Business Intelligence (BI) and Analytics platform empowers people inside and outside your organization to attain insights and make better decision.
I see a subject verb agreement error. I see a semicolon which puts me on my rhetorical guard. I see the universal “everyone”. I see the fact that WebFOCUS empowers.
What’s not to like? Information Builders is repeating facts which I accept. The fact that the company is in New York enhances the credibility of the statements. Footnotes, evidence? Who needs them?
Example 2 from SAP, the outfit that delivered R3 to Westinghouse and Trex to the enterprise search market. Here’s the “false assertion” which looks as solid as a peer reviewed journal containing data related to drug trials. Remember. This quote comes from the source article. I believe absolutely whatever SAP communicates to me. Don’t you?
This tool is intended for those who need to do analysis but are not Analysts nor wish to become them.
Why study math, statistics, and related disciplines? Why get a degree? I know that I can embrace the SAP way (which is a bit like the IBM way) and crunch numbers until the cows return to my pasture in Harrod’s Creek. Who needs to worry about data integrity, sample size, threshold settings, and algorithmic sequencing? Not me. Gibraltar does not stand on such solid footing as SAP’s tool for those who eschew analysts and do not want to wake up like Kafka’s protagonist as an analyst.
Example 3 from ZoomData, a company which has caught the attention of some folks in the DC area. I love those cobblestones in Reston too.
ZoomData brings the power of self-service BI to the 99%—the non-data geeks of the world who thirst for a simple, intuitive, and collaborative way to visually interact with data to solve business problems.
To me this looks better than the stone tablets Moses hauled down from the mountain. I love the notion of non geeks who thirst for pointing and clicking. I would have expressed the idea as drink deep of data’s Empyrean spring, but I am okay with the split infinitive “to visually interact” because the statement is a fact. I tell you that it is a fact.
For the other five allegedly false assertions, please, consult the original article. I have to take a break. When my knowledge is confirmed in these brilliant assertions, I need a moment to congratulate myself on my wisdom. Wait. I am an addled goose. Maybe these examples really are hog wash? Because I live in rural Kentucky, I will step outside and seek inputs from Henrietta, my hog.
Stephen E Arnold, September 19, 2016
September 18, 2016
When I knew people at the original Malcolm Forbes Forbes, I learned that stories were meticulously researched and edited. I read “Advanced Analytics: Insights Produce New Wealth.” I was baffled, but, I thought, that’s the point.
The main point of the write up pivots on the assertion that an “insight” converts directly to “wealth.” I am not sure about the difference between old and new wealth. Wealth is wealth in my book.
The write up tells me:
Data is the foundation that allows transformative, digital change to happen.
The logic escapes me. The squirrels in Harrod’s Creek come and go. Change seems to be baked into squirreldom. The focus is “the capitalist tool,” and I accept that the notion of changing one’s business can be difficult. The examples are easy to spot: IBM is trying to change into a Watson powered sustainable revenue machine. HP is trying to change from a conglomeration of disparate parts into a smaller conglomeration of disparate parts. OpenText is trying to change from a roll up of old school search systems into a Big Data wealth creator. Tough jobs indeed.
I learned that visualization is important for business intelligence. News flash. Visualization has been important since a person has been able to scratch animals on a cave’s wall. But again I understand. Predictive analytics from outfits like Spotfire (now part of Tibco) provided a wake up call to some folks.
The write up informs me:
While devices attached to the Internet of Things will continue to throw out growing levels of structured data (which can be stored in files and databases), the amount of unstructured data being produced will also rise. So the next wave of analytics tools will inevitably be geared to dealing with both forms of information seamlessly, while also enabling you to embed the insights gleaned into the applications of your choosing. Now that’s innovation.
Let’s recap. Outfits need data to change. (Squirrels excepted.) Companies have to make sense of their data. The data come in structured and unstructured forms. The future will be software able to handle structured and unstructured data. Charts and graphs help. Once an insight is located, founded, presented by algorithms which may or may not be biased, the “insights” will be easy to put into a PowerPoint.
BAE Systems’ “Detica” was poking around in this insight in the 1990s. There were antecedents, but BAE is a good example for my purpose. Palantir Technologies provided an application demo in 2004 which kicked the embedded analytics notion down the road. A click would display a wonky circular pop up, and the user could perform feats of analytic magic with a mouse click.
Now Forbes’ editors have either discovered something that has been around for decades or been paid to create a “news” article that reports information almost as timely as how Lucy died eons ago.
Back to the premise: Where exactly is the connection between insight and wealth? How does a roll up of unusual search vendors like Information Dimension, BRS, Nstein, Recommind, and my favorite old time Fulcrum Technologies produce evidence of the insight to wealth argument? If I NOT out these search vendors and focus on the Tim Bray SGML search engine, I still don’t see the connection. Delete Dr. Bray’s invention. What do we have? We have a content management company which sells content management as governance, compliance, and other consulting friendly disciplines.
Consultants can indeed amass wealth. But the insight comes not from Big Data. The wealth comes from selling time to organizations unable to separate the giblets from the goose feathers. Do you know the difference? The answer you provide may allow another to create wealth from that situation.
One doesn’t need Big Data to market complex and often interesting software to folks who require a bus load of consultants to make it work. For Forbes, the distinction between giblets and goose feathers may be too difficult to discern.
My hunch is that others, not trained in high end Manhattan journalism, may be able to figure out which one can be consumed and which one can ornament an outfit at an after party following a Fashion Week showing.
Stephen E Arnold, September 18, 2016
September 15, 2016
Naturally, tracking stolen data through the dark web is a challenge. Investigators have traditionally infiltrated chatrooms and forums in the effort—a tedious procedure with no guarantee of success. Now, automated tools may give organizations a leg up, we learn from the article, “Tools to Track Stolen Data Through the Dark Web” at GCN. Reporter Mark Pomerleau informs us:
“The Department of Veterans Affairs last month said it was seeking software that can search the dark web for exploited VA data improperly outside its control, distinguish between VA data and other data and create a ‘one-way encrypted hash’ of VA data to ensure that other parties cannot ascertain or use it. The software would also use VA’s encrypted data hash to search the dark web for VA content.” We learned:
Some companies, such as Terbium Labs, have developed similar hashing technologies. ‘It’s not code that’s embedded in the data so much as a computation done on the data itself,’ Danny Rogers, a Terbium Labs co-founder, told Defense One regarding its cryptographic hashing. This capability essentially enables a company or agency to recognize its stolen data if discovered. Bitglass, a cloud access security broker, uses watermarking technology to track stolen data. A digital watermark or encryption algorithm is applied to files such as spreadsheets, Word documents or PDFs that requires users to go through an authentication process in order to access it.
We’re told such watermarks can even thwart hackers trying to copy-and-paste into a new document, and that Bitglass tests its tech by leaking and following false data onto the dark web. Pomerleau notes that regulations can make it difficult to implement commercial solutions within a government agency. However, government personnel are very motivated to find solutions that will allow them to work securely outside the office.
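The “computation done on the data itself” idea can be sketched in a few lines. This is a minimal illustration only, assuming exact-match SHA-256 digests of normalized records; it is not Terbium Labs’ method or the VA’s specification, and the function names and sample records are hypothetical.

```python
from __future__ import annotations

import hashlib


def fingerprint(record: str) -> str:
    """Return a one-way SHA-256 digest of a whitespace- and case-normalized record."""
    normalized = " ".join(record.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def build_watchlist(records: list[str]) -> set[str]:
    """Digest the sensitive records so the raw values never leave the owner."""
    return {fingerprint(r) for r in records}


def find_leaks(watchlist: set[str], scraped_lines: list[str]) -> list[str]:
    """Return scraped lines whose digest matches a digest on the watchlist."""
    return [line for line in scraped_lines if fingerprint(line) in watchlist]


if __name__ == "__main__":
    # HYPOTHETICAL sample data, invented for illustration only.
    sensitive = ["Jane Doe, 123-45-6789", "John Roe, 987-65-4321"]
    watchlist = build_watchlist(sensitive)
    scraped = ["cheap diplomas here", "jane doe,   123-45-6789", "unrelated post"]
    print(find_leaks(watchlist, scraped))
```

The searcher holds only digests, so the search tool never needs the original records. One caveat: short, predictable values can be guessed and hashed by anyone, so a plain unsalted digest is weak protection for such data. A keyed hash (Python’s `hmac` module, for example) is one common way to tighten that up.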
The article wraps up with a mention of DARPA’s Memex search engine, designed to plumb the even-more-extensive deep web. Law enforcement is currently using Memex, but the software is expected to eventually make it to the commercial market.
Cynthia Murrell, September 15, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
There is a Louisville, Kentucky Hidden Web/Dark Web meet up on September 27, 2016.
Information is at this link: https://www.meetup.com/Louisville-Hidden-Dark-Web-Meetup/events/233599645/
September 13, 2016
I was surprised by the information presented in “SAP Hana Implementation Pattern Research Yields Contradictory Results.” My goodness, I thought, an online publication actually presents some ideas that a high profile system may not be a cat fully dressed in pajamas.
The SAP Hana system is a database. The difference between Hana and the dozens of other allegedly next generation data management solutions is its “in memory, columnar database platform.” If you are not hip to the lingo of the database administrators who clutch many organizations by the throat, an in memory approach is faster than trucking back to a storage device. Think back to the 1990s and Eric Brewer or the teens who rolled out Pinpoint.
The columnar angle is that data is presented in stacks with each item written on a note card. The mapping of the data is different from a row type system. The primary key in a columnar structure is the data, which maps back to the row identification.
The aforementioned article points to a mid tier consulting firm report. That report is by an outfit called Nucleus Research. Nucleus, according to the article, “revealed that 60 percent of SAP reference customers – mostly in the US – would not buy SAP technology again.” I understand that SAP engenders some excitement among its customers, but a mid tier consulting firm seems to be demonstrating considerable bravery if the data are accurate. Many mid tier consulting firms sand the rough edges off their reports.
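To make the note card picture concrete, here is a toy sketch of the same table held row-wise and column-wise in memory. It is an illustration under my own assumptions, not a description of how SAP Hana is built; the table, field names, and functions are hypothetical.

```python
# HYPOTHETICAL sample table, invented for illustration.
rows = [
    {"id": 1, "customer": "Acme", "region": "US", "amount": 120.0},
    {"id": 2, "customer": "Bolt", "region": "EU", "amount": 75.5},
    {"id": 3, "customer": "Cord", "region": "US", "amount": 210.0},
]


def to_columns(row_store):
    """Pivot a list of row dicts into one list (a 'stack') per column."""
    columns = {}
    for row in row_store:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def total_row_store(row_store, field):
    # Walks every full record even though only one field is needed.
    return sum(row[field] for row in row_store)


def total_column_store(column_store, field):
    # Reads a single list; the other columns are never touched.
    return sum(column_store[field])


if __name__ == "__main__":
    columns = to_columns(rows)
    print(columns["region"])                      # ['US', 'EU', 'US']
    print(total_row_store(rows, "amount"))        # 405.5
    print(total_column_store(columns, "amount"))  # 405.5
    # Position i in every column belongs to the same original row.
    print(columns["id"][columns["customer"].index("Bolt")])  # 2
```

The last line shows the mapping back to the row: a value’s position in its column is what ties it to the row identification. Analytic queries that touch one or two columns are where this layout tends to pay off.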
The article then jumps to a report paid for by an SAP reseller, which obviously has a dog in the Nucleus fight. Another mid tier research outfit called Coleman Parks was hired to do another study. The research focused on 250 Hana license holders.
The results are interesting. I learned from the write up:
When asked what claims for Hana were credible, 92% of respondents said it reduced IT infrastructure costs, a further 87% stated it saved business costs. Some 98% of Hana projects came in on-budget, and 65% yet to roll out were confident of hitting budget.
Yep, happy campers who are using the system for online transactional processing and online analytical processing. No at home chefs tucking away their favorite recipes in Hana I surmise.
However, the report allegedly determined what I have known for more than a decade:
SAP technology is often deemed too complex, and its CEO Bill McDermott has been waging a public war against this complexity for the past few years, using the mantra Run Simple.
The rebuttal study identified another plus for Hana:
“We were surprised how satisfied the Hana license holders were. SAP has done a good job in making sure these projects work, and rate at which has got Hana out is amazing for such a large organization,” said Centiq director of technology and services Robin Webster. “We had heard a lot about Hana as shelfware, so we were surprised at the number saying they were live.”
From our Hana free environment in rural Kentucky, we think:
- Mid tier consulting firms often output contradictory findings when reviewing products or conducting research. If there is bias in algorithms, imagine what might lurk in the research team members’ approaches.
- High profile outfits like SAP can convince some of the folks with dogs in the fight to get involved in proving that good things come to those who have more research conducted.
- Open source data management systems are plentiful. Big outfits like Hewlett Packard, IBM, and Oracle find themselves trying to generate the type of revenue associated with proprietary, closed data management products at a time when fresh faced computer science graduates just love free in memory solutions like Memsql and similar solutions.
SAP mounted an open source initiative which I learned about in “SAP Embraces Open Source Sort Of.” But the real message for me is that one can get mid tier research firms to do reports. Then one can pick the one that best presents a happy face to potential licensees.
Here in Harrod’s Creek, the high tech crowd tests software before writing checks. No consultants required.
Stephen E Arnold, September 13, 2016