Cognitive Engine: What Powers the USAF Platform?
May 1, 2019
Last week I met with a university professor who does cutting edge data and text mining and also shepherds PhD candidates. In the course of our 90 minute conversation, I noticed some reference books which had SPSS on the cover. The procedures implemented at this particular university worked well.
After the meeting, I was thinking about the newer approaches which are becoming publicly available. The USAF has started talking about its “cognitive engine.” I thought I heard at a conference that some technology developed by Nutonian, now part of a data and text mining roll up, had influenced the project.
The Nutonian system is predictive with a twist. The person using the system can rely on the smart software to perform the numerous intermediary steps required when using more traditional systems.
The article is “The US Air Force Will Showcase Its Many Technological Advances in the USAF Lab Day.” The original is in Chinese, but Freetranslate.com can help out if you don’t read Chinese and don’t have a close by contact who does.
The USAF wants to deploy a cognitive platform into which vendors can “plug in” their systems. The Chinese write up reported:
AFRL’s Autonomy Capability Team 3 (ACT3) is developing artificial intelligence on a large scale through the development and application of the Air Force Cognitive Engine (ACE), an artificial intelligence software platform. Put into application. The software platform architecture reduces the barriers to entry for artificial intelligence applications and provides end-user applications with the ability to cover a range of artificial intelligence problem types. In the application, the software platform connects educated end users, developers, and algorithms implemented in software, task data, and computing hardware to the process of creating an artificial intelligence solution.
The article also provides some interesting details which were not included in some of the English language reports about this session; for example:
- Smart rockets
- An agile pod
- Pathogen identification.
A couple of observations:
First, obviously the Chinese writer had access to information about the Lab Day demonstrations.
Second, the write up about the cognitive platform does not mention foundation vendors, which I understand.
Third, it would be delightful to visit a university and see documentation and information about the next-generation predictive analytics systems available.
Stephen E Arnold, May 1, 2019
Analytics Leaders: No Google, No Voyager Labs
April 25, 2019
I read “Top 50 Organizations for Data Analytics to Be Honored.” Interesting idea: Identify outfits which are really, really good with analytics: Data mining, text mining, math, and numerical recipes which are yummy, yummy.
The names on the list were a bit of a surprise to me; for instance:
- A football team, Philadelphia Eagles. The Eagles?
- A not for profit with an interesting history, United Way
- A company unable to build a 5G technology based product and unable to deliver certain silicon to some customers, Intel.
What’s up?
I expected that the Google would get a mention and a footnote to Recorded Future, partially funded by Google and In-Q-Tel, the investment arm of the Central Intelligence Agency. Where’s Voyager Labs, the developer of Voyager Analytics?
After reading the article, I am not sure how the list was developed, and I am not confident that the organizations cited for excellence in analytics would make my list of analytics leaders.
But the important thing is the PR.
Stephen E Arnold, April 25, 2019
The Surf Is Up for the Word Dark
April 4, 2019
Just a short note. I read this puffy wuffy write up about a new market research report. Its title?
What caught my attention is not the authors’ attempt to generate some dough via open source data collection and a touch of Excel fever.
Here’s what caught my attention:
Dark analytics is the analysis of dark data present in the enterprises. Dark data is generally is referred as raw data or information buried in text, tables, figures that organizations acquire in various business operations and store it but, is unused to derive insights and for decision making in business. Organizations nowadays are realizing that there is a huge risk associated with losing competitive edge in business and regulatory issues that comes with not analyzing and processing this data. Hence, dark analytics is a practice followed in enterprises that advances in analyzing computer network operations and pattern recognition.
Yes, buried data treasure. Now the cost of locating, accessing, validating, and normalizing these time encrusted nuggets?
Answer: A lot. A whole lot. That’s part of the reason old data are not particularly popular in some organizations. The idea of using a consulting firm or software from SAP is not particularly thrilling to my DarkCyber team. (Our use of “dark” is different too.)
Stephen E Arnold, April 4, 2019
A Statistics Rebellion? One Can Only Hope
March 21, 2019
Yesterday I mentioned to a reporter that most smart software is “right” somewhere between 50 and 80 percent of the time. The reporter asked, “Does that mean results are incorrect half to one third of the time?”
My answer, “Probably worse.”
The reporter changed the subject. My hunch is that the hyperbole about the accuracy of smart software suggests that the systems are better than a human. Some may be better at some specific tasks.
In many cases, the number crunching chops down what a human must examine. In an age of data, chopping down what one has to examine is a very important task. For applications like online advertising, 70 percent accuracy is close enough to keep the advertiser semi happy and spending money to reach a target. For other applications, like predicting where a bad actor will commit a crime, the game is “close enough for horseshoes.”
Why talk about numbers? My observations, with which you are invited to disagree, are a prelude to my recommending that you read “Scientists Rise Up Against Statistical Significance.” Here is a passage I underlined:
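The arithmetic behind these percentages can be sketched in a few lines of Python. This is an illustration only: the function names are made up for this note, and the 50, 70, and 80 percent figures are the rough estimates mentioned above, not measurements. Note that being “right” 50 to 80 percent of the time works out to being wrong one fifth to one half of the time.

```python
def error_percent_range(min_accuracy_pct, max_accuracy_pct):
    """Return the (lowest, highest) error percentages implied by an accuracy range."""
    # The best accuracy gives the lowest error rate, and vice versa.
    return (100 - max_accuracy_pct, 100 - min_accuracy_pct)


def expected_misses(n_items, accuracy_pct):
    """Return how many of n_items a system with the given accuracy gets wrong."""
    # Integer arithmetic avoids floating point rounding surprises.
    return n_items * (100 - accuracy_pct) // 100


low, high = error_percent_range(50, 80)
print(f"Wrong between {low} and {high} percent of the time")
print(f"Misses per 1,000 ad placements at 70 percent accuracy: {expected_misses(1000, 70)}")
```

Whether 300 misses per 1,000 is acceptable depends on the application, which is the point about advertising versus crime prediction.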
In 2016, the American Statistical Association released a statement in The American Statistician warning against the misuse of statistical significance and P values. The issue also included many commentaries on the subject. This month, a special issue in the same journal attempts to push these reforms further. It presents more than 40 papers on ‘Statistical inference in the 21st century: a world beyond P < 0.05’. The editors introduce the collection with the caution “don’t say ‘statistically significant’”. Another article with dozens of signatories also calls on authors and journal editors to disavow those terms. We agree, and call for the entire concept of statistical significance to be abandoned.
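A small sketch shows why a hard P < 0.05 cutoff draws this kind of fire. It assumes a two-sided z-test with a normal approximation, which is my choice for illustration and not something the quoted article specifies. The two z scores are invented, and the helper names are hypothetical.

```python
import math


def two_sided_p(z):
    """Two-sided p-value for a z score under a standard normal distribution."""
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)).
    return math.erfc(abs(z) / math.sqrt(2))


def label(p, alpha=0.05):
    """Apply the conventional yes/no cutoff that the quoted passage objects to."""
    return "significant" if p < alpha else "not significant"


# Two nearly identical (made up) results land on opposite sides of the cutoff.
for z in (1.95, 1.97):
    p = two_sided_p(z)
    print(f"z = {z:.2f}  p = {p:.4f}  {label(p)}")
```

The two results differ by a hair, yet a system that reports only the label would present them as opposites.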
What if one is using a system which bakes in statistical procedures and locks them away from users? What if those procedures are introducing errors?
Tough questions for vendors of smart software.
Stephen E Arnold, March 21, 2019
Who Is Assisting China in Its Technology Push?
March 20, 2019
I read “U.S. Firms Are Helping Build China’s Orwellian State.” The write up is interesting because it identifies companies which allegedly provide technology to the Middle Kingdom. The article also uses an interesting phrase; that is, “tech partnerships.” Please, read the original article for the names of the US companies allegedly cooperating with China.
I want to tell a story.
Several years ago, my team was asked to prepare a report for a major US university. Our task was to try and answer what I thought was a simple question when I accepted the engagement, “Why isn’t this university’s computer science program ranked in the top ten in the US?”
The answer, my team and I learned, had zero to do with faculty, courses, or the intelligence of students. The primary reason was that the university’s graduates were returning to their “home countries.” These included China, Russia, and India, among others. In one advanced course, there was no US born, US educated student.
We documented that, over a seven year period, when the undergraduates, graduate students, and post doctoral students completed their work, they had little incentive to start up companies in proximity to the university, donate to the school’s fund raising, or provide the rah rah that happy graduates often do. To see the rah rah in action, may I suggest you visit a “get together” of graduates near Stanford, at an eatery in Boston, or on NCAA elimination weekend in Las Vegas.
How could my client fix this problem? We were not able to offer a quick fix or even an easy fix. The university had institutionalized revenue from non US students and was, when we did the research, dependent on non US students. These students were very, very capable and they came to the US to learn, form friendships, and sharpen their business and technical “soft” skills. These, I assume, were skills put to use to reach out to firms where a “soft” contact could be easily initiated and brought to fruition.
Follow the threads and the money.
China has been a country eager to learn in and from the US. The identification of some US firms which work with China should not be a surprise.
However, I would suggest that Foreign Policy or another investigative entity consider a slightly different approach to the topic of China’s technical capabilities. Let me offer one example. Consider this question:
What Israeli companies provide technology to China and other countries which may have some antipathy to the US?
This line of inquiry might lead to some interesting items of information; for example, a major US company which meets on a regular basis with a counterpart with what I would characterize as “close links” to the Chinese government. One colloquial way to describe the situation is like a conduit. Digging in this field of inquiry, one can learn how the Israeli company “flows” US intelligence-related technology from the US and elsewhere through an intermediary so that certain surveillance systems in China can benefit directly from what looks like technology developed in Israel.
Net net: If one wants to understand how US technology moves from the US, the subject must be examined in terms of academic programs, admissions, policies, and connections as well as from the point of view of US company investments in technologies which received funding from Chinese sources routed through entities based in Israel. Looking at a couple of firms does not do the topic justice and indeed suggests a small scale operation.
Uighur monitoring is one thread to follow. But just one.
Stephen E Arnold, March 20, 2019
Data Visualization: Unusual and Unnecessary Terminology
March 19, 2019
I read “5 Reasons Why Data Visualization Fails.” Most of the information in the write up applies to a great many visualizations. I have seen some pretty crazy graphs in my 50 year career. A few stand out. The Autonomy heat maps. Wild and crazy radar maps. Multi axis charts which are often incomprehensible.
The problem is that point and click options present data. The “analyst” often picks a graph that keeps a general, a partner in a venture firm, or a group of rubes entranced.
The article touches upon other issues ranging from a failure to think about the audience to presenting complex visualizations.
I do have one major objection to the article. From my point of view, the phrases “data overload” or “large flows of information” express the concept of having a great deal of information. The article uses the phrase “data puking.” The phrase is unnecessary and off putting to me.
Stephen E Arnold, March 19, 2019
A Justification of Making Things Up?
March 13, 2019
I read “Gut Feelings Often Trump Real Data in Driving Business Decisions, Says Forrester.” The write up is interesting for several reasons. First, Forrester, like other mid tier consulting firms, generates reports about companies with more subjective than objective data. Examples of the objective data in short supply range from pricing data and information from customers about the product or service offered by a company to concrete information about management compensation, financial performance, and similar data. The metaphor of a wave is compelling, but data within would be helpful.
Second, the notion of “real data” underscores that talk about data is often just that: chatter, jargon, baloney. “Real data” are difficult to obtain. For example, a company provides a system which tracks and indexes content in the “hidden Web.” What’s the benchmark? How much data are tracked? How much are not indexable? Other questions like these can be answered, but time and money are one hurdle. The real obstacle is that no one wants to make the effort to get data which can be analyzed and then evaluated in head to head comparisons. “Real data,” such as information spewed from financial analysis spreadsheets, are not examined with care. Dig in and the numbers can wobble. Did a scrutinized company actually cut expenses, or does the spreadsheet report that data in bucket A went away and data in bucket B became larger?
Third, the write up itself emphasizes that visualization, not grubby numbers, is where the action is. The future of analysis may be an anigif showing the harried decision maker what he or she needs to know. Who has time to work through data by hand, then compare those data to other information from other sources?
Quite a write up. Interesting implications. Subjective analysis washes away facts in my experience.
Stephen E Arnold, March 13, 2019
Good News about Big Data and AI: Not Likely
February 25, 2019
I read a write up which was a bit of a downer. The story appeared in Analytics India and was titled “10 Challenges That Data Science Industry Still Faces.” Oh, oh. Maybe not good news?
My first thought was, “Only 10?”
The write up explains that the number one challenge is humans. The idea was that smart software would solve these types of problems: Sluggish workers at fast food restaurants, fascinating decisions made by entry level workers in some government bureaus, and the often remarkable statements offered by talking heads on US cable TV “real news” programs, among others.
Nope. The number one challenge is finding humans who can do data science work.
What’s number two after this somewhat thorny problem? The answer is finding the “right data” and then getting a chunk of data one can actually process.
So one and two are what I would call bedrock issues: Expertise and information.
What about the other eight challenges? Here are three of them. I urge you to read the original article for the other five issues.
- Informing people why data science and its related operations are good for you. Is this similar to convincing a three year old that lima beans are just super?
- Storytelling. I think this means, “These data mean…” One hopes the humans (who are in short supply) draw the correct inferences. One hopes.
- Models. This is a shorthand way of saying, “What’s assembled will work.” Hopefully the answer is, “Sure, our models are great.”
Analytics India has taken a risk with their write up. None of the data science acolytes want to hear “bad news.”
Let’s federate and analyze that with great data we can select to generate a useful output. Maybe 80 percent “accuracy” on a good day?
Stephen E Arnold, February 25, 2019
Gartner Does the Gartner Thing: Mystical Augmented Analytics
February 19, 2019
Okay, okay, Gartner is a contender for the title of Crazy Jargon Creator 2019.
I read “Gartner: Augmented Analytics Ready for Prime Time.” Yep, if Datanami says so, it must be true.
Here’s the line up of companies allegedly in this market. I put the companies in alphabetical order with the Gartner objective, really really accurate BCG inspired quadrant “score” after each company’s name. Ready, set, go!
BOARD International—niche player
Birst—niche player
Domo—niche player
GoodData—niche player
IBM—niche player
Information Builders—niche player
Logi Analytics—niche player
Looker—niche player
MicroStrategy—challenger
Microsoft—leader
Oracle—niche player
Pyramid Analytics—niche player
Qlik—leader
SAP—visionary
SAS—visionary
Salesforce—visionary
Sisense—visionary
TIBCO Software—visionary
Tableau—leader
ThoughtSpot—leader
Yellowfin—niche player
Do some of these companies and their characterization (sorry, I meant really really objective inclusion) strike you as peculiar? What about the mixing of small outfits with big ones like IBM, which had been doing Fancy Dan analytics for decades before it acquired i2 Ltd.’s Analyst’s Notebook? I also find the inclusion of SAS a bit orthogonal with the omission of IBM’s SPSS, but IBM is a niche player.
That’s why Gartner is the jargon leader at this point in 2019, but who knows? Maybe another consulting firm beating the bushes for customers will take the lead. The year is still young.
Stephen E Arnold, February 19, 2019
Analytic Hubs: Now You Know
January 30, 2019
Gartner Group has crafted a new niche. I learned about analytic hubs in Datanami. The idea is that a DMSA or data management solution for analytics is now a thing. Odd. I thought that companies have been providing data analytics hubs for a number of years. Oh, well, whatever sells.
The DMSA vendor list in “What Gartner Sees in Analytic Hubs” is interesting. Plus the write up includes one of the objective, math based, deeply considered Boston Consulting Group quadrants which make some ideas so darned fascinating. I mean Google. An analytics hub?
Based on information in the write up, here are the vendors who are the movers and shakers in analytic hubs:
Alibaba Cloud
Amazon Web Services
Arm
Cloudera
GBase
Hortonworks
Huawei
IBM
MapR Technologies
MarkLogic
Micro Focus
Microsoft
Neo4j
Oracle
Pivotal
SAP
Snowflake
Teradata
This is an interesting list. It seems the “consultants” at Gartner had lunch and generated a list with names big and small, known and unknown.
I noted the presence of Amazon, which is reasonable. I was surprised that the reference to Oracle did not include its stake in a vendor which actually delivers the “hubby” functions to which the write up alludes. The inclusion of MarkLogic was interesting because that company is a search system, an XML database, and an annoyance to Oracle. IBM is fascinating, but which “analytic hub” technology Gartner is considering is unknown to me. One has to admire the inclusion of Snowflake and MapR Technologies.
I suppose the analysis will fuel a conference, briefings, and consulting revenue.
Will the list clarify the notion of an analytics hub?
Yeah, that’s another issue. It’s Snowflake without the snow.
Stephen E Arnold, January 30, 2019