Will More Big Data Make AI Deliver Results?
April 6, 2020
Many companies have issued news releases about their coronavirus research support. Personally I find the majority of these “real news” announcements low ball marketing at its finest. The coronavirus problem is indeed serious, and researchers, art history majors, and MBA executives who hop on the “We are helping” bandwagon are amusing.
I read a 3,500 word ZDNet article titled:
AI Runs Smack Up Against a Big Data Problem in COVID-19 Diagnosis. Researchers around the world have quickly pulled together combinations of neural networks that show real promise in diagnosing COVID-19 from chest X-rays and CT scans. But a lack of data is hampering the ability of many efforts to move forward. Some kind of global data sharing may be the answer.
Now that’s an SEO inspired title, but the write up makes one amazing assertion: More data will allow medical AI systems to output actionable information.
If I ran through the litany of medical AI revolutions, my fingers would get tired clicking and mousing. The IBM Watson silliness is a good example, and it encapsulates the problem of using collections of numerical recipes to help physicians deal with cancer. Google has not made much, if any, progress on solving death. Remember that “hard problem”? Pushing deeper into the past, there was NuTech Solutions’ ability to identify individuals likely to get diabetes based on sparse data and ant algorithms.
How did these companies’ efforts work out?
Failures from my point of view.
The write up runs down a number of research efforts. Companies like DarwinAI are mentioned. There are quotes which provide guidance to organizations challenged to find the snack room; for example:
“I think it would help if the WHO made a central database with de-identifying mechanisms, and some really good encryption,” said Dr. Luccioni. “That way, local health authorities would be reassured and motivated to share their data with each other.”
The problem is that smart software is mostly an implementation of methods known, in some cases, for hundreds of years. These smart systems use recursion, feedback loops, and statistical procedures to output statistically valid (probable) information.
How are these systems working? There are data, but they are conflicting, disorganized, and inconsistent. News flash. That’s how information is. There is zero evidence that more data can be verified, normalized, and processed in near real time to allow smart software to demonstrate it can do more than generate marketing collateral.
The companies pitching their artificial intelligence should articulate the reality of the outputs their workflows of algorithms can actually generate.
That might help more than the craziness of wanting data to be better, having some magic wand to normalize the messy real world of information, and converting what are mostly graduate school projects into something useful beyond speeding up some lab tests and getting a “real” job.
Will this happen? Not for a long time. Data are not the problem. Humans are the problem because the idea of creating a consistent, verified repository of on point data has not been achieved for small domains of content. Forget global data.
Don’t believe me? Check out any online system. Run some queries. Is “everything” in that system or federated system? What about a small collection of data; for example, the data on your mobile? What’s there that you can access? “What?” you ask. Yeah, the high value data are sucked away and those data are not shared with “everyone” including you who created the data in the first place.
Smart software performs some useful functions. Will data make Bayesian methods or patented techniques like those from Qure.ai “solve” Covid?
Hard in reality. Easy in ZDNet articles. Even easier for marketers. And the patients suffering? What? Who? Where?
Exactly.
Stephen E Arnold, April 5, 2020
GeoSpark Analytics: Real Time Analytics
April 6, 2020
In late 2017, OGSystems chopped out some of the firm’s analytics capabilities. The new company was Geospark Analytics. The service enabled customers like the US Department of Defense and FEMA to obtain information about important new events. “Events” is jargon for an alert plus data about something that is important.
“FEMA Contractor Tracing Coronavirus Deaths Uses Web Scraping, Social Media Monitoring” explains one use of the system. The write up says:
Geospark Analytics combines machine learning and big data to analyze events in real-time and warn of potential disruptions to the businesses of high-dollar private and public clientele…
Like Bluedot in Canada, Geospark was one of the monitoring companies analyzing open source and some specialized data to find interesting events. The write up continues:
Geospark Analytics’ product, called Hyperion, the namesake of the Titan son of Uranus (meaning, “watcher from above”), fingered Wuhan as a “hotspot,” in the company’s parlance, within hours after news of the virus first broke. “Hotspots tracks normal patterns of activity across the globe and provides a visual cue to flag disruptive events that could impact your employees, operations, and investments and result in billions of dollars in economic losses,” the company’s website says.
Engadget points out that there are a couple of companies with the name “Geospark.” DarkCyber finds this interesting. This statement provides more color about the Geospark approach:
Geospark Analytics claims to have processed “6.8 million” sources of information; everything from tweets to economic reports. “We geo-position it, we use natural language processing, and we have deep learning models that categorize the data into event and health models,” Goolgasian [Geospark’s CEO] said. It’s through these many millions of data points that the company creates what it calls a “baseline level of activity” for specific regions, such as Wuhan. A spike of activity around any number of security-, military-, or health-related topics and the system flags it as a potential disruption.
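For readers who want a concrete picture of what a “baseline level of activity” and a “spike” might mean, here is a minimal Python sketch. It is not Geospark’s method, which the article does not describe in detail. The function name flag_spikes, the seven day window, the threshold of three standard deviations, and the sample counts are all assumptions made up for illustration.

```python
from statistics import mean, stdev


def flag_spikes(daily_counts, window=7, threshold=3.0):
    """Return indexes of days whose count sits far above the trailing baseline.

    Hypothetical illustration: the baseline is the mean of the previous
    `window` days, and a day is flagged when it is more than `threshold`
    standard deviations above that mean.
    """
    flagged = []
    for i in range(window, len(daily_counts)):
        history = daily_counts[i - window:i]
        baseline = mean(history)
        spread = stdev(history)
        if spread == 0:
            # Perfectly flat history: any increase counts as a spike.
            if daily_counts[i] > baseline:
                flagged.append(i)
            continue
        if (daily_counts[i] - baseline) / spread > threshold:
            flagged.append(i)
    return flagged


# Made-up daily counts of health-related mentions for one region.
mentions = [12, 14, 11, 13, 12, 15, 13, 14, 12, 90, 13]
print(flag_spikes(mentions))  # prints [9]
```

Note what the sketch leaves out: nothing here verifies, deduplicates, or normalizes the incoming mentions. A flood of noise or disinformation would trip the same flag as a real event.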
How does Geospark avoid the social media noise, bias, and disinformation that finds its way into open source content? The article states:
“We rely more on traditional data sources and we don’t do anything that isn’t publicly available,” Goolgasian said, echoing a common refrain among data firms that fuel surveillance products by mining the internet itself.
Providing specialized services to government agencies is not much of a surprise in DarkCyber’s opinion. Financial firms can also be avid consumers of real-time data. The idea is to get the jump on the competition which probably has its own source of digital insights.
Other observations:
- The apparent “surprise” threading through the Engadget article is a bit off putting. DarkCyber is aware of a number of social media and specialized content monitoring services. In fact, there is a surplus of these operations and not all will survive in the present business climate.
- Detecting and alerting are helpful but the messengers failed to achieve impact. How does DarkCyber know? Well, there is the lockdown.
- Publicizing what companies like Geospark and others do to generate income can have interesting consequences.
Net net: Some types of specialized services are difficult to explain in a way that reduces blowback. Some of the blowback has a significant impact on social media analytics companies. The Geofeedia case is a reminder. I know. I know. “What’s a Geofeedia?” some may ask.
Good question and DarkCyber thinks few know the answer. Plucking insights from information many people believe to be privileged can be fraught with business shock waves.
Stephen E Arnold, April 6, 2020
AWS Data Marketplace Trundles Along
April 6, 2020
Interesting story appeared in Channel Life. “AWS Marketplace & Data Exchange Open to A/NZ Channel Partners” reports that “global customers are able to purchase directly from A/NZ providers through the two AWS platforms.”
DarkCyber noted this statement in the write up:
“The development and adoption of Insurtech into insurance businesses places a greater focus on customer centricity, real time reporting and improved business efficiencies,” comments JAVLN chief executive officer Dale Smith. “Through AWS Marketplace and with the support of members of the AWS Partner Network, customers will now have access to JAVLN’s insurance software platform and services internationally.”
Data and services are available. “Data” is more interesting than services alone in DarkCyber’s view.
Why will companies and Australian government agencies gravitate to AWS? Probably the same reason Middle Eastern countries are using the platform framework:
Farrago AI founder and CEO Asa Cox says AWS Marketplace is a key part of the company’s regional reach for predictive analytics and machine learning products. “The ease of purchase, integration, and consumption makes the AWS Marketplace a no-brainer for existing customers to find new value added products from the partner community.”
Interesting.
Stephen E Arnold, April 6, 2020
Techspert: Search and Experts
April 6, 2020
“How Our AI Search Technology Finds Experts Others Can’t” provides a crunchy description about an application of artificial intelligence. Techspert.io provides a diagram of its approach:
The idea is that the approach operates with pinpoint precision. Then a semantic search engine is used to identify context. The old school lingo was Endeca’s Guided Search or maybe side search. Then a social graph is generated. That’s a relationship map like those used by i2 Ltd’s Analyst’s Notebook in the early 1990s. The i2 Ltd outfit had some Cambridge grads on its team. Finally the system can identify candidates.
What’s interesting is that the pinpoint angle appears to focus on a narrow domain; that is, individuals in STM with a focus on the M (medicine, biotechnology, etc.). This approach reduces the difficulty of indexing for any business or technical discipline. Focus means that descriptive terms are narrower than general business lingo. Second, the crawling for specialized personnel becomes somewhat easier because many sites can be ignored because they are not related to medicine and related fields; for example, the garden gnome site www.designsoscano.com. Plus, the social graph complexity can be reduced by applying qualifiers that NOT out individuals and other entities unrelated to the focus of Techspert.io; for example, David Drummond and Jennifer Blakely.
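The two narrowing tricks described above, skipping off-topic sites and NOT-ing out unrelated entities from the social graph, can be sketched in a few lines of Python. This is an illustration only, not Techspert.io’s implementation. The function names, the tiny vocabulary, and the placeholder people are hypothetical.

```python
def on_topic(text, vocabulary):
    """True when a page mentions at least one term from the narrow vocabulary."""
    words = set(text.lower().split())
    return bool(words & vocabulary)


def prune_graph(edges, excluded):
    """Drop every relationship that touches an excluded (NOT-ed out) entity."""
    return [(a, b) for a, b in edges if a not in excluded and b not in excluded]


# Hypothetical narrow vocabulary for the "M" in STM.
medical_terms = {"oncology", "biotechnology", "immunology"}

# Hypothetical relationship pairs pulled from crawled pages.
edges = [
    ("Researcher A", "Researcher B"),
    ("Researcher B", "Executive X"),
    ("Researcher A", "Researcher C"),
]

print(on_topic("Garden gnome designs for sale", medical_terms))
print(on_topic("New oncology trial results", medical_terms))
print(prune_graph(edges, excluded={"Executive X"}))
```

The filters are cheap because the domain is narrow. Widen the vocabulary to general business lingo and both the term list and the exclusion list balloon, which is where the expensive engineering comes in.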
Several observations are warranted:
- The implemented method is useful when deployed in a focused way; that is, vertical search for different “terminologies”.
- Scaling the approach across different content domains may require innovative engineering. And the engineering solutions will be expensive to implement, update, and enhance.
- Generating market magnetism will require effective marketing and sales programs. Business development must generate sufficient revenue because once certain hires are made by a company, the recruiting service is put on ice; and sustainable revenues will have to come from recruiting services which offer lower costs, perquisites to customers, etc. These factors may inhibit some venture cash investments.
Worth monitoring this firm. A pivot may be necessary due to the uncertain economic environment.
Stephen E Arnold, April 6, 2020
Wolfcom, Body Cameras, and Facial Recognition
April 5, 2020
Facial recognition is a controversial topic and is becoming more so as the technology advances. Top weapons and security companies will not go near facial recognition software due to the can of worms it would open. Law enforcement agencies want these companies to add it. Wolfcom is actually adding facial recognition to its cameras. Techdirt has the scoop on the story, “Wolfcom Decides It Wants To Be The First US Body Cam Company To Add Facial Tech To Its Products.”
Wolfcom makes body cameras for law enforcement and wants to add facial recognition technology to its products. Currently Wolfcom is developing facial recognition for its newest body cam, Halo. Around one thousand five hundred police departments have purchased Wolfcom’s body cameras.
If Wolfcom is successful with its facial recognition development, it would be the first company to have body cameras that use the technology. The technology is still in development according to Wolfcom’s marketing. Right now, their facial recognition technology rests on taking individuals’ photos, then matching them against a database. The specific database is not mentioned.
Wolfcom obviously wants to be an industry leader, but it is also being careful about not making false promises or drumming up bad advertising:
“About the only thing Wolfcom is doing right is not promising sky high accuracy rate for its unproven product when pitching it to government agencies. That’s the end of the “good” list. Agencies who have been asked to beta test the “live” facial recognition AI are being given free passes to use the software in the future, when (or if) it actually goes live. Right now, Wolfcom’s offering bears some resemblance to Clearview’s: an app-based search function that taps into whatever databases the company has access to. Except in this case, even less is known about the databases Wolfcom uses or if it’s using its own algorithm or simply licensing one from another purveyor.”
Wolfcom could eventually offer realtime facial recognition technology and that could affect some competitors.
Whitney Grace, April 5, 2020
Cambridge Analytica Alum: Social Media Is Like Bad, You Know
April 4, 2020
A voice of (in)experience describes how tech companies can be dangerous when left unchecked. Channel News Asia reports, “Tech Must Be Regulated Like Tobacco, says Cambridge Analytica Whistleblower.” Christopher Wylie is the data scientist who exposed Cambridge Analytica’s use of Facebook data to manipulate the 2016 presidential election, among others. He declares society has yet to learn the lesson of that scandal. Yes, Facebook was fined a substantial sum, but it and other tech giants continue to operate with little to no oversight. The article states:
“Wylie details in his book how personality profiles mined from Facebook were weaponised to ‘radicalise’ individuals through psychographic profiling and targeting techniques. So great is their potential power over society and people’s lives that tech professionals need to be subject to the same codes of ethics as doctors and lawyers, he told AFP as his book was published in France. ‘Profiling work that we were doing to look at who was most vulnerable to being radicalised … was used to identify people in the US who were susceptible to radicalisation so that they could be encouraged and catalysed on that path,’ he said. ‘You are being intentionally monitored so that your unique biases, your anxieties, your weaknesses, your needs, your desires can be quantified in such a way that a company can seek to exploit that for profit,’ said the 30-year-old. Wylie, who blew the whistle to British newspaper, The Guardian, in Mar 2018, said at least people now realise how powerful data can be.”
As in any industry, tech companies are made up of humans, some of whom are willing to put money over morality. And as in other consequential industries like construction, engineering, medicine, and law, Wylie argues, regulations are required to protect consumers from that which they do not understand.
Cynthia Murrell, April 4, 2020
NSO: Back in the News Again
April 3, 2020
Let’s assume that the Beeb is on the money. “Coronavirus: Israeli Spyware Firm Pitches to Be Covid 19 Saviors” is a bit of British snark. First, the word “coronavirus” is newsy, and it is clickbait. Second, “Israeli spyware pitches” converts the use of specialized software into a carnival barker’s shout. (One might ask, “Why?” I think I know the answer. The British Cervantes is on the gallop perhaps?)
The point of the story which contains some loaded words like “controversial” is that NSO has technology which can assist governments in gathering useful information about the virus. The write up states after the Beeb explains that Facebook and NSO are in a legal wrestling match:
NSO says its employees will not have access to any data, but its software will work best if a government asks local mobile phone operators to provide the records of every subscriber in the country. Each person known to be infected with Covid-19 could then be tracked, with the people they had met and the places they had visited, even before showing symptoms, plotted on a map.
Scary, ominous, Orwellian, something that British government agencies would never, ever in a million years consider.
The reality is that monitoring a population is happening in quite a few countries. Perhaps even merrie olde Land of the Angles?
A news story is okay. Shading the coverage to advance the agenda “NSO is just not such a fine piece of British wool” is unsettling — possibly more so than specialized service firms’ software.
Stephen E Arnold, April 3, 2020
Google: Ever Amusing, Ever Innovative
April 3, 2020
Quick note about two Google services. (If you are a Googler, jump to another project, preferably one with traction.)
First, Google innovators are going to duplicate to some degree the TikTok approach to video. Hopefully the me too service will lack some of the Chinese craftsmanship. The world does need more 30 second videos, just with pre roll, in video, and post roll advertising. Why didn’t Google think of this ad inventory burner quicker? Right, “think.”
Second, Google has learned (finally) that the Nextdoor.com approach is tough to implement in India. (Google’s service is neighborly for a few months more.) The Google announced that it will keep its competitive service Neighbourly alive until October 2020, probably with a lone intern keeping the lights on. Like enterprise search, Neighbourly learned that some services require a bit more than Google imagining sales and sustainable revenue. (Hey, those payoffs require work by someone.)
To sum up: One innovation arrives; another departs. So life goes for the Google. I was taught to spell neighborly without the stiff upper lip “u.” I know, I know. I am not neighborly and I don’t get the short video thing.
Stephen E Arnold, April 3, 2020
About Those Cloud Services?
April 3, 2020
Okay, Amazon can’t deliver. Microsoft can’t scale. Now Google Cloud Engine just falls over. What were the techno experts saying about those cloud services?
Navigate to “Google Cloud Engine Outage Caused by Large Backlog of Queued Mutations.” The article reports:
A 14-hour Google cloud platform outage that we missed in the shadow of last week’s G Suite outage was caused by a failure to scale, an internal investigation has shown.
But why?
The outage was caused by a lack of memory in the company’s cache servers…
To simplify. Google’s smart scaling failed. Does this mean that Google and Microsoft are more alike than different? If Amazon can’t deliver, does this mean Google cannot deliver?
About those cloud services powering decision making? Well, sort of.
Stephen E Arnold, April 3, 2020
A Cheerleading Routine for AI
April 3, 2020
We have come across a good example of cheerleading with minimal facts. Rah rah for AI, cries the SmartData Collective in their write-up, “Experts Debunk the Biggest Myths About AI in Business.” Writer Sean Mallon begins by noting how fast the AI market is growing, which is indeed to be expected given recent developments (and hype). He declares the growth is due to businesses that comprehend how powerful a tool AI is. He writes:
“Companies are now increasing the adoption of this technology in a range of different industries, which covers diverse sectors such as healthcare, finance, marketing and more. Through the incorporation of AI, industries have seen major shifts in how they run. While the true potential of AI is now being recognized by businesses from all different sectors, many myths have floated around causing skepticism and unnecessary fear over this transformative technology. If AI is to reach its true potential in businesses across all industries, it’s important to explore, and further debunk, these common misconceptions.”
The piece magnanimously helps any reluctant companies see the light by deflating these “myths:” that AI steals jobs, that AI is hard to integrate, and, most dastardly, that AI may be unnecessary. On that last point, Mallon asserts:
“This is perhaps one of the biggest myths currently circulating around industries today, limiting businesses from unlocking their true potential. AI technology is increasingly becoming a part of daily life, especially in the business sector, boosting its productivity and furthering its growth and success. Companies everywhere are using AI to gain a competitive advantage, helping their business to work smarter and faster than those around them.”
For some, I’m sure that is the case; for others, not so much. Business is just too complex for such absolutes. As always, the best bet is to ignore the hype, know your organization’s needs and the capabilities of available software, and mix and match accordingly.
Cynthia Murrell, April 3, 2020