Smart Software: What Is Wrong?

April 8, 2020

We have the Google not solving death. We have the IBM Watson thing losing its parking spot at a Houston cancer center. We have a Department of Justice study reporting issues with predictive analytics. And the supercomputers and their smart software have not delivered a solution to the coronavirus problem. Yep. What’s up?

“Data Science: Reality Doesn’t Meet Expectations” explains some of the reasons. DarkCyber recommends this write up. The article provides seven reasons why the marketing fluff generated by former art history majors for “bros” of different ilk is not delivering; to wit:

  1. People don’t know what “data science” does.
  2. Data science leadership is sorely lacking.
  3. Data science can’t always be built to specs.
  4. You’re likely the only “data person.”
  5. Your impact is tough to measure: data doesn’t always translate to value.
  6. Data & infrastructure have serious quality problems.
  7. Data work can be profoundly unethical. Moral courage required.

DarkCyber has nothing to add.

Stephen E Arnold, April 8, 2020

Will More Big Data Make AI Deliver Results

April 6, 2020

Many companies have issued news releases about their coronavirus research support. Personally I find the majority of these “real news” announcements low ball marketing at its finest. The coronavirus problem is indeed serious, and researchers, art history majors, and MBA executives who hop on the “We are helping” bandwagon are amusing.

I read a 3,500 word ZDNet article titled:

AI Runs Smack Up Against a Big Data Problem in COVID-19 Diagnosis. Researchers around the world have quickly pulled together combinations of neural networks that show real promise in diagnosing COVID-19 from chest X-rays and CT scans. But a lack of data is hampering the ability of many efforts to move forward. Some kind of global data sharing may be the answer.

Now that’s an SEO inspired title, but the write up makes one amazing assertion: More data will allow medical AI systems to output actionable information.

If I ran through the litany of medical AI revolutions, my fingers would get tired clicking and mousing. The IBM Watson silliness is a good example, and it encapsulates the problem of using collections of numerical recipes to help physicians deal with cancer. Google has not made much, if any, progress on solving death. Remember that “hard problem”? Pushing deeper into the past, there was NuTech Solutions’ ability to identify individuals likely to get diabetes based on sparse data and ant algorithms.

How did these companies’ efforts work out?

Failures from my point of view.

The write up runs down a number of research efforts. Companies like DarwinAI are mentioned. There are quotes which provide guidance to organizations challenged to find the snack room; for example:

“I think it would help if the WHO made a central database with de-identifying mechanisms, and some really good encryption,” said Dr. Luccioni. “That way, local health authorities would be reassured and motivated to share their data with each other.”

The problem is that smart software is mostly an implementation of methods known, in some cases, for hundreds of years. These smart systems use recursion, feedback loops, and statistical procedures to output statistically valid (probable) information.

How are these systems working? There are data, but they are conflicting, disorganized, and inconsistent. News flash. That’s how information is. There is zero evidence that more data can be verified, normalized, and processed in near real time to allow smart software to demonstrate it can do more than generate marketing collateral.

The companies pitching their artificial intelligence should articulate the reality of the outputs their workflows of algorithms can actually generate.

That might help more than the craziness of wanting data to be better, having some magic wand to normalize the messy real world of information, and converting what are mostly graduate school projects into something useful beyond speeding up some lab tests and getting a “real” job.

Will this happen? Not for a long time. Data are not the problem. Humans are the problem because the idea of creating a consistent, verified repository of on point data has not been achieved for small domains of content. Forget global data.

Don’t believe me? Check out any online system. Run some queries. Is “everything” in that system or federated system? What about a small collection of data; for example, the data on your mobile? What’s there that you can access? “What?” you ask. Yeah, the high value data are sucked away, and those data are not shared with “everyone,” including you who created the data in the first place.

Smart software performs some useful functions. Will data make Bayesian methods or patented techniques like those from Qure.ai “solve” Covid?

Hard in reality. Easy in ZDNet articles. Even easier for marketers. And the patients suffering? What? Who? Where?

Exactly.

Stephen E Arnold, April 5, 2020

Techspert: Search and Experts

April 6, 2020

“How Our AI Search Technology Finds Experts Others Can’t” provides a crunchy description of an application of artificial intelligence. Techspert.io provides a diagram of its approach:

[Image: Techspert.io diagram of its approach]

The idea is that the approach operates with pinpoint precision. Then a semantic search engine is used to identify context. The old school lingo was Endeca’s Guided Search or maybe side search. Then a social graph is generated. That’s a relationship map like those used by i2 Ltd’s Analyst’s Notebook in the early 1990s. The i2 Ltd outfit had some Cambridge grads on its team. Finally the system can identify candidates.

What’s interesting is that the pinpoint angle appears to focus on a narrow domain; that is, individuals in STM with a focus on the M (medicine, biotechnology, etc.). This approach reduces the difficulty of indexing for any business or technical discipline. Focus means that descriptive terms are narrower than general business lingo. Second, the crawling for specialized personnel becomes somewhat easier because many sites can be ignored because they are not related to medicine and related fields; for example, the garden gnome site www.designsoscano.com. Plus, the social graph complexity can be reduced by applying qualifiers that NOT out individuals and other entities unrelated to the focus of Techspert.io; for example, David Drummond and Jennifer Blakely.

Several observations are warranted:

  1. The implemented method is useful when deployed in a focused way; that is, vertical search for different “terminologies”.
  2. Scaling the approach across different content domains may require innovative engineering. And the engineering solutions will be expensive to implement, update, and enhance.
  3. Generating market magnetism will require effective marketing and sales programs. Business development must generate sufficient revenue because once certain hires are made by a company, the recruiting service is put on ice; and sustainable revenues will have to come from recruiting services which offer lower costs, perquisites to customers, etc. These factors may inhibit some venture cash investments.

Worth monitoring this firm. A pivot may be necessary due to the uncertain economic environment.

Stephen E Arnold, April 6, 2020

A Cheerleading Routine for AI

April 3, 2020

We have come across a good example of cheerleading with minimal facts. Rah rah for AI, cries the SmartData Collective in their write-up, “Experts Debunk the Biggest Myths About AI in Business.” Writer Sean Mallon begins by noting how fast the AI market is growing, which is indeed to be expected given recent developments (and hype). He declares the growth is due to businesses that comprehend how powerful a tool AI is. He writes:

“Companies are now increasing the adoption of this technology in a range of different industries, which covers diverse sectors such as healthcare, finance, marketing and more. Through the incorporation of AI, industries have seen major shifts in how they run. While the true potential of AI is now being recognized by businesses from all different sectors, many myths have floated around causing skepticism and unnecessary fear over this transformative technology. If AI is to reach its true potential in businesses across all industries, it’s important to explore, and further debunk, these common misconceptions.”

The piece magnanimously helps any reluctant companies see the light by deflating these “myths:” that AI steals jobs, that AI is hard to integrate, and, most dastardly, that AI may be unnecessary. On that last point, Mallon asserts:

“This is perhaps one of the biggest myths currently circulating around industries today, limiting businesses from unlocking their true potential. AI technology is increasingly becoming a part of daily life, especially in the business sector, boosting its productivity and furthering its growth and success. Companies everywhere are using AI to gain a competitive advantage, helping their business to work smarter and faster than those around them.”

For some, I’m sure that is the case; for others, not so much. Business is just too complex for such absolutes. As always, the best bet is to ignore the hype, know your organization’s needs and the capabilities of available software, and mix and match accordingly.

Cynthia Murrell, April 3, 2020

Virtual China: Beefing Up

April 3, 2020

I want to keep this brief. “Tencent to Build AI Supercomputing Center, Industrial Base in Shanghai.” So what’s new? The write up states:

The internet titan and the city’s Songjiang district government signed a deal today to deepen collaboration in areas such as AI…

DarkCyber noted this checklist:

The center will undertake various large-scale AI algorithm calculations, machine learning, image processing, and scientific and engineering computing tasks based on Tencent’s AI capabilities, and provide cloud computing services to the whole of society with data processing and storage capabilities…

Edge computing? Smart manufacturing? Intercept and data analytics?

Check, check, check.

Stephen E Arnold, April 3, 2020

Nervous about AI? Google Uses It and You Do Too

April 2, 2020

Despite the deployment of smart speakers, virtual assistants, language translation automation, and many other technologies we use every day, AI still feels like a future innovation. We are probably stuck on the idea that AI means walking, talking robots, but AI, in fact, is already part of our daily lives. Techni Pages wrote, “5 Uses Of Advanced AI Already Being Used By Google” to demonstrate how AI is currently being used.

Have you ever sent a text message using the voice-to-text feature on your mobile phone? Surprise, that is a form of AI! Human language is very complex, and in order for machines to understand it, Google uses Deep Neural Networks to model language sounds. Current endeavors have designed voice-to-text to be faster, filter out more noise, and be more accurate.

Google Maps is another huge AI project. Powered by real time predictions, Google Maps delivers the fastest route to destinations. It takes into consideration accidents, traffic, and construction so users can avoid those hindrances. The Google Assistant is another AI tool that acts as your own personal assistant to perform Internet searches, schedule appointments, set reminders, and make simple notes. Gmail also uses AI to categorize emails and filter spam from your inbox.

Google offers the Cloud AutoML too:

“The Cloud AutoML is an advanced AI that helps developers to create other AI smart solutions. The machine learning models are of high quality and enable developers to create AI that suits their business needs. Cloud AutoML has state-of-the-art performance and also enables the machine learning to happen with minimal effort since it uses neural architecture search technology and transfer learning.”

Google is an industry leader in developing innovative AI tools. The AI tools we use might not be robots, but they are very helpful.

Whitney Grace, April 2, 2020

Duh Report: Smart Software Creates Change

March 28, 2020

Another report from the edge of the obvious:

New technology changes lives. Duh.

Not exactly a news flash. But some are surprised. Al Jazeera explores how artificial intelligence is changing modern society in the article, “Dataland: The Evolution Of Artificial Intelligence And Big Data.”

Classic science fiction generally takes an analog approach to futuristic technology as the concept of a digital landscape was not in the human scope. Our digital data is as identifiable as our fingerprints and different organizations use it to track us. In democracies, it is mostly used to sell products with targeted ads, while authoritarian governments use it to track their citizens’ locations and habits.

Dataland is a documentary that tracks how AI is used in different countries:

“Dataland illustrates the different facets of big data and artificial intelligence being unleashed by the world’s most prolific data scientists. The film goes to Dublin where artificial intelligence is becoming an increasing influence on community life; to Finland where citizens transmit their DNA to improve public health and predictive medicine; and finally to China where facial recognition is routinely used by the state to track the movement, habits and private lives of common people.”

It is inspiring and startling to see how different societies use AI. We can only imagine how AI will be used next; then the technologists will make it a reality. It is only a matter of time (years or decades?) before AI is as common as mobile devices.

Interesting source too.

Whitney Grace, March 27, 2020

Contact Tracing: A Tradecraft Component Released as Open Source Software

March 25, 2020

DarkCyber does not want to beat the drum about keeping some information from finding its way into general circulation. We want to point to “Singapore Government to Make Its Contact Tracing App Freely Available to Developers Worldwide.” The article states:

the Government [of Singapore] will be making the software for its contact-tracing application TraceTogether, which has already been installed by more than 620,000 people, freely available to developers around the world.

With the code in open source, those with some technical skill can develop, enhance, expand, and implement some of the features of TraceTogether.

The article points out:

the TraceTogether app can identify people who have been within 2m of coronavirus patients for at least 30 minutes, using wireless Bluetooth technology.
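The rule quoted above can be sketched in a few lines of Python. This is a rough illustration of the stated thresholds only, not TraceTogether's code: the `Encounter` record, the `close_contacts` function, and the assumption that a distance estimate has already been derived from the Bluetooth signal are all hypothetical. The sketch also treats each encounter on its own and does not add up repeated short meetings, which is another assumption.

```python
from dataclasses import dataclass

# Thresholds taken from the article's description.
MAX_DISTANCE_M = 2.0
MIN_DURATION_MIN = 30.0


@dataclass(frozen=True)
class Encounter:
    """One logged meeting between a patient's phone and another phone.

    All fields are hypothetical; the article does not describe the log format.
    """
    peer_id: str         # anonymized identifier of the other device
    distance_m: float    # assumed to be estimated elsewhere from the Bluetooth signal
    duration_min: float  # length of the encounter in minutes


def close_contacts(patient_log):
    """Return the sorted peer IDs that meet both thresholds in a single encounter."""
    flagged = {
        e.peer_id
        for e in patient_log
        if e.distance_m <= MAX_DISTANCE_M and e.duration_min >= MIN_DURATION_MIN
    }
    return sorted(flagged)


log = [
    Encounter("a", 1.5, 45.0),   # close and long enough
    Encounter("b", 1.0, 10.0),   # close but too brief
    Encounter("c", 5.0, 120.0),  # long but too far away
    Encounter("d", 2.0, 30.0),   # exactly on both thresholds
]
print(close_contacts(log))
```

Whether "within 2m" and "at least 30 minutes" include the boundary values is a choice made here for the sketch; the article does not say.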

The article includes a how to graphic. The method revealed in the diagram, in the opinion of DarkCyber, seems similar to specialized tools available but in close hold mode for a number of years.

DarkCyber chooses to let the article speak for itself and you, gentle reader, to formulate your own upsides and downsides to the information disclosed by the Straits Times.

Stephen E Arnold, March 25, 2020

Poisoning Smart Software: More Than Sparkly Sunglasses

March 22, 2020

DarkCyber noted “FYI: You Can Trick Image-Recog AI into, Say, Mixing Up Cats and Dogs – by Abusing Scaling Code to Poison Training Data.” The article provides some information about a method “to subvert neural network frameworks so they misidentify images without any telltale signs of tampering.”

Kudos to the Register for providing links to the papers referenced in the article: “Adversarial Preprocessing: Understanding and preventing Image Scaling Attacks in Machine Learning” and “Backdooring and Poisoning Neural Networks with Image Scaling Attacks.”

The Register article points out:

Their key insight is that algorithms used by AI frameworks for image scaling – a common preprocessing step to resize images in a dataset so they all have the same dimensions – do not treat every pixel equally. Instead, these algorithms, in the imaging libraries of Caffe’s OpenCV, TensorFlow’s tf.image, and PyTorch’s Pillow, specifically, consider only a third of the pixels to compute scaling.
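The point about unequal pixel treatment can be illustrated with a toy example. The sketch below is not the code from OpenCV, tf.image, or Pillow, and it does not reproduce the "third of the pixels" figure; it assumes the simplest possible scaler, which keeps every `factor`-th pixel and ignores the rest. The function name `downscale_nearest` and the 6 by 6 test image are made up for illustration.

```python
def downscale_nearest(image, factor):
    """Shrink a 2D list of pixel values by keeping every `factor`-th row and column.

    Every pixel that is not kept has no influence on the result.
    """
    return [row[::factor] for row in image[::factor]]


SIZE = 6
FACTOR = 3

# A plain black 6x6 "image".
source = [[0] * SIZE for _ in range(SIZE)]

# Tamper with only the pixels the scaler will actually read.
tampered = [row[:] for row in source]
for r in range(0, SIZE, FACTOR):
    for c in range(0, SIZE, FACTOR):
        tampered[r][c] = 255

# Count how many pixels differ between the two full-size images.
changed = sum(
    tampered[r][c] != source[r][c] for r in range(SIZE) for c in range(SIZE)
)

small_source = downscale_nearest(source, FACTOR)
small_tampered = downscale_nearest(tampered, FACTOR)

print(f"pixels changed at full size: {changed} of {SIZE * SIZE}")
print("downscaled original:", small_source)
print("downscaled tampered:", small_tampered)
```

Four of 36 pixels are altered at full size, yet the downscaled image flips from all black to all white. The scalers named in the article use different interpolation methods, so the fraction of influential pixels differs, but the weakness is the same in kind: the pixels the scaler reads decide what the model sees.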

DarkCyber wants to point out:

  • The method can be implemented by bad actors seeking to reduce precision of certain types of specialized software. Example: Compromising Anduril’s system.
  • Smart software is vulnerable to training data procedures. Some companies train once and forget it. Smart software can drift even with well crafted training data.
  • Information which may have national security implications finds its way into what seems to be a dry, academic analysis. If one does not read these papers, is it possible for one to be unaware of impending or actual issues?

Net net: Cutting corners on training or failing to retrain systems is a problem. However, failing to apply rigor to the entire training process does more than reduce the precision of outputs. Systems simply fail to deliver what users assume a system provides.

Stephen E Arnold, March 22, 2020
