Divorced by Smart Software and Hopefully Outstandingly Objective Algorithms

July 13, 2020

Amica, someone’s pal. A divorce adjudicated by smart software.

We are not sure this is a good idea. Fossbytes reports, “Australian Governments Roll Out Amica AI for Settling Divorces.” Can an algorithm replace human arbitration in a heated divorce? Apparently, the Aussies in charge believe it can. Writer Nishit Raghuwanshi explains:

“The Australian government has rolled out an AI named Amica that will help the partners in dividing money and property. Moreover, the AI will also help in making appropriate parenting arrangements without hiring a lawyer. As reported in Gizmodo, Australian AG Christian Porter mentioned that the Australian government is trying its best to improvise the Australian family law system. The main priority of the government is to make the system a bit more fast and cheap. He concluded his statement by saying that the government is also working on making the divorce process less stressful for the partners and their children.

“As per the stats, most of the Australian couples were inclined towards dumping their partners owing to Coronavirus quarantine period. It is expected that just after a little relaxation from COVID-19, a large number of couples will appear in the court for separation cases.”

Apparently this dynamic means the post-pandemic period will be the perfect time to put the project into place. Australia’s family courts were already swamped, we’re told, and all this forced togetherness threatens to overwhelm them completely. Any time one partner refuses to accept the AI’s recommendations, however, a lawyer will still be required. So if the algorithm is not good at its job, the court system may not see much relief. Currently, the tool is free to Australian citizens, but a fee between $113 and $303 will be introduced next year.

DarkCyber wonders if the system was developed by an objective humanoid, hopefully one unaffected by a parental dust up. Revenge? Maybe?

Cynthia Murrell, July 13, 2020

Possibly Harmful Smart Software: Heck, Release It

July 13, 2020

So, what changed? A brief write-up at Daijiworld reports that the “Elon Musk-Founded OpenAI Releases Text Tool it Once Called Dangerous.” This software can rapidly generate fake news that is so believable its makers once deemed it too dangerous to be released. Now, though, OpenAI seems to have had a change of heart. The API is being released in a private beta rather than to the world at large as a test run of sorts. Citing an OpenAI blog post, the write-up reveals:

“‘In releasing the API, we are working closely with our partners to see what challenges arise when AI systems are used in the real world,’ OpenAI said in a blog post last week. ‘This will help guide our efforts to understand how deploying future AI systems will go, and what we need to do to make sure they are safe and beneficial for everyone.’ The API that OpenAI finally decided to release provides a general-purpose ‘text in, text out’ interface, allowing users to try it on virtually any English language task. Interested buyers can integrate the API into their product and develop an entirely new application. ‘Given any text prompt, the API will return a text completion, attempting to match the pattern you gave it. You can “program” it by showing it just a few examples of what you’d like it to do; its success generally varies depending on how complex the task is,’ OpenAI said.”
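For readers curious what “text in, text out” means in practice, here is a minimal sketch, assuming the openai Python client made available to beta participants; the API key, engine name, and prompt are placeholders rather than details from the announcement.

```python
import openai

# Assumes an API key issued to a private-beta participant (placeholder value).
openai.api_key = "YOUR_API_KEY"

# "Text in, text out": the prompt carries a few examples of the task,
# and the API returns a completion that tries to continue the pattern.
prompt = (
    "Correct the grammar.\n"
    "Input: the report were late\n"
    "Output: The report was late.\n"
    "Input: she dont know\n"
    "Output:"
)

response = openai.Completion.create(
    engine="davinci",    # engine name is an assumption; beta engines varied
    prompt=prompt,
    max_tokens=20,
    temperature=0.0,     # low temperature keeps the completion predictable
)

print(response["choices"][0]["text"].strip())
```

The point is simply that the “programming” lives in the prompt itself: show the model a handful of examples and it attempts to match the pattern, with results that vary by task complexity.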

Where will it go from here—will OpenAI decide a general release is worth the risk? We’re guessing it will. Evidently this software is just too juicy to keep under wraps.

Cynthia Murrell, June 25, 2020

Do Those Commercial Satellites Just Provide Internet? Maybe Not

July 12, 2020

Much has changed since the early days of the Civil Rights Movement, not the least of which is the state of observation technology. We learn from Bloomberg that “Satellites Are Capturing the Protests, and Just About Everything Else on Earth.” Satellite-captured images of protests pervade recent news coverage, particularly a photo of D.C.’s yellow “Black Lives Matter” street mural captured by Planet Labs, Inc. This company, founded in 2010, brings satellite imagery to the masses. Journalist Ashlee Vance reports:

“The company that took the photo, Planet Labs Inc., has hundreds of satellites floating around Earth, enough that it can snap at least one photo of every spot on the planet every day, according to the startup. Such imagery used to be rare, expensive and controlled by governments. Now, Planet has built what amounts to a real-time accounting system of the earth that just about anyone can access by paying a fee.

Over the next couple months, Planet is embarking on a project that will dramatically increase the number of photos it takes and improve the quality of the images by 25% in terms of resolution. To do that, the company is lowering the orbits of some of its larger, high-resolution satellites and launching a half-dozen more devices. As a result, Planet will go from photographing locations twice a day to as many as 12 times a day in some places. Customers will also be able to aim the satellites where they want using an automated system developed by Planet. ‘The schedule is shipped to the satellite, and it knows the plan it needs to follow,’ said Jim Thomason, the vice president of products at Planet.”

The implications are both amazing and alarming. The very concept of privacy may become hypothetical when anyone willing to pay can see just about anything and anyone, anywhere, at nearly any time. On the other hand, there are more benign possibilities, like the investors who examine parking lots to determine how lucrative certain retail businesses are. And, of course, there is the ability to chronicle a large-scale social-justice movement. During the Covid-19 pandemic, analysts have also used satellite imagery to track activity slowdowns, military activity, and shipments of goods.

Planet Labs is not the only private company in the satellite imagery market. Rivals include Capella Space and Iceye. As the competition heats up, how many more objects will be placed into orbit around our planet? As I recall, we already have too much stuff flying around out there. I suppose, though, that concern is beyond the purview of companies looking to cash in on the technology.

Cynthia Murrell, July 12, 2020

Humans Still Needed Despite Young Wizards’ Best Efforts and Confidence

July 9, 2020

It was already true, though many failed to realize it—a company cannot effectively market by AI alone. The Next Web observes, “We Can’t Build Customer Strategies Solely on Algorithms—and the Pandemic Proves it.” Writer Andy MacMillan opens by relating a Covid-era marketing faux pas: a Facebook clothing ad featured Ipanema Beach crowded with young people cavorting, with nary a care in the world. The tagline was “Unique vision of effortless lifestyle.” Oof! During the lockdown, the actual beach looked nothing like that. The algorithm had not gotten the memo.

“For the last decade, businesses have embraced a belief that ‘the data will tell us what to do.’ But guess what? Data models don’t even exist to properly capture the extent to which the pandemic has altered customers’ emotional landscape. Don’t get me wrong. Analyzing clicks, email response rates, behavioral actions, etc. indeed can be helpful, combing through more information than a human ever could to identify and target customers and provide clues on ways to optimize the buying experience. But even in normal times, an overly data-driven approach leaves out too many nuances around customer experience and often leaves big blind spots that can hurt the business. Now, as COVID-19 heightens consumers’ sensitivities both positively and negatively, these human insights matter more than ever. Businesses that fill a need or show that they care about the safety and well-being of their customers and workers see their brands soar. (Companies like Zoom and Instacart have become virtual heroes, literally.) Those that appear to be taking advantage of the crisis, give customers runarounds, or simply fail to communicate clearly, concisely, and helpfully suffer perhaps irreparable damage.”

And that is why businesses still need human discernment as well as person-to-person discussions with customers. Doing modern business during a pandemic is new to everyone, so there is no appropriate data on which to train AI. It takes human sensitivity to ask the right questions and determine appropriate responses to feedback. Even when and if the world returns to normal, companies would do well to remember this lesson.

Cynthia Murrell, July 9, 2020

The Cost of Training Smart Software: Is It Rising or Falling?

July 6, 2020

I read “The Cost of AI Training is Improving at 50x the Speed of Moore’s Law: Why It’s Still Early Days for AI.” The article’s main point is that “training” — that is, the cost of making machine learning smart — is declining.

That seems to make sense. First, there are cloud services. Some of these are cheaper than others, but, in general, relying on cloud compute eliminates the capital costs and the “ramp up” costs for creating one’s own infrastructure to train machine learning systems.

Second, use of a machine learning “utility” like Amazon AWS SageMaker or the similar services available from IBM and Google provides two economic benefits:

  1. Tools are available to reduce the engineering lift and shorten launch time (a minimal sketch appears after this list)
  2. Components like SageMaker’s off-the-shelf data bundles eliminate the often-tedious process of finding additional data to use for training.
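To make the first benefit concrete, here is a minimal sketch of launching a managed training job, assuming the SageMaker Python SDK with v2 parameter names; the entry-point script, IAM role, and S3 path are placeholders.

```python
# A minimal sketch, assuming the SageMaker Python SDK (v2 parameter names);
# earlier SDK versions used train_instance_type / train_instance_count.
from sagemaker.sklearn.estimator import SKLearn

estimator = SKLearn(
    entry_point="train.py",                                # placeholder training script
    role="arn:aws:iam::123456789012:role/SageMakerRole",   # placeholder IAM role
    instance_type="ml.m5.large",
    instance_count=1,
    framework_version="0.23-1",
)

# The managed service provisions the instance, runs the script, stores the
# model artifact, and shuts everything down.
estimator.fit({"train": "s3://example-bucket/training-data/"})
```

That provisioning, execution, and teardown is exactly the infrastructure work the cloud “utility” argument says one avoids paying for up front.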

Third, assumptions about smart software’s efficacy appear to support generalizations about the training, use, and deployment of smart software.

I want to note that there are some research groups who believe that software can learn by itself. If my memory is working this morning, I think the jazzy way to state it is “sui generis.” Turn the system on, let it operate, and it learns by processing. For smart software, the crude parallel is learning the way humans learn: What’s in the environment becomes the raw material for learning.

The article correctly points out that the number of training models has increased. That is indeed accurate. A model is a numerical recipe set up to produce an output that meets the modeler’s goal. Thus, training a model involves providing data to the numerical recipe, observing the outputs, and then making adjustments. These “tweaks” can be simple and easy; for example, changing a threshold governing a decision. More complex fixes include, but are not limited to, selecting a different sequence for the individual processes, concatenating models so that multiple outputs inform a decision, and substituting one mathematical component for another. To get a sense of the range of components available to a modeler, take a quick look at Algorithms. This collection is what I would call “ready to run.”
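Here is a toy sketch of the simplest kind of tweak mentioned above, adjusting the threshold applied to a model’s scores; the data is synthetic and scikit-learn is assumed.

```python
# A toy illustration of one "tweak": changing the threshold that turns a
# model's score into a decision. The data here is synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=200) > 0).astype(int)

model = LogisticRegression().fit(X, y)
scores = model.predict_proba(X)[:, 1]          # probability of the positive class

for threshold in (0.5, 0.7):
    decisions = (scores >= threshold).astype(int)
    print(f"threshold={threshold}: {decisions.sum()} positive decisions")
```

Nothing about the underlying recipe changes; moving the cutoff alone changes how many cases the system acts on, which is why even “simple” tweaks require observation and retesting.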

The article includes a number of charts. Each of these presents data supporting the argument that it is getting less costly to train smart software.

I am not certain I agree, although the charts seem to support the argument.

I want to point out that there are some additional costs to consider. A few of these can be “deal breakers” for financial and technical reasons.

Here’s my list of smart software costs. As far as I know, none of these has been the subject of an analyst’s examination and some may be unquantified because those in the business of smart software are not set up to capture them:

  1. Retraining. Anyone with experience building models knows that retraining is required. There are numerous reasons, but retraining is often more expensive than the first set of training activities.
  2. Gathering current or more on-point training data. The assumption about training data is that it is useful. We live in the era of so-called big data. Unfortunately, finding on-point data relevant to the retraining task is time consuming and can be a complicated job involving subject matter experts.
  3. Data normalization. There is a perception that if data are digital, those data can be provided “as is” to a content processing system. That is not entirely accurate. The normalization processes can easily consume as much as 60 percent of available subject matter experts’ and data analysts’ time.
  4. Data validation. The era of big data makes possible this generalization: “The volume of data will smooth out any anomalies.” Maybe, but in my experience, the “anomalies” — if not addressed — can easily skew one of the ingredients in the numerical recipe so that the outputs are not reliable. The output may “look” like it is accurate. In real life, the output is not what’s desired. (A small sketch after this list shows how a few stray values can throw off a simple cutoff.) I would refer the reader to the stories about Detroit’s facial recognition system, which is incorrect 96 percent of the time. For reference, see this Ars Technica article.
  5. Downstream costs. Let’s use the Detroit police facial recognition system to illustrate this cost. Answer this question, please: “What are the fully loaded costs for the consequences of the misidentification of a US citizen?”
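The sketch promised in item 4 is below. It is a toy example, not anyone’s production pipeline, showing how a few unvalidated values can distort a mean-based cutoff.

```python
# A small sketch of the point in item 4: a handful of unvalidated anomalies
# can skew an ingredient of the "numerical recipe" (here, a mean-based cutoff).
import numpy as np

rng = np.random.default_rng(0)
clean = rng.normal(loc=100.0, scale=10.0, size=1000)    # plausible readings
anomalies = np.array([9_999.0, 12_500.0, 11_000.0])     # a few bad records
contaminated = np.concatenate([clean, anomalies])

print("cutoff from clean data:       ", clean.mean() + 2 * clean.std())
print("cutoff from contaminated data:", contaminated.mean() + 2 * contaminated.std())
# The second cutoff is dramatically higher, so decisions built on it can
# "look" reasonable while quietly misclassifying most real cases.
```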

In my view, taking a narrow look at the costs of training smart software is not in the interests of the analyst who benefits from handling investors’ money. Nor are the companies involved in smart software eager to monitor the direct and indirect costs associated with training the models. Finally, it is in no one’s interest to consider the downstream costs of a system which may generate inaccurate outputs.

Net net: In today’s economic environment, ignoring the broader cost picture is a distortion of what it takes to train and retrain smart software.

Stephen E Arnold, July 6, 2020

Mathiness: Better Than Hunan Chicken?

July 6, 2020

I am thrilled when one of my math-oriented posts elicits clicks and feedback. Some still care about mathematics. Yippy do.

I read “Why China’s Race for AI Dominance Depends on Math.” The article comes from one of those high-toned online publications of mystical origins and more mythy financial resources.

The main point of the article is that China may care more about numbers than Hunan chicken. I noted this statement:

Dozens of think tank projects and government reports won’t mean anything if Americans can’t maintain mastery over the fundamental mathematics that underpin AI.

The write up disputes the truism “it’s all about the data.” The article stated:

Yet without the right type of math, and those who can creatively develop it, all the data in the world will only take you so far

Now that’s an observation which undercuts what some might call “collect it all” thinking. The idea is that the nugget is in “there” somewhere. And at some point in time systems and software will “discover” or “reveal” what a particular person needs to complete a task. That task may range from answering “What stock can I buy cheap today to make a lot of money tomorrow?” to “Who helped Robert Maxwell’s extremely interesting daughter hide in New Hampshire?”

Years ago I was on the advisory panel for a company called NuTech Solutions. The founder and a couple of his relatives focused on applying a philosophical concept to predictive methods. The company developed a search system, a method for solving traveling-salesperson-type problems, and a number of other common computational chestnuts. The methods ranged from smart software to old-fashioned statistical procedures applied in novel ways.

Tough sell as it turned out. On one call in which I participated, I remember this exchange:

Prospective Customer: Would you tell us how your system works?

President of NuTech: Now I think we will not make a sale.

Prospective Customer: Why is that?

President of NuTech: I have to write down equations, and we need to talk about them.

Yep, math for some is not about equations. Math is buzzwords. I was talking with a college medical analytics professor who asked me what I was working on. I replied, “I have been thinking about the Hopf fibration.”

Crickets. He changed the subject.

The write up (somewhat gleefully, it seemed to me) stated:

American secondary school and university students are not mastering the fundamental math that prepares them to move into the type of advanced fields, such as statistical theory and differential geometry, that makes AI possible. American fifteen-year-olds scored thirty-fifth in math on the OECD’s 2018 Program for International Student Assessment tests—well below the OECD average. Even at the college level, not having mastered the basics needed for rigorous training in abstract problem solving, American students are often mostly taught to memorize algorithms and insert them when needed.

If true (and I have only anecdotal evidence obtained by watching young people try to make change at Walgreen’s), the “insert them” approach is going to create stuff crazier than Google selling fast-food ads next to a video about losing weight.

My team and I did a job for the University of Michigan before I retired. The project was to provide an outsider’s view of what could be done to make the university rank higher in math, computer science, and related disciplines. We gathered data; we interviewed; and we did on-site observations. We did many things. One fact jumped out: there were not too many Americans in the advanced classes. Plus, the very best students in the advanced programs did not stay in lovely Michigan. Thus, instead of setting up a business near the university, these folks headed to better weather and a more favorable venture capital climate. Yikes. These are tough problems for a university to fix easily and maybe not ones it can remediate in a significant way. Good news? Yep, I got paid.

The essay grinds forward with the analysis and ends with this statement:

Winning the AI competition begins by acknowledging how poorly we do in attracting and training Americans in math at all levels. Without getting serious about the remedy, the AI race may be lost as clearly as two plus two equals four.

Now think about this article’s message in the context of no-code or low-code programming, one-click output of predictive reports based on real-time data flows, or deciding what numerical recipe to plug into a business dashboard for real deciders.

Outstanding work. Those railroad cars in Texas? Just a glitch in the system. The “glitch” may be a poor calculation. Guessing might yield better results in some circumstances. Why? Yikes, the answer requires equations, and that’s a deal breaker in some situations. Just use a buzzword.

Stephen E Arnold, July 6, 2020

Math and Smart Software Ethicality

July 5, 2020

I noted “Mathematical Principle Could Help Unearth Unethical Choices of AI.” The idea is that a numerical check runs while the smart software developer trains the model or lattice of models, flagging strategies likely to produce unethical outcomes. The paper states:

Our suggested ‘Unethical Optimization Principle’ can be used to help regulators, compliance staff, and others to find problematic strategies that might be hidden in large strategy space. Optimization can be expected to choose disproportionately many unethical strategies, an inspection of which should show where problems are likely to arise and thus suggest how the AI search algorithm should be modified to avoid them in the future. The Principle also suggests that it may be necessary to re-think the way AI operates in very large strategy spaces, so that unethical outcomes are explicitly rejected in the optimization/learning process.
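A toy sketch of the principle as quoted, not the authors’ actual method, might look like the following: search a strategy space in which rule-bending strategies happen to pay slightly better, observe that naive optimization surfaces a disproportionate number of them, and then rerun the search with those strategies rejected.

```python
# A toy sketch of the quoted idea, not the paper's method. Strategies that use
# a prohibited signal get a small payoff bonus, so a naive optimizer picks
# them disproportionately often; the fix is to reject them during the search.
import random

random.seed(1)

def make_strategy():
    unethical = random.random() < 0.2                             # 20% of the space is problematic
    payoff = random.uniform(0, 1) + (0.3 if unethical else 0.0)   # cheating pays a little extra
    return payoff, unethical

strategies = [make_strategy() for _ in range(10_000)]

# Naive optimization: keep the 100 highest-payoff strategies, ethics unchecked.
top = sorted(strategies, key=lambda s: s[0], reverse=True)[:100]
print(f"{sum(1 for _, bad in top if bad)} of the top 100 strategies are unethical")

# Modified search: explicitly reject unethical strategies inside the optimization.
allowed = [s for s in strategies if not s[1]]
top_allowed = sorted(allowed, key=lambda s: s[0], reverse=True)[:100]
print(f"best allowed payoff: {top_allowed[0][0]:.3f}")
```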

Several observations:

First, does the method “work” in the murky world of smart software? Some smart software is designed specifically to generate revenue. The “training” increases the likelihood that the smart software will deliver those results; for example, increased ad revenue.

Second, what happens if the developers and subject matter experts ignore the proposed numerical recipe? Answer: The algorithm will perform based on the training it receives. The purpose of the smart algorithm is to deliver what may be, to some, an unethical result.

Third, what if the proposed numerical recipe itself identifies an “ethical” action as “unethical”?

To sum up, interesting idea. Some work may be needed before the cheerleading commences.

Stephen E Arnold, July 5, 2020

Oh, Oh, Somebody Has Blown the Whistle on the Machine Learning Fouls

July 3, 2020

Wonder why smart software is often and quite spectacularly stupid? You can get a partial answer in “On Moving from Statistics to Machine Learning, the Final Stage of Grief.” There’s some mathiness in the write up. However, the author, who tries to stand up to heteroskedastic errors, offers some useful explanations and good descriptions of the shortcuts some of the zippy machine learning systems take.

Here’s a passage I found interesting:

As you can imagine, machine learning doesn’t let you side-step the dirty work of specifying your data and models (a.k.a. “feature engineering,” according to data scientists), but it makes it a lot easier to just run things without thinking too hard about how to set it up. In statistics, bad results can be wrong, and being right for bad reasons isn’t acceptable. In machine learning, bad results are wrong if they catastrophically fail to predict the future, and nobody cares much how your crystal ball works, they only care that it works.

Also this statement:

I like showing ridge regression as an example of machine learning because it’s very similar to OLS, but is totally and unabashedly modified for predictive purposes, instead of inferential purposes.
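For readers who want to see the contrast rather than take it on faith, here is a minimal sketch with synthetic data, assuming scikit-learn; the alpha value is arbitrary.

```python
# A minimal sketch of the OLS-versus-ridge contrast: same data, same features,
# but ridge shrinks coefficients, trading inferential purity for prediction.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(7)
n, p = 50, 20
X = rng.normal(size=(n, p))
true_coef = np.zeros(p)
true_coef[:3] = [2.0, -1.0, 0.5]                 # only a few real signals
y = X @ true_coef + rng.normal(scale=1.0, size=n)

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=10.0).fit(X, y)              # alpha penalizes large coefficients

print("largest OLS coefficient magnitude:  ", round(float(np.abs(ols.coef_).max()), 3))
print("largest ridge coefficient magnitude:", round(float(np.abs(ridge.coef_).max()), 3))
# Ridge trades a little bias for lower variance: fine for prediction,
# awkward if you want to read the coefficients as inferential estimates.
```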

One problem is that those individuals who most need to understand why smart software is stupid are likely to struggle to understand this quite helpful explanation.

Math understanding is the problem. That lack of mathiness is why smart software is likely to remain like a very large, eager wet Newfoundland water dog shaking in the kitchen. Yep, the hairy beast is an outlier heteroskedastically speaking, of course.

Stephen E Arnold, July 3, 2020

IBM Donates Projects to the Cause of Responsible AI

July 3, 2020

The first question arising was, “Was the marketing of Watson responsible?” But why rain on a virtue-signaling parade? It is almost the 4th of July in IBM land.

The LF AI Foundation was formed to support open source innovation in artificial intelligence, machine learning, and deep learning. Now IBM has climbed on board, we learn from “IBM Donates ‘Trusted AI’ Projects to Linux Foundation AI” at ZDNet. In a blog post, the company promises these donations will help ensure AI deployments are fair, secure, and trustworthy. They will also facilitate the creation of such software by the open source community under the direction of the Linux Foundation. Journalist Stephanie Condon writes:

“Specifically, IBM is contributing the AI Fairness 360 Toolkit, the Adversarial Robustness 360 Toolbox and the AI Explainability 360 Toolkit. The AI Fairness 360 Toolkit allows developers and data scientists to detect and mitigate unwanted bias in machine learning models and datasets. Along with other resources, it provides around 70 metrics to test for biases and 11 algorithms to mitigate bias in datasets and models. The Adversarial Robustness 360 Toolbox is an open-source library that helps researchers and developers defend deep neural networks from adversarial attacks. Meanwhile, the AI Explainability 360 Toolkit provides a set of algorithms, code, guides, tutorials, and demos to support the interpretability and explainability of machine learning models. The LFAI’s Technical Advisory Committee voted earlier this month to host and incubate the project, and IBM is currently working with them to formally move them under the foundation. IBM joined the LFAI last year and helped established its Trusted AI Committee, which is working towards defining and implementing principles of trust in AI deployments.”
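As a rough illustration of what one of the roughly 70 bias metrics measures, here is a hand-rolled disparate impact calculation on synthetic data; it deliberately does not use the toolkit’s own API, which wraps the same idea in dataset and metric classes.

```python
# A back-of-the-envelope version of one bias metric of the kind the
# AI Fairness 360 Toolkit provides (disparate impact), computed by hand
# on synthetic data rather than through the toolkit itself.
import numpy as np

rng = np.random.default_rng(3)
protected = rng.integers(0, 2, size=1_000)        # 1 = unprivileged group
# A skewed "model" that grants favorable outcomes more often to the privileged group.
favorable = (rng.random(1_000) < np.where(protected == 1, 0.4, 0.6)).astype(int)

rate_unpriv = favorable[protected == 1].mean()
rate_priv = favorable[protected == 0].mean()
disparate_impact = rate_unpriv / rate_priv

print(f"favorable-outcome rate (unprivileged): {rate_unpriv:.2f}")
print(f"favorable-outcome rate (privileged):   {rate_priv:.2f}")
print(f"disparate impact ratio:                {disparate_impact:.2f}")
# A ratio well below 1.0 (a common rule of thumb is 0.8) flags the kind of
# unwanted bias the toolkit's mitigation algorithms are meant to address.
```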

Plus a foundation can deal with any political or legal issues, perhaps? The article notes that governments are taking a serious interest in AI governance. The EU released a white paper on the topic in February, and 14 countries and the EU are teaming up in the Global Partnership on Artificial Intelligence (GPAI). It is about time governing bodies woke up to the effects unchecked AI can have on our communities. Now about the Watson Covid, the avocado festival, and the game show?

Cynthia Murrell, July 3, 2020

MIT and Being Smart

July 3, 2020

When I hear “MIT”, I think Jeffrey Epstein. Sorry. Imprinting at work. I read “MIT Apologizes, Permanently Pulls Offline Huge Dataset That Taught AI Systems to Use Racist, Misogynistic Slurs.” Yep, that’s the MIT which trains smart people today.

The write up reports:

Vinay Prabhu, chief scientist at UnifyID, a privacy startup in Silicon Valley, and Abeba Birhane, a PhD candidate at University College Dublin in Ireland, pored over the MIT database and discovered thousands of images labeled with racist slurs for Black and Asian people, and derogatory terms used to describe women. They revealed their findings in a paper undergoing peer review for the 2021 Workshop on Applications of Computer Vision conference.

Presumably the demise of Mr. Epstein prevented him from scrutinizing the dataset for appropriate candidates.

Error corrected. Apology emitted. Another outstanding example of academic excellence engraved in digital history.

Stephen E Arnold, July 3, 2020
