Analytics: From Predictions to Prescriptions
October 19, 2018
I read an interesting essay originating at SAP. The article’s title: “The Path from Predictive to Prescriptive Analytics.” The idea is that outputs from a system can be used to understand data. Outputs can also be used to make “predictions”; that is, guesses or bets on likely outcomes in the future. Prescriptive analytics means that the systems tell or wire actions into an output. Now the output can be read by a human, but I think the key use case will be taking the prescriptive outputs and feeding them into other software systems. In short, the system decides and does. No humans really need be involved.
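To make the hand-off concrete, here is a minimal, hypothetical sketch of the prediction-to-prescription flow. The scoring rule, the threshold, and the reorder call are invented placeholders, not SAP's implementation: a predictive step scores a risk, and a prescriptive step converts that score directly into an action another system carries out.

```python
# Hypothetical sketch only: the scoring rule, the 0.7 threshold, and the
# reorder_api call are invented placeholders, not SAP's implementation.
def predict_stockout_risk(item):
    """Predictive step: estimate how likely this item is to run out soon."""
    daily_demand = item["recent_sales"] / item["days_observed"]
    days_of_cover = item["on_hand"] / max(daily_demand, 1e-9)
    return max(0.0, min(1.0, 1.0 - days_of_cover / 14))   # 0 = safe, 1 = urgent

def prescribe(item, risk, reorder_api):
    """Prescriptive step: turn the score into an action another system executes."""
    if risk > 0.7:
        reorder_api(item["sku"], quantity=int(item["recent_sales"]))  # decides and does

item = {"sku": "A-100", "recent_sales": 420, "days_observed": 30, "on_hand": 40}
prescribe(item, predict_stockout_risk(item),
          lambda sku, quantity: print("reorder", sku, quantity))
```

The interesting part is the last step: the output is an instruction, not a report.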
The write up states:
There is a natural progression towards advanced analytics – it is a journey that does not have to be on separate deployments. In fact, it is enhanced by having it on the same deployment, and embedding it in a platform that brings together data visualization, planning, insight, and steering/oversight functions.
What is the optimal way to manage systems which are dictating actions or just automatically taking actions?
The answer is, quite surprisingly, a bit of MBA consultantese: Governance.
The most obvious challenge with regards to prescriptive analytics is governance.
Several observations:
- Governance is unlikely to provide the controls which prescriptive systems warrant. Evidence is that “governance” in some high technology outfits is in short supply.
- Enhanced automation will pull prescriptive analytics into wide use. The reasons are ones you have heard before: Better, faster, cheaper.
- Outfits like the Google and In-Q-Tel funded Recorded Future and DarkTrace may have to prepare for new competition; for example, firms which specialize in prescription, not prediction.
To sum up, an interesting write up. Perhaps SAP will be the go-to player in plugging prescriptive functions into its software systems?
Stephen E Arnold, October 19, 2018
Free Data Sources
October 19, 2018
We were plowing through our research folder for Beyond Search and realized we had overlooked the article “685 Outstanding Free Data Sources For 2017.” If you need a range of data sources related to such topics as government data, machine learning, and algorithms, you might want to bookmark this listing.
Stephen E Arnold, October 19, 2018
Algorithms Are Neutral. Well, Sort of Objective Maybe?
October 12, 2018
I read “Amazon Trained a Sexism-Fighting, Resume-Screening AI with Sexist Hiring data, So the Bot Became Sexist.” The main point is that if the training data are biased, the smart software will be biased.
No kidding.
The write up points out:
There is a “machine learning is hard” angle to this: while the flawed outcomes from the flawed training data was totally predictable, the system’s self-generated discriminatory criteria were surprising and unpredictable. No one told it to downrank resumes containing “women’s” — it arrived at that conclusion on its own, by noticing that this was a word that rarely appeared on the resumes of previous Amazon hires.
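To see how that can happen without anyone coding the bias in, consider a minimal, hypothetical toy example (not Amazon's system): a bag-of-words classifier trained on skewed historical hiring labels ends up assigning a negative weight to the token “women” all by itself.

```python
# Hypothetical toy example, not Amazon's system: skewed labels alone are
# enough to make a classifier penalize a token such as "women's".
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

resumes = [
    "chess club captain software engineer",            # hired
    "software engineer java python",                   # hired
    "women's chess club captain software engineer",    # not hired
    "women's coding club organizer python",            # not hired
]
hired = [1, 1, 0, 0]

vec = CountVectorizer()
X = vec.fit_transform(resumes)
model = LogisticRegression().fit(X, hired)

# The learned weight for "women" is negative: the bias in the historical
# labels becomes a bias in the model, with no one telling it to downrank.
print(model.coef_[0][vec.vocabulary_["women"]])
```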
Now the company discovering that its smart software became automatically biased was Amazon.
That’s right.
The same Amazon which has invested significant resources in its SageMaker machine learning platform. This is part of the infrastructure which, Amazon hopes, will propel the US Department of Defense forward for the next five years.
Hold on.
What happens if the system and method produces wonky outputs when a minor dust up is automatically escalated?
Discriminating in hiring is one thing. Fluffing a global matter is another.
Do the smart software systems from Google, IBM, and Microsoft have similar tendencies? My recollection is that this type of “getting lost” has surfaced before. Maybe those innovators pushing narrowly scoped rule based systems were on to something?
Stephen E Arnold, October 11, 2018
Smart Software: There Are Only a Few Algorithms
September 27, 2018
I love simplicity. The write up “The Algorithms That Are Currently Fueling the Deep Learning Revolution” certainly makes deep learning much simpler. Hey, learn these methods and you too can fire up your laptop and chop Big Data down to size. Put digital data into the digital juicer and extract wisdom.
Ah, simplicity.
The write up explains that there are four algorithms that make deep learning tick. I like this approach because it does not require one to know what “deep learning” means. That’s a plus.
The algorithms are:
- Back propagation
- Deep Q Learning
- Generative adversarial network
- Long short term memory
Are these algorithms or are these suitcase words?
The view from Harrod’s Creek is that once one looks closely at these phrases, one will discover multiple procedures, systems and methods, and math slightly more complex than tapping the calculator on one’s iPhone to get a sum. There is, of course, the issue of data validation, bandwidth, computational resources, and a couple of other no-big-deal things.
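Take the first item on the list. A minimal sketch of back propagation for a one-hidden-layer network, assuming nothing more than numpy and a made-up dataset, already involves a forward pass, a chain-rule backward pass, and an update rule. “One algorithm,” several moving parts.

```python
# A minimal sketch of back propagation for one hidden layer on a tiny,
# made-up dataset: the chain rule applied layer by layer, then an update.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(16, 3))                     # 16 samples, 3 features
y = (X.sum(axis=1, keepdims=True) > 0).astype(float)

W1, b1 = rng.normal(size=(3, 4)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)

for _ in range(500):
    # forward pass
    h = np.tanh(X @ W1 + b1)
    p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))     # sigmoid output
    # backward pass: propagate the error gradient layer by layer
    grad_out = (p - y) / len(X)                  # d(loss)/d(logit) for log loss
    grad_W2 = h.T @ grad_out
    grad_b2 = grad_out.sum(axis=0)
    grad_h = grad_out @ W2.T * (1 - h**2)        # back through the tanh layer
    grad_W1 = X.T @ grad_h
    grad_b1 = grad_h.sum(axis=0)
    # gradient descent update
    for param, grad in ((W1, grad_W1), (b1, grad_b1), (W2, grad_W2), (b2, grad_b2)):
        param -= 0.5 * grad

print("training accuracy:", ((p > 0.5) == y).mean())
```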
Be a deep learning expert. Easy. Just four algorithms.
Stephen E Arnold, September 27, 2018
IBM Embraces Blockchain for Banking: Is Amazon in the Game Too?
September 9, 2018
IBM recently announced the creation of LedgerConnect, a Blockchain powered banking service. This is an interesting move for a company that previously seemed to waver on whether it wanted to associate with this technology most famous for its links to cryptocurrency. However, the pairing actually makes sense, as we discovered in a recent IT Pro Portal story, “IBM Reveals Support Blockchain App Store.”
According to an IBM official:
“On LedgerConnect financial institutions will be able to access services in areas such as, but not limited to, know your customer processes, sanctions screening, collateral management, derivatives post-trade processing and reconciliation and market data. By hosting these services on a single, enterprise-grade network, organizations can focus on business objectives rather than application development, enabling them to realize operational efficiencies and cost savings across asset classes.”
This is in addition to recent news that some of the biggest banks on the planet are already using Blockchain for a variety of needs. This includes the story that the Agricultural Bank of China has started issuing large loans using the technology. In fact, out of the 26 publicly owned banks in China, nearly half are using Blockchain. IBM looks pretty conservative when you think of it like that, which is just where IBM likes to be.
Amazon supports Ethereum, Hyperledger, and a host of other financial functions. For how long? Years.
Patrick Roland, September 9, 2018
Algorithms Can Be Interesting
September 8, 2018
Navigate to “As Germans Seek News, YouTube Delivers Far-Right Tirades” and consider the consequences of information shaping. I have highlighted a handful of statements from the write up to prime your critical thinking pump. Here goes.
I circled this statement in true blue:
…[a Berlin-based digital researcher] scraped YouTube databases for information on every Chemnitz-related video published this year. He found that the platform’s recommendation system consistently directed people toward extremist videos on the riots — then on to far-right videos on other subjects.
The write up said:
A YouTube spokeswoman declined to comment on the accusations, saying the recommendation system intended to “give people video suggestions that leave them satisfied.”
The newspaper story revealed:
Zeynep Tufekci, a prominent social media researcher at the University of North Carolina at Chapel Hill, has written that these findings suggest that YouTube could become “one of the most powerful radicalizing instruments of the 21st century.”
With additional exploration, the story asserts a possible mathematical idiosyncrasy:
… The YouTube recommendations bunched them all together, sending users through a vast, closed system composed heavily of misinformation and hate.
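The “bunching” is easy to reproduce in miniature. Below is a hypothetical sketch, not YouTube’s actual system: a recommender that greedily follows co-watch counts will, once it enters a densely connected cluster of videos, keep recommending within that cluster.

```python
# Hypothetical sketch, not YouTube's system: a greedy "most co-watched"
# recommender can trap a user inside one tightly connected cluster of items.
import numpy as np

# Toy co-watch counts between 6 videos; videos 3, 4, and 5 form a tight cluster.
co_watch = np.array([
    [0, 5, 4, 1, 0, 0],
    [5, 0, 6, 0, 1, 0],
    [4, 6, 0, 1, 0, 1],
    [1, 0, 1, 0, 9, 8],
    [0, 1, 0, 9, 0, 9],
    [0, 0, 1, 8, 9, 0],
])

video = 0
seen = {video}
for _ in range(5):
    scores = co_watch[video].astype(float)
    scores[list(seen)] = -1              # don't re-recommend what was watched
    video = int(scores.argmax())         # greedy "most engaging" pick
    seen.add(video)
    print("recommended:", video)
# Once the walk crosses into the 3-4-5 cluster, every later pick stays inside it.
```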
You may want to read the original write up and consider the implications of interesting numerical recipes’ behavior.
Smart Software: Just Keep Adding Layers between the Data and the Coder
September 6, 2018
What could be easier? Clicking or coding.
Give up. Clicking wins.
A purist might suggest that training smart software requires an individual with math and data analysis skills. A modern hippy dippy approach is to suggest that pointing and clicking is the way of the future.
Amazon is embracing that approach and other firms are too.
I read “Baidu Launches EZDL, an AI Model Training Platform That Requires No Coding Experience.” Even China, where technical talent is slightly more abundant than in Harrod’s Creek, Kentucky, is on the bandwagon.
I learned:
Baidu this week launched an online tool in beta — EZDL — that makes it easy for virtually anyone to build, design, and deploy artificial intelligence (AI) models without writing a single line of code.
Why slog through courses? Point and click. The future.
There’s not much detail in the write up, but I get the general idea of what’s up from this passage:
To train a model, EZDL requires 20-100 images, or more than 50 audio files, assigned to each label, and training takes between 15 minutes and an hour. (Baidu claims that more than two-thirds of models get accuracy scores higher than 90 percent.) Generated algorithms can be deployed in the cloud and accessed via an API, or downloaded in the form of a software development kit that supports iOS, Android, and other operating systems.
Oh, oh. The “API” buzzword is in the paragraph, so life is not completely code free.
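Here is a hypothetical sketch of what that not-quite-code-free step tends to look like. The endpoint, token, and response fields are placeholders I invented, not Baidu’s actual interface: someone still has to write a few lines to call the deployed model.

```python
# Hypothetical placeholders throughout: the URL, the token, and the response
# shape are invented for illustration, not Baidu's actual EZDL API.
import base64
import requests

ENDPOINT = "https://example.com/ezdl/image-classify"   # placeholder URL
TOKEN = "YOUR_ACCESS_TOKEN"                             # placeholder credential

with open("sample.jpg", "rb") as f:                     # a local test image
    payload = {"image": base64.b64encode(f.read()).decode("ascii")}

resp = requests.post(ENDPOINT, json=payload, params={"access_token": TOKEN})
print(resp.json())                                      # e.g. labels with confidence scores
```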
Baidu, like Amazon, has a bit of the competitive spirit. The write up explains:
Baidu’s made its AI ambitions clear in the two years since it launched Baidu Brain, its eponymous platform for enterprise AI. The company says more than 600,000 developers are currently using Brain 3.0 — the newest version, released in July 2018 — for 110 AI services across 20 industries.
What could go wrong? Nothing, I assume. Absolutely nothing.
Stephen E Arnold, September 6, 2018
Online with Smart Software, Robots, and Obesity
September 1, 2018
I recall a short article called “A Starfish-Killing, Artificially Intelligent Robot Is Set to Patrol the Great Barrier Reef.” The story appeared in 2016. I clipped this item a few days ago: “Centre for Robotic Vision Uses Bots to Cull Starfish.” The idea is that environmental protection becomes easier with killer robots.
Now combine that technology application with “Artificial Intelligence Spots Obesity from Space.” The main idea is that smart software can piece together items of data to figure out who is fat and where fat people live.
What happens if a clever tinkerer hooks together robots which can take action to ensure termination with smart software able to identify a target?
I mention this technology confection because the employees who object to an employer’s technology may be behind the curve. The way technology works is that innovations work a bit like putting Lego blocks together. Separate capabilities can be combined in interesting ways.
Will US employees’ refusal to work on certain projects act like a stuck brake on a rental car?
Worth thinking about before a killer satellite identifies a target and makes an autonomous decision about starfish or other entities. Getting online has interesting applications.
Why search when one can target via algorithms?
Stephen E Arnold, September 1, 2018
A Glimpse of Random
August 30, 2018
I found “The Unreasonable Effectiveness of Quasirandom Sequences” interesting. Random number generators are important to certain cyber analytics systems. The write up puts the spotlight on the R2 method.

Without turning a blog post into a math lesson, I want to suggest that you visit the source document and look at how different approaches to random number generation appear when graphed. My point is that the selection of a method, and then the decision to seed that method with a particular value, can have an impact on how the other numerical recipes behave when random numbers are fed into a process. The outputs of a system in which the user has great confidence may, in fact, be constructs: one way to make sense of data.

What’s this mean? Pinpointing algorithmic “bias” is a difficult job. It is often useful to keep in mind that decisions made by a developer or math whiz for what seems like a no brainer process can have a significant impact on outputs.
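For the curious, here is a minimal sketch of the R2 construction as the source article describes it: an additive recurrence built on the plastic constant (the real root of x^3 = x + 1), shown next to ordinary seeded pseudorandom points. Plotting the two sets side by side shows the even coverage of R2 versus the clumps and gaps of the pseudorandom draw.

```python
# A sketch of the R2 quasirandom sequence from the source article: points are
# fractional parts of seed + n * alpha, with alpha built from the plastic constant.
import numpy as np

phi2 = 1.32471795724474602596          # plastic constant, real root of x^3 = x + 1
alpha = np.array([1 / phi2, 1 / phi2 ** 2])

def r2_sequence(n, seed=0.5):
    """First n points of the 2-D R2 low-discrepancy sequence."""
    i = np.arange(1, n + 1).reshape(-1, 1)
    return (seed + i * alpha) % 1.0    # fills the unit square very evenly

quasi = r2_sequence(1000)
pseudo = np.random.default_rng(42).random((1000, 2))   # a seeded PRNG for contrast
print(quasi[:3])
print(pseudo[:3])
```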
Stephen E Arnold, August 30, 2018
Mathematical Recipes Revealed: Oh, Oh, Trouble
August 26, 2018
I don’t read the Times Literary Supplement. When I worked in London, I was able to flip through the printed version. In Harrod’s Creek, nope. I did spot a link to an essay with the snappy title “God Is in the Machine.” I took a look.
The write up belongs to the genre of non fiction essays which I call “Yep, that’s all there is.”
The focus is how algorithms work and why some are simple and others are complicated.
Think of the essay as explaining how math works to people who know right off the starting block who Eratosthenes was.
The main point of the first chunk of the write up is that algorithms are recipes: procedures whose steps are carried out one at a time. An input yields an output.
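For readers who do know who Eratosthenes was, his sieve is a handy example of the kind of recipe the essay means: a fixed sequence of steps that turns an input (a limit) into an output (the primes below it). The opacity the essay worries about shows up only when many such recipes are chained together and fed enormous inputs.

```python
# An algorithm as a recipe: the sieve of Eratosthenes turns an input (a limit)
# into an output (the primes below it) through a fixed sequence of steps.
def sieve(limit):
    is_prime = [True] * limit
    is_prime[0:2] = [False, False]
    for n in range(2, int(limit ** 0.5) + 1):
        if is_prime[n]:
            for multiple in range(n * n, limit, n):
                is_prime[multiple] = False
    return [n for n, prime in enumerate(is_prime) if prime]

print(sieve(30))   # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```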
The guts of the argument surface in this passage, attributed to a real algorithm wizard:
The researcher knew, of course, what data he’d fed into the process. He knew why he’d designed it, the problem it was trying to solve and the outputs that it produced. However, after he’d been trying to explain it for over an hour, he sat back in his chair, exhausted. “Yes, as you can see, the gap between input and output is difficult to understand,” he said. He’d flooded the algorithm with a huge amount of information, “a trend”, he said, because in the tech giant he could, and everyone did. But the amount of data meant it was hard to tell what the salient inputs within it were. “From a human perspective you’re not sure which of the inputs is significant; it’s hard to know what is actually driving the outputs. It’s hard to trace back, as a human, to know why a decision was made.”
The complexity emerges when:
- Algorithms are stuck together
- Data (which may or may not be consistent, accurate, or timely) are stuffed into the numerical recipe as “inputs”
- Outputs are produced which may or may not be what the user understands, wants, or can use
The complexity is manageable if the creators, or numerical poets, are what the essay calls “rigorous.” Is rigor possible in Silicon Valley with professionals who focus on mobile phones, laptops, and lunch options?
Where’s this going?
Not surprisingly, I will have to read a forthcoming book called The Death of the Gods. Like other clarion calls about numerical recipes, its message seems to be that algorithms can now do what humans once thought required sufficient education, experience, and judgment. Algorithms are the future.
Questions I want to toss out when I meet with my research team next week: What if the algorithms are already in charge? Are search results objective? Can you explain why some data are not available from commercial sources? What control do you have over content when ads and “information” are freely mixed?
Perhaps the numerical recipe mechanisms are locked and loaded and firing millions of times a day? What if few hear, know, or understand that the big guns are blazing without sound or a flash? What if people do not care?
Stephen E Arnold, August 26, 2018