Spreadsheet Fever Case Example

October 12, 2020

I have been using the phrase “spreadsheet fever” to describe the impact that fiddling with numbers in Microsoft Excel has on MBAs. With Excel providing the backbone for numerous statistical confections, the sugar hit of magic assumptions should not be underestimated. The mental structure of a crazed investment analyst brooks no interference from common sense.

“Excel: Why Using Microsoft’s Tool Caused Covid-19 Results to Be Lost” provides a possible case example of what happens when thumbtypers and over-confident innumerates tangle with a digital spreadsheet. No green eyeshades and no pencils needed. Calculators? One can hear a 22-year-old ask, “What’s a calculator? I have one on my iPhone?”

The Beeb reports:

PHE [Public Health England, a fine UK entity] had set up an automatic process to pull this data together into Excel templates so that it could then be uploaded to a central system and made available to the NHS Test and Trace team, as well as other government computer dashboards.

And what tool did these over-confident wizards use?

Microsoft Excel, the weapon of choice for business and STEM analysis, of course.

How did the experts wander off the information highway into a thicket of errors? The Beeb explains:

The problem is that PHE’s own developers picked an old file format to do this – known as XLS. As a consequence, each template could handle only about 65,000 rows of data rather than the one million-plus rows that Excel is actually capable of. And since each test result created several rows of data, in practice it meant that each template was limited to about 1,400 cases. When that total was reached, further cases were simply left off.
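The failure the Beeb describes is easy to sketch. What follows is a minimal illustration, not PHE's actual pipeline: the function names and the rows-per-case figure are my assumptions (46 is back-derived from the Beeb's "about 65,000 rows" and "about 1,400 cases"). The only hard numbers are the worksheet limits: 65,536 rows for the old XLS format and 1,048,576 for XLSX.

```python
XLS_MAX_ROWS = 65_536        # worksheet row limit of the legacy .xls format
XLSX_MAX_ROWS = 1_048_576    # worksheet row limit of the .xlsx format
ROWS_PER_CASE = 46           # assumption: back-derived from the BBC's rounded figures


def cases_that_fit(max_rows, rows_per_case=ROWS_PER_CASE):
    """Whole cases that fit on one sheet before the row cap is hit."""
    return max_rows // rows_per_case


def rows_that_fit(total_rows, max_rows=XLS_MAX_ROWS):
    """Return (rows kept, rows dropped) when a sheet silently truncates."""
    kept = min(total_rows, max_rows)
    return kept, total_rows - kept


def check_capacity(total_rows, max_rows=XLS_MAX_ROWS):
    """Fail loudly instead of leaving rows off."""
    kept, dropped = rows_that_fit(total_rows, max_rows)
    if dropped:
        raise ValueError(f"{dropped} rows would be lost; sheet holds only {max_rows}")


if __name__ == "__main__":
    print(cases_that_fit(XLS_MAX_ROWS))          # cases per XLS template
    print(rows_that_fit(2_000 * ROWS_PER_CASE))  # a hypothetical 2,000-case day
```

The point of check_capacity is the design choice PHE's process apparently lacked: an explicit error when the data outgrow the container, rather than a quiet cutoff.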

The fix? Can kicking perhaps:

But insiders acknowledge that the current clunky system needs to be replaced by something more advanced that excludes Excel, as soon as possible.

Righto.

Stephen E Arnold, October 12, 2020

Math Cheat Sheets

October 9, 2020

Since we live in a statistics-charged world, there is a strong likelihood that you may want to brush up on math. Fear not. A collection of math cheat sheets is available without charge. “Probability Cheat Sheet – Harvard University” includes some links (good and bad). What does a cheat sheet from Harvard University with its modest endowment and measly seven percent return so far this year look like? Here’s a sample from the probability document:

[Image: sample from the probability cheat sheet]

Stephen E Arnold, October 9, 2020

9 21

September 20, 2020

One of the DarkCyber research team came across this chart on the Datawrapper Web site. Datawrapper provides millennial-ready analysis tools. With some data and the firm’s software, anyone can produce a chart like this one with green bars for negative numbers.

[Image: Datawrapper chart of Chicago job postings]

What is the chart displaying? The odd green bar shows the decline in job postings. Why green? No idea. What is the source of the data? Glassdoor, a job listings site. The data apply only to Chicago, Illinois. The time period is August 2020 versus August 2019. The idea is that the longer the bar, the greater the decline. Why is the bar green? Isn’t red a more suitable color for negative numbers?

Shown in this image are the top 12 sectors for job loss. To be clear, the longer the bar, the fewer job postings. Fewer job postings, one assumes, translates to reduced opportunities for employment.

What’s interesting is that accounting, consulting, information technology, telecommunications, and computer software and hardware are big losers. Those expensive MBAs, the lost hours studying for the CPA examination, and thumb typing through man pages are gone for now.

Observations:

  • The colors? Red maybe.
  • The decline in high technology work and knowledge work is interesting.
  • The “open jobs” numbers are puzzling. Despite declines, Chicago – the city of big shoulders and big challenges – has thousands of jobs in declining sectors.

Net net: IT and computer software and hardware look promising. The chart doesn’t do the opportunities justice. And the color?

Stephen E Arnold, September 20, 2020

Social Science: Like Astrology and Phrenology Perhaps?

September 15, 2020

I do not understand sociology. In 1962, I ended up in a class taught by an esteemed eccentric named Bruce Cameron, Ph.D. I had heard about his interest in short wave and drove past his home to observe the bed springs hanging on the front of his house. The idea, as I recall, was to improve radio reception. Those in the engineering department at the lousy university I attended shared the brilliant professor’s fascination with commercial bed technology at lunch. Even I as a clueless freshman (or is it now freshperson?) knew about the concept of buying an antenna from our local electronics shop.

In the remarkable Dr. Cameron’s Sociology 101 class, he posed the question, “Why do Eskimos wear mittens?” Today, the question would have to reference indigenous circumpolar people or another appropriate term. But in 1962, Eskimos was the go-to word.

I pointed out that I had seen in the Smithsonian Museum an exhibit of Eskimo hand wear and that there were examples of mittens with a finger component (trigger mitts or nord gauntlets), thus combining the warmth of a mitten with the needed dexterity to remove a harpoon from a baby seal.

He ignored my comment. The question turned up on our first examination, and I recycled my alleged learning from the Smithsonian information card for the exhibit.

I received zero credit for my answer. Bummer. I think that was the point at which I dismissed “sociology” and placed it and the good professor in the same pigeon hole I used for astrology and phrenology.

After reading “What’s Wrong with Social Science and How to Fix It: Reflections After Reading 2578 Papers,” I reaffirmed my skepticism of sociology and its allied fields:

But actually diving into the sea of trash that is social science gives you a more tangible perspective, a more visceral revulsion, and perhaps even a sense of Lovecraftian awe at the sheer magnitude of it all: a vast landfill—a great agglomeration of garbage extending as far as the eye can see, effluvious waves crashing and throwing up a foul foam of p=0.049 papers.

The write up contains some interesting data. In reference to a citation graph, the paper points out why references to crappy research persist:

As in all affairs of man, it once again comes down to Hanlon’s Razor. Either:

  1. Malice: they know which results are likely false but cite them anyway.
  2. or, Stupidity: they can’t tell which papers will replicate even though it’s quite easy.

There is another reason: Clubs of so-called experts informally coordinate or simply do the “I will scratch your back if you scratch mine.”

What quasi-sociological field is doing its best to be less corrupt? Surprisingly, it is economics. Education seems to have some semblance of ethical behavior, at least based on this sample of papers. But maybe the sample is skewed.

The paper concludes with a list of suggestions. Useful, but I think the present pattern of lousy work is going to persist and increase.

Hang those bed springs on the side of the house. Works for “good enough” solutions.

Stephen E Arnold, September 15, 2020

Expertise Required: Interesting Assertion

September 14, 2020

One of the DarkCyber research team spotted “Lack of Expertise Is the Biggest Barrier for Implementing IoT Solutions.” The surprising assertion comes from Claris, an outfit owned by Apple. Claris (once known as FileMaker Inc.). Clear? Clear as Claris.

The information in the write up presents an interesting assertion about the Internet of Things. An IoT device is a mobile phone or a gizmo that connects to the Internet; for example, an Anduril surveillance drone.

The interesting parts are the actual factual statements; for example:

  • 20 percent of “SMB leaders worry about security and privacy when implementing IoT. Furthermore, they don’t clearly see the return on investment.”
  • 67 percent believe IoT could bring them a competitive advantage and are saying their competitors are “doing more” with IoT at the time.
  • “SMB leaders mentioned improved efficiency, productivity and speed, while about a third see gathering business intelligence as the main driver towards IoT adoption.”
  • About 33 percent say “it’s likely their SMB will launch an IoT initiative within the next three years, while almost half added that their company was lagging behind the competition.”
  • 24 percent stated their project already yielded ROI, while 38 percent expect it to happen within a year.

Do we know the details of the study, the sample size, the methodology used to select those surveyed, or the statistical validity of the data? Of course not. That is what makes the facts so interesting. That and the need for “expertise.” Perhaps the data were tallied in FileMaker?

Stephen E Arnold, September 14, 2020

Be Smart: Live in Greenness

August 27, 2020

I do not want to be skeptical. I do not want to suggest that a study may need what might be called verification. Please, read “Residential Green Space and Child Intelligence and Behavior across Urban, Suburban, and Rural Areas in Belgium: A Longitudinal Birth Cohort Study of Twins.” To add zip to your learning, navigate to a “real” news outfit’s article called “Children Raised in Greener Areas Have Higher IQ, Study Finds.” Notice any significant differences?

First, the spin in the headline. The PLOS article points out that the sample comes from Belgium. How representative is this country when compared to Peru or Syria? How reliable are “intelligence” assessments? What constitutes bad behavior? Are these “qualities” subject to statistically significant variations due to exogenous factors?

I don’t want to do a line by line comparison of the write up which wants to ring the academic gong. Nor do I want to read how “real” journalists deal with a scholarly article.

I would point out this sentence in the scholarly article:

To our knowledge, this is the first study investigating the association between residential green space and intelligence in children.

Yeah, let’s not get too excited from a sample of 620 in Belgium. Skip school. Play in a park or wander through thick forests.

Stephen E Arnold, August 27, 2020

Bias in Biometrics

August 26, 2020

How can we solve bias in facial recognition and other AI-powered biometric systems? We humans could try to correct for it, but guess where AI learns its biases—yep, from us. Researcher Samira Samadi explored whether using a human evaluator would make an AI less biased or, perhaps, even more so. We learn of her project and others in Biometric Update.com’s article, “Masks Mistaken for Duct Tape, Researchers Experiment to Reduce Human Bias in Biometrics.” Reporter Luana Pascu writes:

“Curious to understand if a human evaluator would make the process fair or more biased, Samadi recruited users for a human-user study. She taught them about facial recognition systems and how to make decisions about system accuracy. ‘We really tried to imitate a real-world scenario, but that actually made it more complicated for the users,’ Samadi said. The experiment confirmed the difficulty in finding an appropriate dataset with ethically sourced images that would not introduce bias into the study. The research was published in a paper called A Human in the Loop is Not Enough: The Need for Human-Subject Experiments in Facial Recognition.”

Many other researchers are studying the bias problem. One NIST report found a lot of software that produced a 10-fold to 100-fold increase in the probability of Asian and African American faces being inaccurately recognized (though a few systems had negligible differences). Meanwhile, a team at Wunderman Thompson Data found tools from big players Google, IBM, and Microsoft to be less accurate than they had expected. For one thing, the systems had trouble accounting for masks, still a persistent reality as of this writing. The researchers also found gender bias in all three systems, even though the technologies used are markedly different.

There is reason to hope. Researchers at Durham University’s Computer Science Department managed to reduce racial bias by one percent and improve ethnicity accuracy. To achieve these results, the team used a synthesized data set with a higher focus on feature identification. We also learn:

“New software to cut down on demographic differences in face biometric performance has also reached the market. The ethnicity-neutral facial recognition API developed by AIH Technology is officially available in the Microsoft Azure Marketplace. In March, the Canadian company joined the Microsoft Partners Network (MPN) and announced the plans for the global launch of its Facial-Recognition-as-a-Service (FRaaS).”

Bias in biometrics, and AI in general, is a thorny problem with no easy solution. At least now people are aware of the issue and bright minds are working to solve it. Now, if only companies would be willing to delay profitable but problematic implementations until solutions are found. Hmmm.

Cynthia Murrell, August 26, 2020

Cloud Data: Clear with Rain Predicted for On Premises Hardware

August 21, 2020

I like surveys which provide some information about sample size. “Survey: How the Pandemic Is Shaking Up the Network Market” says that 2,400 information technology decision makers participated. How were these individuals selected? How was the survey conducted? When was the survey conducted? These questions are not answered. Nevertheless, some of the findings seemed interesting.

One of the surprising factoids was the shift from a license for a period of time to a “subscription.” How many outfits are subscribing to cloud services? The write up reports:

The average proportion of IT services consumed via subscription will accelerate by 38% in the next two years, from 34% of the total today to 46% in 2022, and the share of organizations that consume a majority (over 50%) of their IT solutions ‘as a service’ will increase by approximately 72% in that time.
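A quick check of that arithmetic, using only the rounded figures in the quotation. The function names are mine, and the survey's unrounded inputs do not appear in the write up, so any gap between the result and the quoted 38% may simply be rounding. The sketch separates the two things a "percent increase" can mean: a change in percentage points and a relative change.

```python
def percentage_point_change(old_pct, new_pct):
    """Absolute difference between two percentages, in percentage points."""
    return new_pct - old_pct


def relative_change(old_pct, new_pct):
    """Change expressed as a percentage of the starting value."""
    return (new_pct - old_pct) / old_pct * 100


if __name__ == "__main__":
    # Figures from the quotation: 34% of IT services today, 46% in 2022.
    print(percentage_point_change(34, 46))    # 12 percentage points
    print(round(relative_change(34, 46), 1))  # about 35.3 percent
```

A shift from 34% to 46% is 12 percentage points, or roughly a 35% relative increase on the rounded inputs. Which of the two a headline number refers to is worth asking of any survey.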

Automatic monthly payments and probably tricky cancellation policies will be part of the subscription business, but that’s just a hunch, not a survey finding.

Other items of interest included these factoids:

77% [of those responding to the survey] said that investments in networking projects had been postponed or delayed since the onset of COVID-19, and 28% indicated that projects had been cancelled altogether.

35% of ITDMs globally are planning to increase their investment in AI-based networking technologies, with the APAC region leading the charge at 44% (including 60% of ITDMs [the acronym which few probably know means “IT decision-makers”]  in India and 54% in Hong Kong).

just 8% [of the sample] globally plan to continue with only CapEx investments.

Net net: Pricing and curtailing capital expenditures may be trends. If these data are accurate, the data suggest that companies targeting on premises sales of hardware may face some headwinds. Of course, I believe everything I read on the Internet, particularly objective surveys.

Stephen E Arnold, August 21, 2020

True or False: AI Algorithms Are Neutral Little Puppies

August 11, 2020

The answer, according to CanIndia News, is false. (I think some people believe this.) “Google, IBM, Microsoft AI Models Fail to Curb Gender Bias” reports:

new research has claimed that Google AI datasets identified most women wearing masks as if their mouths were covered by duct tapes. Not just Google. When put to work, artificial intelligence-powered IBM Watson virtual assistant was not far behind on gender bias. In 23 per cent of cases, Watson saw a woman wearing a gag while in another 23 per cent, it was sure the woman was “wearing a restraint or chains”.

Before warming up the tar and chasing geese for feathers, you may want to note that the sample was 265 men and 265 women. Note: The subjects were wearing Covid masks or personal protective equipment.

Out of the 265 images of men in masks, Google correctly identified 36 per cent as containing PPE. It also mistook 27 per cent of images as depicting facial hair.

The researchers learned that 15 per cent of images were misclassified as duct tape.

The write up highlights this finding:

Overall, for 40 per cent of images of women, Microsoft Azure Cognitive Services identified the mask as a fashion accessory compared to only 13 per cent of images of men.

Surprised? DarkCyber is curious about:

  1. Sample size. DarkCyber’s recollection is that the sample should have been in the neighborhood of 2,000 or so, with 1,000 possible women and 1,000 possible men.
  2. Training. How were the models trained? Were “masks” represented in the training set? What percentage of training images had masks?
  3. Image quality. What steps were taken to ensure that the “images” were of consistent quality; that is, focus, resolution, color, etc.?
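The sample size question can be made concrete with the textbook margin-of-error formula for a proportion. This is a rough sketch under loud assumptions: simple random sampling, a normal approximation, a 95 percent confidence level (z = 1.96), and the worst-case proportion of 0.5. A pile of test images is not necessarily a random sample, so treat the output as a floor on the uncertainty, not a verdict. The function name is mine.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of a normal-approximation confidence interval for a proportion.

    n: sample size; p: assumed proportion (0.5 is the worst case);
    z: critical value (1.96 corresponds to roughly 95 percent confidence).
    """
    return z * math.sqrt(p * (1 - p) / n)


if __name__ == "__main__":
    for n in (265, 1_000):
        # Printed as percentage points, rounded to one decimal place.
        print(n, round(margin_of_error(n) * 100, 1))
```

With 265 images per group the margin works out to about six percentage points either way; with 1,000 per group it shrinks to about three. That is the gap between the sample used and the sample DarkCyber recalls as the norm.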

DarkCyber is interested in the “bias” allegation. But DarkCyber may be biased with regard to studies which make it possible to question sample size, training, and data quality/consistency. The models may have flaws, but the bias thing? Maybe, maybe not.

Stephen E Arnold, August 11, 2020

Need Global Financial Data? One Somewhat Useful Site

July 21, 2020

If you need a financial number, you may not have to dig through irrelevant free Web search results or use your Bloomberg terminal to find an “aggregate” function for the category in which you have an interest. Yippy.

Navigate to “All of the World’s Money and Markets in One Visualization.” As you know, my skepticism filter blinks when I encounter the logical Taser “all.” I also like to know where, when, how, and why certain data are obtained. The mechanism for normalizing the data is important to me as well. Well, forget most of those questions.

Look at the Web page. Pick a category. Boom. You have your number.

Accurate? Timely? Verifiable?

Not on the site. But in a “good enough” era of Zoom meetings, a number is available. Just in a picture.

Stephen E Arnold, July 21, 2020
