
September 20, 2020

One of the DarkCyber research team came across this chart on the Datawrapper Web site. Datawrapper provides millennial-ready analysis tools. With some data and the firm’s software, anyone can produce a chart like this one with green bars for negative numbers.

[Image: Datawrapper chart of Chicago job posting declines]

What is the chart displaying? The odd green bar shows the decline in job postings. What is the source of the data? Glassdoor, a job listings site. The data apply only to Chicago, Illinois. The time period is August 2020 versus August 2019. The idea is that the longer the bar, the greater the decline. But why is the bar green? Isn’t red a more suitable color for negative numbers? No idea.

Shown in this image are the top 12 sectors for job loss. To be clear, the longer the bar, the fewer job postings. Fewer job postings, one assumes, translates to reduced opportunities for employment.

What’s interesting is that accounting, consulting, information technology, telecommunications, and computer software and hardware are big losers. Those expensive MBAs, the lost hours studying for the CPA examination, and thumb typing through man pages are gone for now.

Observations:

  • The colors? Red maybe.
  • The decline in high technology work and knowledge work is interesting.
  • The “open jobs” numbers are puzzling. Despite declines, Chicago – the city of big shoulders and big challenges – has thousands of jobs in declining sectors.

Net net: IT and computer software and hardware look promising. The chart doesn’t do the opportunities justice. And the color?

Stephen E Arnold, September 20, 2020

Social Science: Like Astrology and Phrenology Perhaps?

September 15, 2020

I do not understand sociology. In 1962, I ended up in a class taught by an esteemed eccentric named Bruce Cameron, Ph.D. I had heard about his interest in short wave and drove past his home to observe the bed springs hanging on the front of his house. The idea, as I recall, was to improve radio reception. Those in the engineering department at the lousy university I attended shared the brilliant professor’s fascination with commercial bed technology at lunch. Even I as a clueless freshman (or is it now freshperson?) knew about the concept of buying an antenna from our local electronics shop.

In the remarkable Dr. Cameron’s Sociology 101 class, he posed the question, “Why do Eskimos wear mittens?” Today, the question would have to reference indigenous circumpolar people or another appropriate term. But in 1962, Eskimos was the go-to word.

I pointed out that I had seen in the Smithsonian Museum an exhibit of Eskimo hand wear and that there were examples of mittens with a finger component (trigger mitts or nord gauntlets), thus combining the warmth of a mitten with the needed dexterity to remove a harpoon from a baby seal.

He ignored my comment. The question turned up on our first examination, and I recycled my alleged learning from the Smithsonian information card for the exhibit.

I received zero credit for my answer. Bummer. I think that was the point at which I dismissed “sociology” and placed it and the good professor in the same pigeon hole I used for astrology and phrenology.

After reading “What’s Wrong with Social Science and How to Fix It: Reflections After Reading 2578 Papers,” I reaffirmed my skepticism of sociology and its allied fields:

But actually diving into the sea of trash that is social science gives you a more tangible perspective, a more visceral revulsion, and perhaps even a sense of Lovecraftian awe at the sheer magnitude of it all: a vast landfill—a great agglomeration of garbage extending as far as the eye can see, effluvious waves crashing and throwing up a foul foam of p=0.049 papers.
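The “p=0.049” jab refers to results that land just under the conventional 0.05 significance threshold. A quick simulation (my own illustration, not from the paper) shows why a sea of such papers is exactly what one expects when researchers test many hypotheses on noise and report the winners:

```python
import math
import random

random.seed(0)

def p_value(sample):
    """Two-sided p-value for H0: mean == 0, assuming known sigma == 1."""
    n = len(sample)
    z = (sum(sample) / n) * math.sqrt(n)
    return math.erfc(abs(z) / math.sqrt(2))

# 10,000 "studies" run on pure noise: no real effect exists anywhere.
false_positives = 0
runs = 10_000
for _ in range(runs):
    sample = [random.gauss(0, 1) for _ in range(30)]
    if p_value(sample) < 0.05:
        false_positives += 1

print(f"False positive rate: {false_positives / runs:.3f}")  # hovers near 0.05

# A "paper" that quietly tests 20 hypotheses and reports only the best one:
print(f"P(at least one p < 0.05 in 20 tests): {1 - 0.95**20:.2f}")
```

Roughly five percent of null results clear the bar by chance, and a paper that tries twenty analyses has about a two-in-three shot at a publishable p-value.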

The write up contains some interesting data. In reference to a citation graph, the paper points out why references to crappy research persist:

As in all affairs of man, it once again comes down to Hanlon’s Razor. Either:

  1. Malice: they know which results are likely false but cite them anyway.
  2. or, Stupidity: they can’t tell which papers will replicate even though it’s quite easy.

There is another reason: Clubs of so-called experts informally coordinate or simply do the “I will scratch your back if you scratch mine.”

What quasi-sociological field is doing its best to be less corrupt? Surprisingly, it is economics. Education seems to have some semblance of ethical behavior, at least based on this sample of papers. But maybe the sample is skewed.

The paper concludes with a list of suggestions. Useful, but I think the present pattern of lousy work is going to persist and increase.

Hang those bed springs on the side of the house. Works for “good enough” solutions.

Stephen E Arnold, September 15, 2020

Expertise Required: Interesting Assertion

September 14, 2020

One of the DarkCyber research team spotted “Lack of Expertise Is the Biggest Barrier for Implementing IoT Solutions.” The surprising assertion comes from Claris, an outfit owned by Apple and once known as FileMaker Inc. Clear? Clear as Claris.

The information in the write up presents an interesting assertion about the Internet of Things. An IoT device is a mobile phone or a gizmo that connects to the Internet; for example, an Anduril surveillance drone.

The interesting parts are the actual factual statements; for example:

  • 20 percent of “SMB leaders worry about security and privacy when implementing IoT. Furthermore, they don’t clearly see the return on investment.”
  • 67 percent believe IoT could bring them a competitive advantage and are saying their competitors are “doing more” with IoT at this time.
  • “SMB leaders mentioned improved efficiency, productivity and speed, while about a third see gathering business intelligence as the main driver towards IoT adoption.”
  • About 33 percent say “it’s likely their SMB will launch an IoT initiative within the next three years, while almost half added that their company was lagging behind the competition.”
  • 24 percent stated their project already yielded ROI, while 38 percent expect it to happen within a year.

Do we know the details of the study, the sample size, the methodology used to select those surveyed, or the statistical validity of the data? Of course not. That is what makes the fact so interesting. That and the need for “expertise.” Perhaps the data were tallied in FileMaker?

Stephen E Arnold, September 14, 2020

Be Smart: Live in Greenness

August 27, 2020

I do not want to be skeptical. I do not want to suggest that a study may need what might be called verification. Please, read “Residential Green Space and Child Intelligence and Behavior across Urban, Suburban, and Rural Areas in Belgium: A Longitudinal Birth Cohort Study of Twins.” To add zip to your learning, navigate to a “real” news outfit’s article called “Children Raised in Greener Areas Have Higher IQ, Study Finds.” Notice any significant differences?

First, the spin in the headline. The PLOS article points out that the sample comes from Belgium. How representative is this country when compared to Peru or Syria? How reliable are “intelligence” assessments? What constitutes bad behavior? Are these “qualities” subject to statistically significant variations due to exogenous factors?

I don’t want to do a line by line comparison of the write up which wants to ring the academic gong. Nor do I want to read how “real” journalists deal with a scholarly article.

I would point out this sentence in the scholarly article:

To our knowledge, this is the first study investigating the association between residential green space and intelligence in children.

Yeah, let’s not get too excited from a sample of 620 in Belgium. Skip school. Play in a park or wander through thick forests.
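For what it’s worth, the issue with 620 subjects is not statistical precision; it is generalizability. A back-of-envelope calculation (my own, assuming the conventional IQ standard deviation of 15 and a normal approximation) shows that 620 subjects pin down a mean fairly tightly:

```python
import math

def ci_half_width(sd, n, z=1.96):
    """Half-width of a 95% confidence interval for a mean (normal approx.)."""
    return z * sd / math.sqrt(n)

# Assumed IQ standard deviation of 15, the conventional norm.
for n in (50, 620, 5000):
    print(f"n={n:>5}: mean pinned down to about ±{ci_half_width(15, n):.2f} IQ points")

# The catch: a tight interval says nothing about representativeness.
# 620 Belgian twins estimate Belgian twins precisely; Peru is another matter.
```

A narrow error bar on the wrong population is still the wrong population.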

Stephen E Arnold, August 27, 2020

Bias in Biometrics

August 26, 2020

How can we solve bias in facial recognition and other AI-powered biometric systems? We humans could try to correct for it, but guess where AI learns its biases—yep, from us. Researcher Samira Samadi explored whether using a human evaluator would make an AI less biased or, perhaps, even more so. We learn of her project and others in BiometricUpdate.com’s article, “Masks Mistaken for Duct Tape, Researchers Experiment to Reduce Human Bias in Biometrics.” Reporter Luana Pascu writes:

“Curious to understand if a human evaluator would make the process fair or more biased, Samadi recruited users for a human-user study. She taught them about facial recognition systems and how to make decisions about system accuracy. ‘We really tried to imitate a real-world scenario, but that actually made it more complicated for the users,’ Samadi said. The experiment confirmed the difficulty in finding an appropriate dataset with ethically sourced images that would not introduce bias into the study. The research was published in a paper called A Human in the Loop is Not Enough: The Need for Human-Subject Experiments in Facial Recognition.”

Many other researchers are studying the bias problem. One NIST report found a lot of software that produced a 10-fold to 100-fold increase in the probability of Asian and African American faces being inaccurately recognized (though a few systems had negligible differences). Meanwhile, a team at Wunderman Thompson Data found tools from big players Google, IBM, and Microsoft to be less accurate than they had expected. For one thing, the systems had trouble accounting for masks—still a persistent reality as of this writing. The researchers also found gender bias in all three systems, even though the technologies used are markedly different.

There is reason to hope. Researchers at Durham University’s Computer Science Department managed to reduce racial bias by one percent and improve ethnicity accuracy. To achieve these results, the team used a synthesized data set with a higher focus on feature identification. We also learn:

“New software to cut down on demographic differences in face biometric performance has also reached the market. The ethnicity-neutral facial recognition API developed by AIH Technology is officially available in the Microsoft Azure Marketplace. In March, the Canadian company joined the Microsoft Partners Network (MPN) and announced the plans for the global launch of its Facial-Recognition-as-a-Service (FRaaS).”

Bias in biometrics, and AI in general, is a thorny problem with no easy solution. At least now people are aware of the issue and bright minds are working to solve it. Now, if only companies would be willing to delay profitable but problematic implementations until solutions are found. Hmmm.

Cynthia Murrell, August 26, 2020

Cloud Data: Clear with Rain Predicted for On Premises Hardware

August 21, 2020

I like surveys which provide some information about sample size. “Survey: How the Pandemic Is Shaking Up the Network Market” says that 2,400 information technology decision makers participated. How were these individuals selected? How was the survey conducted? When? These questions are not answered. Nevertheless, some of the findings seemed interesting.

One of the surprising factoids was the shift from a license for a period of time to a “subscription.” How many outfits are subscribing to cloud services? The write up reports:

The average proportion of IT services consumed via subscription will accelerate by 38% in the next two years, from 34% of the total today to 46% in 2022, and the share of organizations that consume a majority (over 50%) of their IT solutions ‘as a service’ will increase by approximately 72% in that time.
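A quick check of the quoted arithmetic (my own computation, not the survey’s) suggests the numbers do not quite line up, though rounding in the published figures could account for the gap:

```python
# Sanity check on the quoted figures (my arithmetic, not the survey's).
today, in_2022 = 34, 46   # percent of IT consumed "as a service"

relative_growth = (in_2022 - today) / today * 100
print(f"Relative growth from 34% to 46%: {relative_growth:.1f}%")
# The survey quotes 38%; the gap likely reflects rounding in the 34% and 46%.
```

The result is about 35.3 percent, not the 38 percent “acceleration” claimed.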

Automatic monthly payments and probably tricky cancellation policies will be part of the subscription business, but that’s just a hunch, not a survey finding.

Other items of interest included these factoids:

77% [of those responding to the survey] said that investments in networking projects had been postponed or delayed since the onset of COVID-19, and 28% indicated that projects had been cancelled altogether.

35% of ITDMs globally are planning to increase their investment in AI-based networking technologies, with the APAC region leading the charge at 44% (including 60% of ITDMs [the acronym which few probably know means “IT decision-makers”] in India and 54% in Hong Kong).

just 8% [of the sample] globally plan to continue with only CapEx investments.

Net net: Pricing and curtailing capital expenditures may be trends. If these data are accurate, the data suggest that companies targeting on premises sales of hardware may face some headwinds. Of course, I believe everything I read on the Internet, particularly objective surveys.

Stephen E Arnold, August 21, 2020

True or False: AI Algorithms Are Neutral Little Puppies

August 11, 2020

The answer, according to CanIndia News, is false. (I think some people believe this.) “Google, IBM, Microsoft AI Models Fail to Curb Gender Bias” reports:

new research has claimed that Google AI datasets identified most women wearing masks as if their mouths were covered by duct tapes. Not just Google. When put to work, artificial intelligence-powered IBM Watson virtual assistant was not far behind on gender bias. In 23 per cent of cases, Watson saw a woman wearing a gag while in another 23 per cent, it was sure the woman was “wearing a restraint or chains”.

Before warming up the tar and chasing geese for feathers, you may want to note that the sample was 265 men and 265 women. Note: The subjects were wearing Covid masks or personal protective equipment.

Out of the 265 images of men in masks, Google correctly identified 36 per cent as containing PPE. It also mistook 27 per cent of images as depicting facial hair.

The researchers learned that 15 per cent of images were misclassified as duct tape.

The write up highlights this finding:

Overall, for 40 per cent of images of women, Microsoft Azure Cognitive Services identified the mask as a fashion accessory compared to only 13 per cent of images of men.

Surprised? DarkCyber is curious about:

  1. Sample size. DarkCyber’s recollection is that the sample should have been in the neighborhood of 2,000 or so with 1,000 possible women and 1,000 possible men
  2. Training. How were the models trained? Were “masks” represented in the training set? What percentage of training images had masks?
  3. Image quality. What steps were taken to ensure that the “images” were of consistent quality; that is, focus, resolution, color, etc.
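The sample-size complaint can be made concrete. With 265 images per group, any reported percentage carries several points of sampling error (a back-of-envelope calculation of my own, assuming simple random sampling and a normal approximation):

```python
import math

def moe(p, n, z=1.96):
    """95% margin of error for a proportion (normal approximation)."""
    return z * math.sqrt(p * (1 - p) / n)

# The study's groups of 265 images versus DarkCyber's suggested 1,000.
for n in (265, 1000):
    print(f"n={n:>4}: a reported 40% is really 40% ± {moe(0.40, n) * 100:.1f} points")
```

At n=265 a headline figure like “40 percent” is uncertain by roughly six points either way; quadrupling the sample roughly halves that.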

DarkCyber is interested in the “bias” allegation. But DarkCyber may be biased with regard to studies which make it possible to question sample size, training, and data quality/consistency. The models may have flaws, but the bias thing? Maybe, maybe not.

Stephen E Arnold, August 11, 2020

Need Global Financial Data? One Somewhat Useful Site

July 21, 2020

If you need a financial number, you may not have to dig through irrelevant free Web search results or use your Bloomberg terminal to find an “aggregate” function for the category in which you have an interest. Yippy.

Navigate to “All of the World’s Money and Markets in One Visualization.” As you know, my skepticism filter blinks when I encounter the logical Taser “all.” I also like to know where, when, how, and why certain data are obtained. The mechanism for normalizing the data is important to me as well. Well, forget most of those questions.

Look at the Web page. Pick a category. Boom. You have your number.

Accurate? Timely? Verifiable?

Not on the site. But in a “good enough” era of Zoom meetings, a number is available. Just in a picture.

Stephen E Arnold, July 21, 2020

Data Visualizations: An Opportunity Converted into a Border Wall

May 18, 2020

I read “Understanding Uncertainty: Visualizing Probabilities.” The information in the article is useful. Helpful examples make clear how easy it is to create a helpful representation of certain statistical data.

The opportunity today is to make representations of numeric data, probabilities, and “uncertainty” more easily understandable.

The barrier is that “good enough” visualizations can be output with the click of a mouse. The graphic may be attractive, but it may distort the information allegedly presented in a helpful way.

But appearance may be more important than substance. Need examples? Check out the Covid19 “charts.” Most of these are confusing and ignore important items of information.

Good enough is not good enough.

Stephen E Arnold, May 18, 2020

Bayesian Math: Useful Book Is Free for Personal Use

May 11, 2020

The third edition of Bayesian Data Analysis (updated on February 13, 2020) is available at this link. The authors are Andrew Gelman, John B. Carlin, Hal S. Stern, David B. Dunson, Aki Vehtari, and Donald B. Rubin. With the Bayes’ principles in hand, making sense of some of the modern smart systems becomes somewhat easier. The book covers the basics and advanced computation. One of the more interesting sections is Part V: Nonlinear and Nonparametric Models. You may want to add this to your library.
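For a taste of the book’s core move—combine a prior with data to get a posterior—here is a minimal conjugate Beta-Binomial update (my own illustration of the standard textbook example, not code from the book):

```python
# Minimal Bayesian update: Beta prior + binomial data -> Beta posterior.
# The conjugate-pair shortcut: just add the counts to the prior parameters.

def beta_binomial_update(alpha, beta, successes, failures):
    """Posterior Beta parameters after observing binomial data."""
    return alpha + successes, beta + failures

# Uniform prior Beta(1, 1) on a coin's probability of heads.
alpha, beta = 1, 1

# Observe 7 heads in 10 flips.
alpha, beta = beta_binomial_update(alpha, beta, successes=7, failures=3)

posterior_mean = alpha / (alpha + beta)
print(f"Posterior: Beta({alpha}, {beta}), mean = {posterior_mean:.3f}")  # 0.667
```

The posterior mean of 8/12 sits between the prior guess (0.5) and the raw data (0.7), which is the Bayesian bargain in one line of arithmetic.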

Stephen E Arnold, May 11, 2020
