Digital 2021: Lots of Numbers

April 23, 2021

One of the Beyond Search team called my attention to the We Are Social / Hootsuite “Digital 2021 April Global Statshot Report.” The original link did not resolve. After a bit of clicking around, we did locate the presentation on the outstanding SlideShare service. No, the SlideShare search function did not work for us, but we know that it will return to its glory soon. Maybe real soon perhaps?

The report with the numbers is located at this link. If that doesn’t work, there is an index located at this link. If these go dead, you can try the We Are Social / Hootsuite explainer at this Datareportal link.

After that bit of housekeeping, what is the “Digital 2021 April Global Statshot Report”? The answer is that it is:

All the latest stats, insights, and trends you need to make sense of how the world uses the internet, mobile, social media, and ecommerce in April 2021. For more reports, including the latest global trends and in-depth local data for more than 240 countries and territories around the world, visit https://datareportal.com

As readers of this blog have heard, “all” is a trigger word. I want to know how many Dark Web encrypted message services are operated by state actors, not addled college students. Did I find the answer? Nope. So the “all” is baloney.

The report does provide assorted disclaimers and numerous big numbers; for example, 55.1 percent of 7,850,000,000 people are active social media users. Pretty darned exact. When I was on a trip to Wuhan, China, I was told by our government-provided guide, “No one is sure how many people live in Wuhan. There are different methods of counting.” If China can’t deal with counting, I am curious how precise numbers are generated for a global report. Eastern Asia (possibly China?) accounts for 25.1 percent of global Internet users by region. Probably doesn’t matter in the context of a 200-page report in PowerPoint format.
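To put the "pretty darned exact" remark in concrete terms, here is a minimal sketch of the arithmetic. The two input figures are the ones quoted above; the 1 percent error in the population count is purely my assumption for illustration and does not come from the report.

```python
# Figures quoted in the post.
WORLD_POPULATION = 7_850_000_000
SOCIAL_SHARE = 0.551

# Assumption for illustration only: the population count is off by 1 percent.
ASSUMED_RELATIVE_ERROR = 0.01


def implied_users(population: int, share: float) -> int:
    """Head count implied by applying a share to a population."""
    return round(population * share)


def user_range(population: int, share: float, relative_error: float) -> tuple[int, int]:
    """Low and high head counts if the population itself is uncertain."""
    low = implied_users(round(population * (1 - relative_error)), share)
    high = implied_users(round(population * (1 + relative_error)), share)
    return low, high


users = implied_users(WORLD_POPULATION, SOCIAL_SHARE)
low, high = user_range(WORLD_POPULATION, SOCIAL_SHARE, ASSUMED_RELATIVE_ERROR)

print(f"Implied social media users: {users:,}")
print(f"With a 1% population error: {low:,} to {high:,}")
print(f"Spread: {high - low:,} people")
```

Under that assumed 1 percent error, the spread runs to tens of millions of people, which suggests the three-digit percentage carries more apparent precision than the underlying head count can support.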

Other findings jumped out at me as I flipped through the deck, which has taken its inspiration from Mary Meeker’s Internet Trends Report, last seen in 2019:

  • Mobile users are 92.8 percent of the total number of Internet users and mobile phones account for 54.18 percent of Web traffic
  • The zippiest Internet is located in the UAE
  • Google’s search market share is 92.4 percent. Qwant, which allegedly caused Eric Schmidt to lose sleep, does not appear in the search engine market share table
  • 98 percent of Internet users visit or use social networks
  • TikTok is the 7th most used social platform but the data come from TikTok, an outfit which is probably the gold standard in reliable information.

The reportal document does not explain what these data mean.

Here’s my take: The data provide many numbers which make clear three points:

  1. Mobile is a big deal
  2. Facebook and Google are bigger deals
  3. Criminal activity within these data ecosystems warrants zero attention.

The reportal’s data are free too.

Stephen E Arnold, April 23, 2021

Artificial Intelligence: Maybe These Numbers Are Artificial?

February 25, 2021

AI this. AI that. Suddenly it’s spring time for algorithmic magic. I read “Worldwide Revenues for AI Skyrocket, Set to Reach $550B by 2024.” That’s an interesting projection. What is “artificial intelligence?” No one has a precise definition. That makes it possible to assert that in 22 months, smart software will be more than half way to a trillion dollar market. That will make the MBA proteins kick into overdrive.

The write up cites the estimable mid tier consulting firm IDC and its Worldwide Semiannual Artificial Intelligence Tracker. I believe that this may be similar to the PC Magazine editorial team sitting around a lunch table generating lists of hot products and numbers about the uptake of Windows 95. There is nothing wrong with projections. And estimates which aim toward a trillion dollar market are energizing in the Age of Rona.

The write up reports that IDC calculated with near infinite precision these outputs:

“the artificial intelligence (AI) market, including software, hardware, and services, are forecast to grow 16.4% year over year in 2021 to $327.5 billion… By 2024, the market is expected to break the $500 billion mark with a five-year compound annual growth rate (CAGR) of 17.5% and total revenues reaching $554.3 billion.”
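The quoted figures can be cross-checked with the standard compound annual growth rate formula. What follows is a rough sketch, not IDC's method: the quote does not say which base year the five-year CAGR uses, so the two periods computed here (2021 to 2024 and 2020 to 2024) are my assumptions, and the 2020 revenue is back-calculated from the stated 16.4 percent growth.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values over a number of years."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures from the quoted passage, in billions of US dollars.
REVENUE_2021 = 327.5
GROWTH_2021 = 0.164
REVENUE_2024 = 554.3

# Back-calculated 2020 revenue (an inference, not a number from the quote).
revenue_2020 = REVENUE_2021 / (1 + GROWTH_2021)

# Assumed periods; the quote does not state the base year.
rate_2021_2024 = cagr(REVENUE_2021, REVENUE_2024, 3)
rate_2020_2024 = cagr(revenue_2020, REVENUE_2024, 4)

print(f"Implied 2020 revenue: ${revenue_2020:.1f}B")
print(f"Implied CAGR 2021-2024: {rate_2021_2024:.1%}")
print(f"Implied CAGR 2020-2024: {rate_2020_2024:.1%}")
```

Neither assumed period lands exactly on the quoted 17.5 percent, which may simply mean the tracker starts its five-year window earlier. The point is that a decimal-place growth rate depends heavily on where one starts counting.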

Other findings (aside from the stretchy bendable fuzzy definition of “artificial intelligence” as including software, hardware, and services):

  • “Software represented 88% of the total AI market revenues in 2020. However, it is the slowest growing category with a five-year CAGR of 17.3%.”
  • “AI Applications took the largest share of revenue at 50% in 2020.”
  • “The AI Services category grew slower than the overall AI market with 13% annual revenue growth in 2020.”
  • “By 2024, AI Hardware is forecast to be a $30.5 billion market with AI Servers representing an 82% revenue share.”

Is AI a sandbox in which anyone can play? The data allegedly reveal:

In the Business Services for AI market, there were only four companies, Ernst & Young, PwC, Deloitte, and Booz Allen Hamilton, that generated revenues of more than $100 million in 1H 2020.

Okay, okay. Let’s step back:

  1. The definition of AI is nebulous which means that the assumptions are not exactly as solid as those of the new leaning Tower of Pisa in San Francisco
  2. The fuzzing of revenue streams, hardware, software, and the mushroom of services is confusing at least to me
  3. AI appears to be another of those one percenter sectors.

Net net: AI will use you whether you are ready or not or whether the systems work or not. We could ask IBM Watson but IBM is allegedly trying to sell its fantastic health care AI business. Googlers are busy revealing the flaws in some Googley assumptions about its AI capabilities. Nevertheless, we have big numbers.

VC, consultants, and MBAs, get ready to bill. By the way, these estimates seem similar to those issued by the estimable mid tier consulting firm for the cognitive search market. Not exactly a hole in one as I recall.

Stephen E Arnold, February 25, 2021

Music Research: Bach, Mozart, and Vivaldi Are Losers

February 18, 2021

Here’s a statement from “Techno Is the Genre Least Effective at Reducing Anxiety.” The statement is simple:

techno, dubstep and 70’s rock anthems the top three types of music that recorded an increase in their blood pressure.

Now read this statement:

Techno, dubstep and classical chill out were also the top three genres to increase heart rates among the volunteers.

Let’s try to figure this out:

  • Dubstep appears in each list
  • Techno appears in each list
  • 70s rock anthems appears in one list
  • Classical chill out appears in one list.

It seems that listening to any one of these types of music will pump up the heart rate and increase blood pressure. But no! Per the two lists, 70s rock anthems make the blood pressure top three but not the heart rate top three, and classical chill out does the reverse.

What do the data say?

“The study was conducted by the Vera Clinic, who also drafted in Doctor Ömer Avlanmış to review the results. Medically they make a lot of sense…”

Sure they do. Bach and Mozart are losers. The music should do more than just raise blood pressure. Is there a Chopin rave happening on Zoom soon? Yep, thumb typing research for the GenXers and Millennials. Sample selection methodology? Confidence? Analytic methods? Term definition? Ho ho ho.

Stephen E Arnold, February 18, 2021

Crazy Research for the Work from Home Crowd

December 16, 2020

I read — despite my inner voice shouting, no, no, no — “Australian Study Shows Working in Pajamas Does Not Hurt Productivity.” One summer session in graduate school, I had a roomie who slept without anything. Nifty, particularly when I had to observe this person sitting at the desk in the dorm before heading to class. Yeah, disgusting then and the memory is disgusting now.

The write up states:

When the study examined the effects wearing pajamas had on productivity and mental health, it found that wearing pajamas was associated with more frequent reporting of poorer mental health. For 59% of participants who wore pajamas during the day at least one day a week, they admitted their mental health declined while working from home, versus 26% of participants who did not wear pajamas while working from home.

The headline sort of misses the point.

But one of the flaws in the study is that the question, “Do you wear clothing when you sleep?” seems to have been ignored by the journalist and maybe the researchers in Sydney.

Key point: Pretty silly stuff. I want to know what percentage of the sample slept naked and then arose to work in a productive manner with a good mental attitude. Then I want to know, if a partner were present for the naked WFHers, what the impact of this behavior is on anyone able to look at this nude person perched in an Aeron with a laptop scrunched on their chest.

Got the picture?

Stephen E Arnold, December 16, 2020

Want to Manipulate Humans? Try These Hot Buttons

December 3, 2020

Okay, thumb typing marketers, insights from academia. Navigate to “We Are All Behavioral, More or Less: A Taxonomy of Consumer Decision Making.” The write up is available from Dartmouth, home of behavioral economists and psychologists and okay pizza.

The write up is 70 pages in length and chock full of jargon and academic thinking. Nevertheless, the author, one Victor Stango, reveals some suggestive information.

Here are a couple of examples:

Table 3. Correlations among behavioral biases, and between biases and other decision inputs offers insight into pairings of bias factors

Table 5. Rotated 8-factor models and loadings of decision inputs on common factors provides a “look up table” with values to help guide a sales pitch

The list of hot button factors includes:

  • Present bias
  • Choice type
  • Risk biases
  • Confidence
  • Math bias
  • Attention
  • Patience vs. risk aversion
  • Cognitive skills
  • Personality

Net net: Manipulate biases by combining factors. Launch those online marketing campaigns via social media with confidence, p-value lovers.

Stephen E Arnold, December 2, 2020

Surveys: These Marketing Devices Are Accurate, Right?

November 10, 2020

There’s nothing like a sample, a statistical sample, that is. What’s interesting is that the US polls seem to have been reflecting some interesting but marketing-type trends. The bastion of “real journalism”— the UK Daily Mail — published “…We Did a Good Job: Defiant Pollster Nate Silver Rushes to Defend His Profession after Another Systematic Failure of Polls in the Build-Up to an Election.” Bibliophiles will note that I have omitted the tasteful obscenity. I like to avoid using words likely to irritate the really smart software which edits blog posts.

The write up points out:

FiveThirtyEight founder and editor-in-chief Nate Silver hit back at those slamming the website for being so off with their election predictions.

Let’s think about why FiveThirtyEight and other polls seem to have predicted a reality different from the one generated by humanoids marking ballots.

First, there is the sample. Picking people at random is dependent on a number of factors: Sources, selection bias, humanoids who don’t respond, etc.

Second, there are the humanoids themselves. Some people plug in the “answers” which get the poll over with really fast. I lose interest at the first hint of dark patterns which make it tough to know how many questions I have to answer to get the coupon, pat on the head, or the free shopping sack.

Third, there is counting. Yep, with humans or machines, things can happen.

Fourth, there is analysis. It is remarkable what one can do when counting or doing “analytics.”
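For what it is worth, the textbook margin of error for a simple random sample covers only the first of these four items, and only in its idealized form. A minimal sketch follows, using the standard formula for a proportion at roughly 95 percent confidence; the sample sizes are hypothetical and are not taken from any poll mentioned here.

```python
import math


def margin_of_error(proportion: float, sample_size: int, z: float = 1.96) -> float:
    """Sampling margin of error for a proportion from a simple random sample.

    z = 1.96 corresponds to roughly 95 percent confidence.
    """
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


# Hypothetical sample sizes, chosen only for illustration.
for n in (500, 1_000, 2_000):
    moe = margin_of_error(0.5, n)
    print(f"n = {n:>5}: +/- {moe:.1%}")
```

The formula assumes every respondent was picked at random and answered honestly. Selection bias, non-response, rushed answers, and counting slips sit outside it, which is one reason a poll can miss by more than its published margin.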

The Daily Mail quotes an expert about making polls better:

‘The polling profession needs to reshape and reorganize their questionnaires,’ Luntz [the polling expert] told DailyMail.com. ‘It’s the only way they’ll ever get it right.’

But I keep thinking about the FiveThirtyEight obscenity. Defensive? Eloquent? Subjective? Insightful?

That subjective thing.

Stephen E Arnold, November 10, 2020

Spreadsheet Fever Case Example

October 12, 2020

I have been using the phrase “spreadsheet fever” to describe the impact fiddling with numbers in Microsoft Excel has on MBAs. With Excel providing the backbone for numerous statistical confections, the sugar hit of magic assumptions should not be underestimated. The mental structure of a crazed investment analyst brooks no interference from common sense.

“Excel: Why Using Microsoft’s Tool Caused Covid-19 Results to Be Lost” provides a possible case example of what happens when thumbtypers and over-confident innumerates tangle with a digital spreadsheet. No green eyeshades and no pencils needed. Calculators? One can hear a 22 year old ask, “What’s a calculator? I have one on my iPhone?”

The Beeb reports:

PHE [Public Health England, a fine UK entity] had set up an automatic process to pull this data together into Excel templates so that it could then be uploaded to a central system and made available to the NHS Test and Trace team, as well as other government computer dashboards.

And what tool did these over confident wizards use?

Microsoft Excel, the weapon of choice for business and STEM analysis, of course.

How did the experts wander off the information highway into a thicket of errors? The Beeb explains:

The problem is that PHE’s own developers picked an old file format to do this – known as XLS. As a consequence, each template could handle only about 65,000 rows of data rather than the one million-plus rows that Excel is actually capable of. And since each test result created several rows of data, in practice it meant that each template was limited to about 1,400 cases. When that total was reached, further cases were simply left off.
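The failure described in the quote is silent truncation: rows past the sheet limit vanish without complaint. Here is a minimal sketch of the guard that would have flagged it. The two row limits are the documented maximums for the XLS and XLSX worksheet formats; the rows-per-case figure is my assumption, inferred from the quote's "about 65,000 rows" and "about 1,400 cases," and the function names are hypothetical.

```python
# Documented worksheet row limits for the two Excel file formats.
XLS_MAX_ROWS = 65_536
XLSX_MAX_ROWS = 1_048_576

# Assumption: roughly 46 rows per case, inferred from the quoted
# "about 65,000 rows" and "about 1,400 cases".
ASSUMED_ROWS_PER_CASE = 46


def max_cases(max_rows: int, rows_per_case: int) -> int:
    """Whole cases that fit on one sheet."""
    return max_rows // rows_per_case


def rows_dropped(total_rows: int, max_rows: int) -> int:
    """Rows that would be lost if the data were written to one sheet."""
    return max(0, total_rows - max_rows)


def check_fits(total_rows: int, max_rows: int) -> None:
    """Fail loudly instead of letting rows fall off the end."""
    dropped = rows_dropped(total_rows, max_rows)
    if dropped:
        raise ValueError(f"{dropped:,} rows exceed the {max_rows:,}-row limit")


print(f"XLS cases per sheet:  {max_cases(XLS_MAX_ROWS, ASSUMED_ROWS_PER_CASE):,}")
print(f"XLSX cases per sheet: {max_cases(XLSX_MAX_ROWS, ASSUMED_ROWS_PER_CASE):,}")

try:
    check_fits(70_000, XLS_MAX_ROWS)
except ValueError as error:
    print(f"Refusing to write: {error}")
```

A check like this does not make a spreadsheet the right tool for the job. It only turns a quiet data loss into a noisy one.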

The fix? Can kicking perhaps:

But insiders acknowledge that the current clunky system needs to be replaced by something more advanced that excludes Excel, as soon as possible.

Righto.

Stephen E Arnold, October 12, 2020

Math Cheat Sheets

October 9, 2020

Since we live in a statistics charged world, there is a strong likelihood that you may want to brush up on math. Fear not. A collection of math cheat sheets is available without charge. “Probability Cheat Sheet – Harvard University” includes some links (good and bad). What does a cheat sheet from Harvard University with its modest endowment and measly seven percent return so far this year look like? Here’s a sample from the probability document:

[Image: sample from the Harvard probability cheat sheet]

Stephen E Arnold, October 9, 2020

9 21

September 20, 2020

One of the DarkCyber research team came across this chart on the Datawrapper Web site. Datawrapper provides millennial-ready analysis tools. With some data and the firm’s software, anyone can produce a chart like this one with green bars for negative numbers.

[Image: Datawrapper chart of Chicago job postings, August 2020 versus August 2019]

What is the chart displaying? The odd green bar shows the decline in job postings. Why green? No idea. What is the source of the data? Glassdoor, a job listings site. The data apply only to Chicago, Illinois. The time period is August 2020 versus August 2019. The idea is that the longer the bar, the greater the decline. Why is the bar green? Isn’t red a more suitable color for negative numbers?

Shown in this image are the top 12 sectors for job loss. To be clear, the longer the bar, the fewer job postings. Fewer job postings, one assumes, translates to reduced opportunities for employment.

What’s interesting is that accounting, consulting, information technology, telecommunications, and computer software and hardware are big losers. Those expensive MBAs, the lost hours studying for the CPA examination, and thumb typing through man pages are gone for now.

Observations:

  • The colors? Red maybe.
  • The decline in high technology work and knowledge work is interesting.
  • The “open jobs” numbers are puzzling. Despite declines, Chicago – the city of big shoulders and big challenges – has thousands of jobs in declining sectors.

Net net: IT and computer software and hardware look promising. The chart doesn’t do the opportunities justice. And the color?

Stephen E Arnold, September 20, 2020

Social Science: Like Astrology and Phrenology Perhaps?

September 15, 2020

I do not understand sociology. In 1962, I ended up in a class taught by an esteemed eccentric named Bruce Cameron, Ph.D. I had heard about his interest in short wave and drove past his home to observe the bed springs hanging on the front of his house. The idea, as I recall, was to improve radio reception. Those in the engineering department at the lousy university I attended shared the brilliant professor’s fascination with commercial bed technology at lunch. Even I as a clueless freshman (or is it now freshperson?) knew about the concept of buying an antenna from our local electronics shop.

In the remarkable Dr. Cameron’s Sociology 101 class, he posed the question, “Why do Eskimos wear mittens?” Today, the question would have to reference indigenous circumpolar people or another appropriate term. But in 1962, Eskimos was the go-to word.

I pointed out that I had seen in the Smithsonian Museum an exhibit of Eskimo hand wear and that there were examples of mittens with a finger component (trigger mitts or nord gauntlets), thus combining the warmth of a mitten with the needed dexterity to remove a harpoon from a baby seal.

He ignored my comment. The question turned up on our first examination, and I recycled my alleged learning from the Smithsonian information card for the exhibit.

I received zero credit for my answer. Bummer. I think that was the point at which I dismissed “sociology” and placed it and the good professor in the same pigeon hole I used for astrology and phrenology.

After reading “What’s Wrong with Social Science and How to Fix It: Reflections After Reading 2578 Papers,” I reaffirmed my skepticism of sociology and its allied fields:

But actually diving into the sea of trash that is social science gives you a more tangible perspective, a more visceral revulsion, and perhaps even a sense of Lovecraftian awe at the sheer magnitude of it all: a vast landfill—a great agglomeration of garbage extending as far as the eye can see, effluvious waves crashing and throwing up a foul foam of p=0.049 papers.

The write up contains some interesting data. In reference to a citation graph, the paper points out why references to crappy research persist:

As in all affairs of man, it once again comes down to Hanlon’s Razor. Either:

  1. Malice: they know which results are likely false but cite them anyway.
  2. or, Stupidity: they can’t tell which papers will replicate even though it’s quite easy.

There is another reason: Clubs of so-called experts informally coordinate or simply do the “I will scratch your back if you scratch mine.”

What quasi-sociological field is doing its best to be less corrupt? Surprisingly, it is economics. Education seems to have some semblance of ethical behavior, at least based on this sample of papers. But maybe the sample is skewed.

The paper concludes with a list of suggestions. Useful, but I think the present pattern of lousy work is going to persist and increase.

Hang those bed springs on the side of the house. Works for “good enough” solutions.

Stephen E Arnold, September 15, 2020
