Quantitative vs Qualitative Data, Defined
January 4, 2022
Sounding almost philosophical, The Future of Things posts, “What is Data? Types of Data, Explained.” Distinguishing between types of data can mean many things. One distinction we are always curious about is what data does one emit via mobile devices and what types are feeding the surveillance machines? This write-up, though, is more of a primer on a basic data science concept: the difference between quantitative and qualitative data. Writer Kris defines quantitative data:
“As the name suggests, quantitative data can be quantified — that is, it can be measured and expressed in numerical values. Thus, it is easy to manipulate quantitative data and represent it through statistical graphs and charts. Quantitative data usually answers questions like ‘How much?’, ‘How many?’ and ‘How often?’ Some examples of quantitative data include a person’s height, the amount of time they spend on social media and their annual income. There are two key types of quantitative data: discrete and continuous.”
Here is the difference between those types of quantitative data: Discrete data cannot be divided into parts smaller than a whole number, like customers (yikes!) or orders. Continuous data is measured on a scale and can include fractions; the height or weight of a product, for example.
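To make the distinction concrete, here is a minimal Python sketch. The sample values and the `is_discrete` helper are invented for illustration and do not come from the cited article. The whole-number check is only a rough heuristic, since a continuous measurement can land on a whole number by coincidence.

```python
# Hypothetical sample values, invented for illustration only.
orders_per_day = [12, 7, 9, 12]          # discrete: whole-number counts
product_weight_kg = [1.25, 0.8, 2.475]   # continuous: any value on a scale


def is_discrete(values):
    """Heuristic: treat data as discrete when every value is a whole number."""
    return all(float(v).is_integer() for v in values)


orders_discrete = is_discrete(orders_per_day)
weights_discrete = is_discrete(product_weight_kg)
print(orders_discrete, weights_discrete)
```

In practice the type is decided by what is being measured (counts versus measurements), not by inspecting the values.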
Kris goes on to define qualitative data, which is harder to process and analyze but can provide insights that quantitative data simply cannot:
“Qualitative data … exists as words, images, objects and symbols. Sometimes qualitative data is called categorical data because the information must be sorted into categories instead of represented by numbers or statistical charts. Qualitative data tends to answer questions with more nuance, like ‘Why did this happen?’ Some examples of qualitative data business leaders might encounter include customers’ names, their favorite colors and their ethnicity. The two most common types of qualitative data are nominal data and ordinal data.”
As the names suggest: Nominal data names a piece of data without assigning it any order in relation to other pieces of data. Ordinal data ranks bits of information in an order of some kind. The difference is important when performing any type of statistical analysis.
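One way to see why the difference matters for analysis is to look at which summary statistics make sense for each type. The sketch below uses only Python's standard library. The survey answers and the `RANK` ordering are assumptions made up for this example. Nominal data supports a mode (most common value) but not a median, while ordinal data supports both once its order is stated.

```python
from statistics import mode, median_low

# Hypothetical survey answers, invented for illustration only.
favorite_colors = ["blue", "green", "blue", "red"]            # nominal: no order
satisfaction = ["low", "high", "medium", "high", "medium"]    # ordinal: ranked

# Assumed ranking for the ordinal scale, lowest to highest.
LEVELS = ["low", "medium", "high"]
RANK = {level: position for position, level in enumerate(LEVELS)}

# Nominal data: the most common value is meaningful, a "middle" value is not.
most_common_color = mode(favorite_colors)

# Ordinal data: convert labels to ranks, take the median rank, map it back.
ranks = [RANK[answer] for answer in satisfaction]
median_satisfaction = LEVELS[median_low(ranks)]

print(most_common_color, median_satisfaction)
```

Note that the labels are converted to ranks before taking the median. Sorting the raw strings would order them alphabetically, which is exactly the mistake of treating ordinal data as nominal.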
Cynthia Murrell, January 4, 2022
Google: Who Us? Oh, We Are Sorry
January 4, 2022
One sign of a decent human being is when they admit their mistakes and accept their responsibilities. When people accept their mistakes, the situation blows over quicker. Google leaders, however, are reluctant to accept the consequences of their poor actions and it is not generating good PR. The lo shares the story in: “Google CEO Blames Employee Leaks To The Press For Reduced ‘Trust and Candor’ At The Company.”
During a recent end-of-year meeting, Google employees could submit questions via the internal company system Dory and then vote on the questions they wished management to answer. The following question received 673 votes:
“The question was: ‘It seems like responses to Dory have gotten increasingly more lawyer-like with canned phrases or platitudes, which seem to ignore the questions being ask [sic]. Are we planning on bringing candor, honesty, humility and frankness back to Dory answers or continuing down a bureaucratic path?’”
Google CEO Sundar Pichai was exasperated and he blamed employees leaking information to the media for the inflated, artificial answers. He said the following:
“‘Sometimes, I do think that people are unforgiving for small mistakes. I do think people realize that answers can be quoted anywhere, including outside the company. I think that makes people very careful,’ he said. ‘Trust and candor has to go both ways,’ Pichai added.”
Pichai also explained that the poor relationship between employees and the top brass is a direct result of Google’s large size and the pandemic. Google employees, however, were upset with their leaders prior to the COVID-19 pandemic. They stated they were frustrated with how Google handled sexual harassment complaints, lack of diversity issues, and sexism. Google employees formed their first union in January 2021.
Pichai would do better to admit Google has problems and actively work on fixing them. It would make him and the company appear positive in the media, not to mention improve relationships with his employees.
Whitney Grace, January 4, 2022
Tech Giants: We Do What We Want. Got That?
January 3, 2022
I spotted “AT&T, Verizon Refuse US Request to Delay 5G Launch.” The main point of the story is that two big Baby Bells (remember them?) are showing their Bell Telephone DNA. The story states:
AT&T Inc. and Verizon Communications Inc. rejected a request from the U.S. federal transportation officials to delay their planned launch on January 5 of a new variation of 5G wireless services.
The US government is concerned that those outstanding 5G wave forms could have a negative impact on air traffic. I think that this means “cause crashes.” Of course, I am probably incorrect. However, the US government is worried the allegedly zippy 5G might disrupt a device, maybe a passenger’s pacemaker, or create interference when a pilot checks something on an official Boeing certified iPad.
Several observations have surfaced among my Beyond Search and DarkCyber teams:
- The government is late to the game… again. Lateness means either failing with the big tech crowd or getting a detention slip in the form of zero technical support for the annoying official
- Big tech makes clear that the US government is irrelevant and will do what it wants. The drill is outrage, hearing, an apology, and then no changes
- Significant encouragement for outfits like Amazon, Apple, Facebook, and Google to move forward: Deals with China, predatory pricing, cooperation on certain technical matters, and maintaining these firms’ alleged monopolies.
Net net: Quite a way to start 2022 because ignoring the 5G issue signals product managers to amp up their methods in order to generate more revenue.
Stephen E Arnold, January 3, 2022
How about That Smart Software?
January 3, 2022
In the short cut world of training smart software, minor glitches are to be expected. When an OCR program delivers 95 percent accuracy, that works out to five mistakes in every 100 words. When Alexa tells a child to put a metal object into a home electrical outlet, what do you expect? This is close enough for horseshoes.
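The OCR arithmetic can be sketched in a few lines of Python. The function names are made up for this example. The second calculation rests on two assumptions not found in the post: that the 95 percent figure is measured per character rather than per word, and that errors are independent with an average word length of five characters.

```python
def expected_errors(accuracy, units):
    """Expected number of wrong units (words or characters) at a given accuracy."""
    return (1 - accuracy) * units


def word_accuracy(char_accuracy, chars_per_word):
    """Chance a whole word is correct if each character is independently correct."""
    return char_accuracy ** chars_per_word


# 95 percent word-level accuracy over 100 words.
errors_per_100_words = expected_errors(0.95, 100)

# Assumption: 95 percent is per character and words average five characters.
five_char_word_accuracy = word_accuracy(0.95, 5)

print(round(errors_per_100_words, 2), round(five_char_word_accuracy, 3))
```

Under those assumptions, a per-character figure of 95 percent leaves far fewer than 95 of every 100 words intact, so the "five mistakes" estimate holds only if accuracy is counted per word.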
Now what about the Google Maps of today, a maps solution which I find almost unusable. “Google Maps May Have Led Tahoe Travelers Astray During Snowstorm” quoted a Tweet from a person who is obviously unaware of the role probabilities play in the magical world of Google. Here’s the Tweet:
This is an abject failure. You are sending people up a poorly maintained forest road to their death in a severe blizzard. Hire people who can address winter storms in your code (or maybe get some of your engineers who are stuck in Tahoe right now on it).
Big deal? Of course not, Amazon and Google are focused on the efficiencies of machine-centric methods for identifying relevant, on point information. The probability is that most of the Amazon and Google outputs will be on the money. Google Maps rarely misses on pizza or the location of March Madness basketball games.
Severely injured children? Well, that probably won’t happen. Individuals lost in a snow storm? Well, that probably won’t happen.
The flawed methods of these giant firms are correct, from these companies’ point of view, in the majority of cases. A terminated humanoid or a driver wondering if a friendly forest ranger will come along the logging road? Not a big deal.
What happens when these smart systems output decisions which have ever larger consequences? Autonomous weapons, anyone?
Stephen E Arnold, January 3, 2022
Sentiment Analysis: A Comparison with Jargon
January 3, 2022
For anyone faced with choosing a sentiment extraction method, KD Nuggets offers a useful comparison in, “Sentiment Analysis API vs Custom Text Classification: Which One to Choose?” Data consultant and blogger Jérémy Lambert used a concrete dataset to demonstrate the pros and cons of each approach. For sentiment analysis, his team tested out Google Cloud Platform Natural Language API, Amazon Web Service Comprehend, and Microsoft Azure Text Analytics. Of those, Google looks like it performed the best. The custom text classification engines they used were Google Cloud Platform AutoML Natural Language and Amazon Web Service Comprehend Custom Classification. Lambert notes there are several other custom classification options they could have used, for example Monkey Learn, Twinwords, and Connexun. We observe no specialized solutions like Lexalytics were considered.
Before diving into the comparison, Lambert emphasizes it is important to distinguish between sentiment analysis and custom text classification. (See the two preceding links for more in-depth information on each.) He specifies:
“Trained APIs [sentiment analysis engines] are based on models already trained by providers with their databases. These models are usually used to manage common use cases of: sentiment analysis, named entity recognition, translation, etc. However, it is always relevant to try these APIs before custom models since they are more and more competitive and efficient. For specific use cases where a very high precision is needed, it may be better to use AutoML APIs [custom text classification engines]. … AutoML APIs allow users to build their own custom model, trained on the user’s database. These models are trained on multiple datasets beforehand by providers.”
See the write-up for details on use cases, test procedures, performance results, and taxi-meter pricing. For those who want to skip to the end, here is Lambert’s conclusion:
“Both alternatives are viable. The choice between Sentiment Analysis API and Custom text classification must be made depending on the expected performance and budget allocated. You can definitely reach better performance with custom text classification but sentiment analysis performance remains acceptable. As shown in the article, sentiment analysis is much cheaper than custom text classification. To conclude, we can advise you to try sentiment analysis first and use custom text classification if you want to get better accuracy.”
Cynthia Murrell, January 3, 2022
Add Metal Detectors To Hacked Items List
January 3, 2022
It is a horrifying (and not surprising) fact that with the correct technology skills, bad actors can hack into anything. The obvious targets are security cameras, financial institution systems, mobile devices, and now metal detectors. Gizmodo reports that, “Walk-Through Metal Detectors Can Be Hacked, New Research Finds.”
Metal detectors are key security tools used by airports, convention centers, banks, schools, prisons, government buildings, and more. White hat researchers discovered that Garrett-manufactured metal detectors contain nine software vulnerabilities. Hackers can exploit these security flaws to take the detectors offline, alter their data, or otherwise disrupt their functionality.
Garrett received bad news about the vulnerability:
“Unfortunately, according to researchers with Cisco Talos, Garrett’s widely used iC module is in trouble. The product, which provides network connectivity to two of the company’s popular walk-through detectors (the Garrett PD 6500i and the Garrett MZ 6100), basically acts as a control center for the detector’s human operator: using a laptop or other interface, an operator can use the module to remotely control a detector, as well as engage in “real-time monitoring and diagnostics,” according to a website selling the product.”
The good news is that if Garrett updates its software, the security threats are neutralized. Bad actors exploit weaknesses for money, fame, and fun. It would be within their wheelhouse to shut down metal detectors in a major airport or important government building to see the resulting chaos. Knowing the mentality of these bad actors, they would be stupid enough to brag about it online.
Whitney Grace, January 3, 2022
The Value of Turning Off Malware Scanning: Allow Exchange to Function?
January 1, 2022
Happy New Year. Problems with Microsoft Exchange 2019? The fix is quite special and you can get some suggestions for getting mail working again from Reddit’s sysadmin forum. Try this link to learn how to bypass the malware engine. The trick is to disable malware scanning or use the bypass method described in the Reddit post.
Several thoughts:
- Useful issue for computer science classes in certain countries unfriendly toward the US to explore
- There is room for improvement in Microsoft software quality control processes
- This Microsoft Exchange issue matches nicely with netlogon and no-auth exchange RCE missteps.
Here’s the link to the fix: https://bit.ly/3FPNBYc
Outstanding work, Microsoft.
PS. The Register added another MSFT Happy New Year in its post “Going Round in Circles with Windows in Singapore.” There is an illustration of the helpful, detailed, extremely useful error notification. Outstanding work, price war cloud people called Redmondians.
Stephen E Arnold, January 1, 2022