Datasette: Useful Tool for Crime Analysts

February 15, 2023

If you want to explore data sets, you may want to take a look at Datasette, the self-described “open source multi-tool for exploring and publishing data.” Think of it as a Swiss Army knife for data sets.

The project’s Web site says,

It helps people take data of any shape, analyze and explore it, and publish it as an interactive website and accompanying API. Datasette is aimed at data journalists, museum curators, archivists, local governments, scientists, researchers and anyone else who has data that they wish to share with the world. It is part of a wider ecosystem of 42 tools and 110 plugins dedicated to making working with structured data as productive as possible.

A handful of demos are available. Worth a look.
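For an analyst who wants to kick the tires, here is a minimal sketch. Datasette works on SQLite files, so the sketch uses Python’s standard sqlite3 module to build a small database. The file name, table name, columns, and rows are all invented for illustration; none of them come from the Datasette documentation.

```python
import sqlite3


def build_incident_db(path=":memory:"):
    """Create a tiny SQLite database of made-up incident records."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS incidents ("
        "id INTEGER PRIMARY KEY, category TEXT, district TEXT, reported TEXT)"
    )
    # Hypothetical rows, for illustration only.
    rows = [
        ("burglary", "North", "2023-02-01"),
        ("burglary", "South", "2023-02-03"),
        ("vehicle theft", "North", "2023-02-04"),
    ]
    conn.executemany(
        "INSERT INTO incidents (category, district, reported) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
    return conn


def count_by_category(conn):
    """Return (category, count) pairs, most frequent first."""
    sql = (
        "SELECT category, COUNT(*) AS n FROM incidents "
        "GROUP BY category ORDER BY n DESC, category"
    )
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    conn = build_incident_db("incidents.db")
    print(count_by_category(conn))
```

Once the database is saved to a file, running `datasette incidents.db` from the command line should serve it locally with a browsable Web interface and JSON output for the table.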

Stephen E Arnold, February 15, 2023

Going after the Original Entitled Wizards of Wonder: Blue Chip Consultants

February 14, 2023

I read “The McKinseys and the Deloittes Have No Expertise in the Areas That They’re Advising In.” I think the phrase “that they’re advising in” would make some old-school editors uncomfortable. But grammar and usage aside, the Financial Times, the odd orange newspaper, has identified what might be called “The Once Emperor-Like Are Naked So Let’s Put Them on TikTok.” Well, not TikTok, but the blue chip consultants are in the spotlight for a short time.

I noted this passage in an “interview” with the author of the book The Big Con, by Mariana Mazzucato and Rosie Collington:

The Big Con of the book’s title is not a crime; it’s a confidence trick. Consultancies and outsourcers, Mazzucato argues, know less than they claim, cost more than they seem to, and — over the long term — prevent the public sector developing in-house capabilities.

The article presents as “real” financial news something that most former employees of blue chip firms know: “Get smart about nanotechnology. We have a client meeting at 9 am.” The sentence is emitted from a sleek partner at 6:15 pm on a Wednesday evening. Yep, that’s why professionals at blue chip consulting firms get paid reasonable money: Get smart fast, spout sentences which seem to be spontaneously helpful, and nod at the right places. The goal: Close a job and start billing and scope change and bill some more.

I liked this statement in the article:

“These are private companies, the McKinseys and the Deloittes, that have no expertise in the areas that they’re advising in.”

Accurate? Yep. Will blue chip consulting firms change? Nope. Will those who hire blue chip consulting firms change their ways? No.

But why?

We can answer the question next week by getting our consulting firm to lead a discussion with the government staff involved in determining next steps. Those next steps require defining a project, a statement of work, a procurement or an add on to an existing contract, and billing.

In short, a validation of the superior intellect of the blue chip firms and their wizards of wonder.

Stephen E Arnold, February 14, 2023

Is Your Doctor Good at Statistical Analysis? Sure, Sure, Not to Worry

February 14, 2023

I spotted an interesting story titled “FDA Has Now Cleared More Than 500 Healthcare AI Algorithms.” The write up states:

There are now more than 520 market-cleared artificial intelligence (AI) medical algorithms available in the United States, according to the U.S. Food and Drug Administration (FDA) as of January 2023. The vast majority of these are related to medical imaging.

Missing are Google’s method for solving death and the IBM Watson cancer solutions.

Another factoid in the article is that smart software in non-clinical areas is blooming. The niches with action are:

Population health
Health tracking apps
Identifying and addressing gaps in health equity
Revenue cycle management streamlining
Hospital-wide monitoring for length of stay, bed turn over rates, early sepsis detection and readmissions
Data analytics for key performance indicators
Enabling better patient wellness and preventative care

The item I found intriguing is:

When reviewing AI products, the FDA’s Center for Devices and Radiological Health (CDRH) is considering a total product lifecycle-based regulatory framework for these technologies that would allow for modifications to be made from real-world learning and adaptation, while ensuring that the safety and effectiveness of the software as a medical device are maintained. Such a regulatory framework could enable the FDA and manufacturers to evaluate and monitor a software product from its premarket development to post-market performance. This approach could allow for the FDA’s regulatory oversight to embrace the iterative improvement power of AI and ML-based software as a medical device, while assuring patient safety.

The FDA does a superb job. It makes perfect sense that the government agency can embrace smart software. No problemo.

Stephen E Arnold, February 14, 2023

The Chinese Balloon: A Legacy of Loon?

February 14, 2023

Several of my monographs about the Google relied on text and link analysis of Google patent applications and patents. One of the patents was “Balloon Altitude Control Using Density Adjustment and/or Volume Adjustment.” You may recall the Loon Balloon, a project to provide Internet access in locations where landlines or other delivery mechanisms were either not affordable, safe, or accessible due to an issue. (An “issue” could be a war, disease, or a populace with a leader who was un-Googley.)

What is the likelihood that China’s use of a Loon-like invention described in US9033274B2 coupled with some smart software has allowed the Middle Kingdom to enable global activities? Balloons can carry explosive devices, surveillance equipment, or electronics designed to screw up a range of electrical (wave) centric systems.

Probably just a random connection my brain generated. Probably nothing significant.

Stephen E Arnold, February 13, 2023

Summarize for a Living: Should You Consider a New Career?

February 13, 2023

In the pre-historic age of commercial databases, many people earned money by reading and summarizing articles, documents, monographs, and consultant reports. In order to prepare and fact check a 130-word summary of an article in the Harvard Business Review in 1980, the cost to the database publisher worked out to something like $17 to $25 per summary for what I would call general business information. (If you want more information about this number, write benkent2020@yahoo.com, and maybe you will get more color on the number.) Flash forward to the present: The cost for a human to summarize an article in the Harvard Business Review has increased. That’s why it is necessary to pay to access and print an abstract from a commercial online service. Even with yesterday’s technology, the costs are a killer. Now you know why software that eliminates the human, the editorial review, the fact checking, and the editorial policies which define what is indexed, why, and what terms are assigned is a joke to many of those in the search and retrieval game.

I mention this because if you are in the A&I business (abstracting and indexing), you may want to take a look at HunKimForks ChatGPT Arxiv Extension. The idea is that ChatGPT can generate an abstract which is certainly less fraught with cost and management hassles than running one of the commercial database content generation systems dependent on humans, some with degrees in chemistry, law, or medicine.

Are the summaries any good? For the last 40 years abstracts and summaries have been, in my opinion, degrading. Fact checking is out the window along with editorial policies, style guidelines, and considered discussion of index terms, classification codes, time handling and signifying, among other useful knowledge attributes.

Three observations:

  1. Commercial database publishers may want to check out this early-days open source contribution
  2. Those engaged in abstracting, writing summaries of books, and generating distillations of turgid government documents (yep, blue chip consulting firms I am thinking of you) may want to think about their future
  3. Say “hello” to increasingly inaccurate outputs from smart software. Recursion and liquid algorithms are not into factual stuff.

Stephen E Arnold, February 13, 2023

Modern Research Integrity: Stunning Indeed

February 13, 2023

I read “The Rise and Fall of Peer Review.” The essay addresses what happens when a researcher submits a research paper to a research journal. Many “research” journals are owned by big professional publishing companies. If you are not familiar with that sector, think about a publishing club which markets to libraries and “research” institutions. No articles in “research” publications, no promotion. The method for determining accuracy is to ask experts to read submitted papers, make comments, and send a signal about value of the “research.” I served on the peer review panel for a year and quit. I am no academic, but I know doo doo when it is on my shoe.

Now I want to focus on one passage. Consider this statement:

Why don’t reviewers catch basic errors and blatant fraud? One reason is that they almost never look at the data behind the papers they review, which is exactly where the errors and fraud are most likely to be. In fact, most journals don’t require you to make your data public at all. You’re supposed to provide them “on request,” but most people don’t. That’s how we’ve ended up in sitcom-esque situations like ~20% of genetics papers having totally useless data because Excel autocorrected the names of genes into months and years. (When one editor started asking authors to add their raw data after they submitted a paper to his journal, half of them declined and retracted their submissions. This suggests, in the editor’s words, “a possibility that the raw data did not exist from the beginning.”)
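The Excel problem in that passage is one a reviewer could screen for if the data were available. Here is a rough sketch of the idea, assuming a column of gene symbols has been exported as plain text. The pattern and the function name are my own invention, and the check is a crude heuristic, not an established validation method.

```python
import re

# Month abbreviations that Excel tends to produce when it turns a gene
# symbol such as MARCH1 or SEPT2 into a date such as "1-Mar" or "2-Sep".
MONTHS = "Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec"

# Matches "1-Mar" style or "Sep-02" style values, ignoring case.
DATE_LIKE = re.compile(
    rf"^(\d{{1,2}})-({MONTHS})$|^({MONTHS})-(\d{{1,2}})$",
    re.IGNORECASE,
)


def flag_date_like(values):
    """Return the entries that look like dates rather than gene symbols."""
    return [v for v in values if DATE_LIKE.match(v.strip())]


if __name__ == "__main__":
    column = ["TP53", "1-Mar", "Sep-02", "BRCA1", "2-Sep"]
    print(flag_date_like(column))
```

A hit does not prove the data are useless, but it is the sort of five-minute check that cannot happen when the raw data never reach the reviewer.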

Observations:

  1. There is exactly one commercial database which added corrections to its entries. Why? Accuracy is expensive and most publishers are not into corrections. I think the feature of that database has been in the trash heap for many, many years. The outfit which bought the database is not into excellence in anything but revenue and profit.
  2. I found it impossible [a] to reach the author to whom I wanted to address a question directly, that is, on the telephone, or [b] to get the data on which the crazy statistical hoops were displayed. Hey, math is not the key differentiator for many researchers; getting tenure and grants are the prime movers. A peer reviewer with pointed questions? Sorry, no way.
  3. The professional publishers want to follow a process which shifts responsibility for publishing error-filled articles to the “procedure”, the peer reviewers, the editors, and probably the stray dog outside their headquarters. Everyone is responsible for mistakes except them.

Net net: Perhaps the notion of open source accuracy needs to be expanded beyond tweets and Facebook posts?

Stephen E Arnold, February 14, 2023

Prabhakar in Paris: An Expensive Google Trip

February 13, 2023

Paris has good restaurants, and it has quite a few alert, well-educated people. So why did Google take the Prabhakar Smart Search Show to the City of Light? “Google Employees Criticize CEO Sundar Pichai for Rushed, Botched Announcement of GPT Competitor Bard” does not have an answer for me or for others either.

The write up states:

Staffers took to the popular internal forum Memegen [an in house Google thing] to express their thoughts on the Bard announcement, referring to it as “rushed,” “botched” and “un-Googley,” according to messages and memes viewed by CNBC.

But here’s the killer comment:

During Google’s Wednesday event, search boss Prabhakar Raghavan briefly shared some slides with examples of Bard’s capabilities. People tuning in expected to hear more, and some employees weren’t even aware of the event. One presenter forgot to bring a phone that was required for the demo. Meanwhile, people on Twitter began pointing out that an ad for Bard offered an incorrect description of a telescope used to take the first pictures of a planet outside our solar system.

Is Prabhakar the Red Skelton of smart software infused search? By the way, the turning point for Googzilla was the interaction between the company and Dr. Timnit Gebru. If you have not read the stochastic parrot, you may find it interesting.

Polly want Google management to be organized? Squawk:

Dear Sundar, the Bard launch and the layoffs were rushed, botched, and myopic…. [now make parrot sounds]

The next high school reunion for Sundar and Prabhakar will be interesting indeed.

Stephen E Arnold, February 13, 2023

Google Shows Its Smarts by Trimming Its Market Value

February 10, 2023

The title of this blog is Beyond Search. More than a decade ago, I wanted to have a place to put my observations about search and retrieval. Retirement was coming, and I was unable to put criticism of search baloney in the write ups I was paid to do. (Nope, I won’t name the publication.)

Art generated and probably owned by Craiyon, Dreamtime, Getty, Alamy, Shutterstock and any other outfit looking to make a buck surfing on legal water droplets. I sure did not create this picture.

That’s why I have not been going head over heels with the smart software revolution. I now point to articles that offer something I find either interesting, amusing, or certifiably whacky. Today, I want to call your attention to a statement I quite like which appeared in “Google Bard or Google Storyteller”. Here’s the quote:

The problem here isn’t just the mistake. It’s the fact that this mistake was highlighted as an example of what Google Bard could accomplish. Before releasing this information, there were likely many people involved at Google. None were competent enough to fact-check what they wanted to show the world. This is not only embarrassing, but it also casts many doubts about Google’s internal checks on its products and shows an astounding level of amateurism for one of the biggest companies in the world.

Do you recall the antics of Abbott and Costello or the Three Stooges? I wonder if this slip betwixt cup and lip is the first program of the 2023 season for the Sundar and Prabhakar Show, sponsored by Microsoft and OpenAI where you just Bing it!

I can hear the announcer saying,

“With Jeff Dean, Marcus White, and special guest stars Larry Page and Sergey Brin. Here are Sundar and Prabhakar, who have just returned from a meeting at what’s left of Charlie’s Café where the talented duo were discussing smart search. We join Sundar and Prabhakar in the once glorious dining facility…”

What would the comedy script generated by Bard say? I don’t want to know because that loss in market value was a hoot appropriate for a thunder lizard with a broken leg in the snow.

Stephen E Arnold, February 10, 2023

Google Is Busy: What about YouTube Filtering?

February 10, 2023

We noted a reference to a video produced by Wendover Productions. I know zero about the outfit’s videos, but one caught my eye. The video is “How to Illegally Cross the Mexico-US Border.” The video runs 14 minutes and has been online for more than a year. The company describes its “aboutness” this way:

Wendover Productions is all about explaining how our world works. From travel, to economics, to geography, to marketing and more, every video will leave you with a little better understanding of our world. New videos go out every other Tuesday.

The video does not strike me as particularly helpful. But there are some interesting factoids:

  1. Barriers protect about one-third of the border
  2. Walk across is possible in certain locations. Maybe Brownsville, Texas
  3. Cross in remote areas but a walk of 20 to 30 miles may be necessary
  4. Humanitarian groups have set up water stations in certain areas

New methods of dealing with certain border infractions are in place and some like the Anduril tower and drone method seem to have some value. The situation, of course, determines what steps are taken and what methods are employed by authorized officials.

I wanted to highlight this video as I have those which provide information about how to obtain and hack commercial software.

My question is, “Has Google been sufficiently distracted by bonuses, possible revenue shrinkage, and Code Red cartwheels to have ignored what appears to be information facilitating illegal activity?”

Another possibility is that Google’s method of identifying certain types of content like stealing software and entering the US illegally is of little interest to Googlers or its smart software.

A related question is, “What will slip through the gaps in the Bard system?”

Stephen E Arnold, February 10, 2023

The Microsoft Vision: Agent-Intermediated Computing

February 9, 2023

Not one single input from smart software. — Editor

[Mise en scène] Googzilla is towering over advertisers, hissing and brandishing its long talons. Then the creature turns its head. Its tiny ear slits twitch. He hears a sound like “boink” or “thunk.” Distracted, the 25 year-old terror looks from the cowering advertisers and fixes his maroon-hued eyes on an almost insignificant figure. That entity is Satya Nadella, who has just ruined Sundar’s and Prabhakar’s high school reunion. The fear of answering the question “Hey, how did you guys miss this ChatGPT – Microsoft thing” is terrifying. Googzilla emits a plaintive “welp.” The advertisers back away and start walking toward Redmond. [Fade to black]

The shoe has dropped. Boink or thunk, depending on your perceptual equipment. “Microsoft’s AI-Powered Bing Will Challenge Google Search” reports “Microsoft may finally have figured out how to get you to use Bing.” The article adds a quote directly from the champion of high school reunions in India:

“All computer interaction is going to be mediated with an agent helping,” Chief Executive Satya Nadella said at a launch event at the company’s headquarters in Redmond, Washington. “We’re going to have this notion of a co-pilot that’s going to be there across every application.”

I won’t point out that “all” is a logical impossibility when it comes to humanoids and computer systems. But I will let that slide… because ChatGPT.

Kudos to Microsoft for pulling off the marketing play of the year. I know it is only early February 2023, but it may take something truly special to tap into the brush fires ignited by ChatGPT. Will Bing be better? Maybe? Will the ChatGPT thing frighten the allegedly monopolistic Google? I think it has.

There are several examples that illustrate the disarray of the Google radar system. First, the search beacon missed the incoming ChatGPT balloon. Hello, Prabhakar. Isn’t and maybe wasn’t that your job?

Then there was the startling Code Red. Yeah, that’s professional. OpenAI has been around six or seven years. Now it is Code Red. Situational awareness seems to be lacking where Googzilla hunts. This is a flaw in an apex predator, is it not?

The dollop of whipped cream on this torched cupcake was asking the former head chefs of data hoovering and search engine optimization as a spur to buying advertising to return to the wizard lair. Yep, Sundar asked Sergey and Larry to help out with the Code Red thing. Okay, but let’s recall the origin of the Google money machine. Wasn’t it Yahoo-Overture-GoTo? Yeah, I think it was and there was a legal hassle and a billion dollar settlement to make the GOOG gleam like a sprightly young Googzilla.

The action that cements the frenzy in the House of Google is the steady flow of “it’s coming,” “yes, it’s a demo,” and “okay, we bought an outfit that Sam Bankman Fried found interesting.” The problem is that “to be” does not close the gap with ChatGPT riding its hyper-drive electronic motorcycle on Google’s private motorways.

Several observations:

  1. Will the OpenAI and ChatGPT thing help Microsoft address the security of its existing software and services? When?
  2. Will Microsoft milk the marketing buzz and return to business as usual: Killing printers, interfaces that baffle, and features that disrupt one’s activity on a Windows enabled system?
  3. Will Microsoft have an answer to those who would claim that smart software violates fair use of intellectual property?
  4. Will Microsoft be able to handle bias and outputs which lead to interesting but harmful outcomes like a student getting expelled after mommy and daddy paid $135,000 for tuition?

But for now, the payoff for Microsoft is the thrill of watching Googzilla squirm. And the “all” word? That’s just an illustration of the imprecision of Microsoftie speech.

Stephen E Arnold, February 9, 2023
