Issues with the Zuckbook Smart Software: Imagine That

May 10, 2022

I was neither surprised by nor interested in “Facebook’s New AI System Has a ‘High Propensity’ for Racism and Bias.” The marketing hype encapsulated in PowerPoint decks and weaponized PDF files on Arxiv paints fantastical pictures of today’s marvel-making machine learning systems. Those who have been around smart software and really stupid software for a number of years understand two things: PR and marketing are easier than delivering high-value, high-utility systems, and smart software works best when tailored and tuned to quite specific tasks. Generalized systems are not yet without a few flaws. Addressing these will take time, innovation, and money. Innovation is scarce in many high-technology companies. The time and money factors dictate that “good enough” and “close enough for horseshoes” systems and methods are pushed into products and services. “Good enough” works for search because no one knows what is in the index. Comparative evaluations of search and retrieval are tough when users (addicts) operate within a cloud of unknowing. The “close enough for horseshoes” produces applications which are sort of correct. Perfect for ad matching and suggesting what Facebook pages or Tweets would engage a person interested in tattoos or fad diets.

The cited article explains:

Facebook and its parent company, Meta, recently released a new tool that can be used to quickly develop state-of-the-art AI. But according to the company’s researchers, the system has the same problem as its predecessors: It’s extremely bad at avoiding results that reinforce racist and sexist stereotypes.

My recollection is that the Google has terminated some of its wizards and transformed these professionals into Xooglers in the blink of an eye. Why? Exposing some of the issues that continue to plague smart software.

Those interns, former college professors, and start up engineers rely on techniques used for decades. These are connected together, fed synthetic data, and bolted to an application. The outputs reflect the inherent oddities of the methods; for example, feed the system images spidered from Web sites and the system “learns” what is on the Web sites. Then generalize from the Web site images and produce synthetic data. The whole process zooms along and costs less. The outputs, however, have minimal information about that which is not on a Web site; for example, positive images of a family in a township outside of Cape Town.

The write up states:

Meta researchers write that the model “has a high propensity to generate toxic language and reinforce harmful stereotypes, even when provided with a relatively innocuous prompt.” This means it’s easy to get biased and harmful results even when you’re not trying. The system is also vulnerable to “adversarial prompts,” where small, trivial changes in phrasing can be used to evade the system’s safeguards and produce toxic content.

What’s new? These issues surfaced in the automated content processing in the early versions of the Autonomy Neuro Linguistic Programming approach. The fix was to retrain the system and tune the outputs. Few licensees had the appetite to spend the money needed to perform the retraining and reindexing of the processed content when the search results drifted into weirdness.

Since the mid 1990s, have developers solved this problem?

Nope.

Has the email with this information reached the PR professionals and the art history majors with a minor in graphic design who produce PowerPoints? What about the former college professors and a bunch of interns and recent graduates?

Nope.

What’s this mean? Here’s my view:

  1. Narrow applications of smart software can work and be quite useful; for example, the Preligens system for aircraft identification. Broad applications have to be viewed as demonstrations or works in progress.
  2. The MBA craziness which wants to create world-dominating methods to control markets must be recognized and managed. I know that running wild for 25 years creates some habits which are going to be difficult to break. But change is needed. Craziness is not a viable business model in my opinion.
  3. The over-the-top hyperbole must be identified. This means that PowerPoint presentations should carry a warning label: Science fiction inside. The quasi-scientific papers with loads of authors who work at one firm should carry a disclaimer: Results are going to be difficult to verify.

Without some common sense, the flood of semi-functional smart software will increase. Not good. Why? The impact of erroneous outputs will cause more harm than users of the systems expect. Screwing up content filtering for a political rally is one thing; outputting an incorrect medical action is another.

Stephen E Arnold, May 10, 2022

Do Marketers See You As Special? Nope.

May 9, 2022

I read “Forget Personalisation, It’s Impossible and It Doesn’t Work.” My hunch is that the idea that a zippy modern system would “know” a user, assemble an appropriate info-filter, and display what that individual required has lost traction. I remember Pointcast and Desktop Data which suggested a user could get the information he/she/it/them needed each day. My recollection is that individual information needs in business changed. Fiddling with the filters was a hassle. As a result, the services were novel at first and then became a hassle. Maybe automation via processes tuned to figure out what the user needed would make such services more useful. If memory serves, the results of making these systems work within budget and developer constraints were not very good, and the costs kept increasing. The most recent example is my explanation of how a Google alert is about half right or half wrong when it flags an item I am supposed to need. See this “Cheerleading” article.

The Forget Personalisation write up calls individuation “the worst idea in the marketing industry.” The statement is not exactly a vote of confidence, is it? The article states:

There’s just one little problem with personalisation: it doesn’t make any sense.

I thought marketing types were optimists. I am wrong again.

The article includes some factoids about the accuracy of third party data. These are infobits which allow marketers and investigators to pinpoint behaviors and even identify people. Here’s what the article reports as actual factual:

Spoiler alert: it’s not. Most third-party data is, to put it politely, garbage. In an academic study from MIT and Melbourne Business School, researchers decided to test the accuracy of third-party marketing data. So, how accurate is gender targeting? It’s accurate 42.3% of the time. How accurate is age targeting? It’s accurate between 4% and 44% of the time. And those are the numbers for the leading global data brokers.

I assume that this is a news flash because informed individuals from investigative reporters at the Wall Street Journal to law enforcement administrators assume that data gathered from clicks, apps, and other high value inputs are “accurate.” Well, sometimes yes, but in my experience 50 to 75 percent accuracy is darned good. Lower scores are common. The 95 percent accuracy is doable under certain circumstances.

What’s the fix? Once again marketers have the answer. Keep in mind that many marketers majored in business administration or art history. Just sayin’. Note this solution from the cited article:

Marketers would be much better off investing in ‘performance branding’; in other words, one-size-fits-most creative that speaks to the common category needs of all potential buyers, all the time. This is a much simpler approach that also happens to be supported by the evidence. Reach is, and has always been, the greatest predictor of marketing success.

I think this means TikTok. What do you think?

And the future? Impersonalization. And how does Marketing Week know this? Here’s the source of the insight:

Gartner predicts 80% of marketers will abandon personalisation by 2025.

Yep, Gartner. Wow. Solid indeed.

Net net: Those marketing types are on the beam. What else does not work in marketing? Smart ad matching to a user query?

Stephen E Arnold, May 9, 2022

Palantir Technologies: Following in the Footsteps of Northern Light and Autonomy

May 4, 2022

What market sector is the one least likely to resonate with race car fans? I would suggest that the third party Chinese vendor TopCharm23232 is an unlikely candidate. Another outlier might be PicRights, a fascinating copyright enforcement outfit relying on ageing technology from Israel.

What do you think about search and content processing vendors?

I spotted this ad in the Murdoch-owned Wall Street Journal which resides behind a very proper paywall.

[Image: Palantir advertisement]

The full page ad appeared in my Kentucky edition on May 3, 2022. I was interested when Northern Light, a vendor of search systems relying originally on open source technology shaped by Dr. Marc Krellenstein, sponsored a NASCAR vehicle. I wonder how many NASCAR fans were into Northern Light’s approach to content clustering? Some I suppose.

I also noted Autonomy plc’s sponsorship of an F-1 car and the company’s logo on the uniform of the soccer / football club Tottenham Hotspur. (That’s the club logo with a big chicken balancing on a hummingbird egg.)

How did the sponsorships work out? I am not sure about sales and closing deals, but hanging with the race car drivers and team engineers is allegedly a hoot.

Will Palantir’s technology provide the boost necessary to win the remaining F-1 races? I don’t do predictive analytics so, of course, Palantir is a winner. The stock on May 4, 2022, opened at $10.55. For purposes of comparison, Verint which is a company with some similar technology opened at $54.04. Verint does not do race cars from what I have heard.

Stephen E Arnold, May 4, 2022


NCC April Vendor Contracts: How to Be Slick and Lose Customer Trust

April 28, 2022

I read “Build Vs. Buy: Vendor Contract Shenanigans.” The write up is an excellent reminder of the character traits of MBAs and lawyers; that is, you lose if we provide you with a contract you sign without understanding. The article contains a number of examples of legal behavior which might strike some people as fraud. Oh, well, that is a signed contract, and your firm must comply. I love it when the lawyer tells a contracting officer, “Hey, we are sorry. These are standard terms.” Yep, standard for whom?

Let me highlight three of the methods used to deliver maximum gain for the vendor and discomfort to the customer. Please consult the original write up for the fourth item on the list.

First, the vendor (in this case, the Google) specifies that when the guaranteed level of service fails, the customer must get everyone in the chain to notify one another that the Googley service did not deliver. A failure to complete this notification within 30 days means you forfeit a “service credit.” (I don’t know what a service credit means, but I don’t think it means cash money.)

Second, the vendor collects the money before service begins. If you don’t use what you bought, there is no refund.

Third, sign our deal and our company will use your logo forever.

The MBAs and lawyers involved in deals with these types of clauses have an ideal rationalization: We are just doing our jobs.

Yes, these individuals are. Just following orders. Where have I heard that before?

Stephen E Arnold, April 28, 2022

NCC April Sentiment Like a Humanoid

April 27, 2022

Artificial intelligence algorithms are dumb when it comes to interpreting human emotions. Human emotions are extraordinarily complex, especially when rendered in text or emojis. There is a goldmine of information for organizations to use to their advantage if only sentiment analysis could be perfected. Brandwatch is working on sentiment analysis perfection and discusses its latest endeavors in the blog post: “Interview: The Data Science Behind Brandwatch’s New Sentiment Analysis.”

Brandwatch recently deployed a new sentiment AI model to over one hundred million sources covered in Brandwatch Consumer Research and apps the company powers. The upgrade provides 18% better language accuracy, is multilingual, and adds sentiment analysis to all languages. Sentiment analysis is a key component Brandwatch offers its customers because it aids in assessing brand health, detects potential crises, identifies advocates/detractors, and discovers positive and negative topics associated with the brand.

Colin Sullivan is a Data Science Manager, who heads different Brandwatch projects involving linguistics and computational linguistics. Sullivan explained that Brandwatch wanted to implement a new way of analyzing sentiment, because the company wanted to use new state-of-the-art developments and simplify the process.

The new model uses transfer learning, which is how a human brain works. The model gains a general understanding of a task, then transfers its newly acquired knowledge to a new task. It is an improved model because:

“One of the key advantages of this new approach is that it makes it more robust when dealing with more complex or nuanced language. The new model can see past things like misspellings or slang. Previously, supervised learning models would be restricted to a fixed set of known patterns during training, which did not come close to exhaustively capturing all linguistically plausible ways of expressing a concept. New state-of-the-art models are better able to re-use what it already knows when faced with new or rare patterns. The transfer learning approach means the model will take what it knows to fill in gaps…And it works in almost any language because we are not training for a new language each time. This also means it can handle a wider range of regional dialects and posts where someone switches between languages.”

The new model identifies the sentiment in content with 60-75% accuracy. If that figure holds up, AI could soon understand sarcasm. It would be helpful if it could also detect fake reviews from Karens/Kyles or bots.

Whitney Grace, April 27, 2022

The Patching Play

April 25, 2022

I read “Patching Is Security Industry’s ‘Thoughts and Prayers’: Ex-NSA Man Aitel.” The former leader of ImmunitySec asserts that patching delivers a false sense of security. Other industry experts believe that patching has some value. Both are correct. In my opinion, both are missing an important aspect of patching software and systems to keep bad actors at bay.

What’s my view?

Patching — real or pretend — is a launch pad for marketing. A breach occurs and vendors have an opportunity to explain what steps have been taken to protect the software and services, partners, customers, and in some cases the vendors themselves. Wasn’t it Solar something?

Microsoft explained that bad actors marshaled a team of 1,000 programmers. That’s marketing because the bad actors were in that case countries, not disgruntled 40 year olds in a coffee shop.

The name of the game is cat and mouse. The bad actors find a flaw, exploit it, or sell it. The good actors respond to the issue and issue an alleged patch. The PR machine, which is like Jack Benny’s Maxwell with a transplanted Tesla electric motor, fires up.

Will the wheels fall off? Haven’t they?

Stephen E Arnold, April 25, 2022

Enterprise Search Vendor Buzzword Bonanza!

April 25, 2022

Enterprise search vendors are similar to those two Red Bull-sponsored wizards who wanted to change aircraft—whilst in flight. How did that work out? The pilots survived. That aircraft? Yeah, Liberty, Liberty Mutual as the YouTube ads intone.

Enterprise search vendors want to become something different. Typical repositionings include customer support, which entails typing in a word and scanning for matches, and business intelligence, which often means indexing content, matching words and phrases on a list, and generating alerts. There are other variations which include analyzing content and creating a report which tallies text messages from outraged customers.
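
How little machinery sits behind the “business intelligence” label? Here is a minimal sketch, assuming a watch list of terms and a handful of text messages. The function name `find_alerts`, the sample messages, and the watch list are hypothetical illustrations, not any vendor’s actual method.

```python
import re


def find_alerts(documents, watchlist):
    """Return (doc_id, term) pairs for each watch list term found in a document.

    Hypothetical helper: a stand-in for "matching words and phrases on a list."
    """
    alerts = []
    for doc_id, text in documents.items():
        lowered = text.lower()
        for term in watchlist:
            # Case-insensitive, whole-word match of the term or phrase.
            pattern = r"\b" + re.escape(term.lower()) + r"\b"
            if re.search(pattern, lowered):
                alerts.append((doc_id, term))
    return alerts


# Made-up sample data for illustration only.
documents = {
    "msg-1": "The refund never arrived and I am outraged.",
    "msg-2": "Thanks, the update fixed everything.",
}
watchlist = ["refund", "outraged", "cancel my account"]

alerts = find_alerts(documents, watchlist)
for doc_id, term in alerts:
    print(f"ALERT: '{term}' found in {doc_id}")
```

That is string matching plus a print statement. Real products add connectors, security, and scale, but the core idea may not be much fancier than this.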

Let’s check out reality. “Enterprise search” means finding information. Words and phrases are helpful. Users want these systems to know what is needed and then output it without asking the user to do anything. The challenge becomes assigning a jazzy marketing hook to make enterprise search into something more vital, more compelling, and more zippy.

Navigate to “What Should We Remember?” Bonanza. The diagram is a remarkable array of categories and concepts tailor-made for search marketers. Here’s an example of some of the zingy concepts:

  • Zero-risk bias
  • Social comparison
  • Fundamental attribution
  • Barnum effect — Who? The circus person?

Now mix in natural language processing, semantic analysis, entity extraction, artificial intelligence, and — my fave — predictive analytics.

How quickly will outfits in the enterprise search sector gravitate to these more impactful notions? Desperation is a motivating factor. Maybe weeks or months?

Stephen E Arnold, April 25, 2022

Dinging AMP after Years of Unknowing: Timely Marketing Perhaps?

April 22, 2022

In one of my Google monographs, I included a diagram showing Google as a digital walled garden. The idea is that a Google user would access the Google version of the Internet via Google. I documented this by referencing some Google patents which few read or bothered to match to Google’s vision for the really big new thing: The mobile Internet.

The Google rolled out AMP with some magic PR dust explaining that speed was good. I laughed. Yep, speed is good, but the shaping of content and funneling those data into, through, and out of the Google was way better. If you look at the world through wonky Google PR sparkles, good for you.

I read “Why Brave and DuckDuckGo are cracking down on Google’s AMP.” The key point in the write up is that these steps have been taken seven years after the AMP roll out and more than 15 years after I wrote The Google Legacy, Google Version 2.0, and Google: The Digital Gutenberg. Speedy for sure.

The write up states with the attendant “wow, this is such a bold move” prose:

Brave published a blog post saying it’s releasing a new feature called De-AMP that’ll redirect you to the publisher’s original page, instead of an AMP-based link. The feature is available in Nightly and Beta versions of the browser, and will be enabled by default in the upcoming 1.38 Desktop and Android versions. The firm said it’s working on porting these functions to its iOS browser at the moment. A day later, privacy-focused search engine DuckDuckGo posted on Twitter that its apps and extensions will redirect users to publishers’ non-AMP pages when they click on links in search results.

Translation: Avoid the Google version of the Internet. I could offer some examples of how Google reshapes on the fly certain types of content, but I am confident that you, gentle reader, are familiar with this mechanism, right?

Google does many interesting things. There is the quaint notion of quality and Google’s view of quality. There is the significance of time metadata and Google’s version of time in general and time metadata in particular. And more? You bet. But everyone knows these mechanisms, right? Absolutely because most people I meet tell me they are search experts.

Net net: This strikes me as marketing.

Stephen E Arnold, April 22, 2022

Enterprise Search Vendors: Sure, Some Are Missing But Does Anyone Know or Care?

April 20, 2022

I came across a site called Software Suggest and its article “Coveo Enterprise Search Alternatives.” Wow. What’s a good word for bad info?

The system generated 29 vendors in addition to Coveo. The options were not in alphabetical order or any pattern I could discern. What outfits are on the list? Here are the enterprise search vendors for February 2022, the most recent incarnation of this list. My comments are included in parentheses for each system. By the way, an alternative is picking from two choices. This is more correctly labeled “options.” Just another indication of hippy dippy information about information retrieval.

AddSearch (Web site search which is not enterprise search)

Algolia (a publicly traded search company hiring to reinvent enterprise search just as Fast Search & Transfer did more than a decade ago)

Bonsai.io (another Elasticsearch repackager)

Coveo (no info, just a plea for comments)

C Searcher (from HNsoft in Portugal; desktop search last updated in 2018 according to the firm’s Web site)

CTX Search (the expired certificate does not bode well)

Datafari (maybe open source? chat service has no action since May 2021)

Expertrec Search Engine (an eCommerce solution, not an enterprise search system)

Funnelback (the name is now Squiz. The technology is Australian)

Galaktic (a Web site search solution from Taglr, an eCommerce search service)

IBM Watson (yikes)

Inbenta (A Catalan outfit which shapes its message to suit the purchasing climate)

Indica Enterprise Search (based in the Netherlands but the name points to a cannabis plant)

Intrasearch (open source search repackaged with some spicy AI and other buzzwords)

Lateral (the German company with an office in Tasmania offers an interface similar to that of Babel Street and Geospark Analytics for an organization’s content)

Lookeen (desktop search for “all your data”. All?)

OnBase ECM (this is a tricky one. ISYS Search sold to Lexmark. Lexmark sold to Hyland. Hyland appears to be the proud possessor of ISYS Search and has grafted it to an enterprise content management system)

OpenText (the proud owner of many search systems, including Tuxedo and everyone’s fave BRS Search)

Relevancy Platform (three years ago, Searchspring Relevancy Platform was acquired by Scaleworks which looks like a financial outfit)

Sajari (smart site search for eCommerce)

SearchBox Search (Elasticsearch from the cloud)

Searchify (a replacement for Index Tank. Who?)

SearchUnify (looks like a smart customer support system, a pitch used by Coveo and others in the sector)

Site Search 360 (not an enterprise search solution in my opinion)

SLI Systems (eCommerce search, not enterprise search, but I could be off base here)

Team Search (TransVault searches Azure Tenancy set ups)

Wescale (mobile eCommerce search)

Wizzy (the name is almost as interesting as the original Purple Yogi system and another eCommerce search system)

Wuha (not as good a name as Purple Yogi. A French NLP search outfit)

X1 Search (from Idea Labs, X1 is into eDiscovery and search)

This is quite an incomplete and inconsistent list from Software Suggest. It is obvious that there is considerable confusion about the meaning of “enterprise search.” I thought I provided a useful definition in my book “The Landscape of Enterprise Search,” published by Panda Press a decade ago. The book, like me, is not too popular or well known. As a result, the blundering around in eCommerce search, Web site search, application specific search, and enterprise search is painful. Who cares? No one at Software Suggest I posit.

My hunch is that this is content marketing for Coveo. Just a guess, however.

Stephen E Arnold, April 20, 2022

Microsoft: Twice Cooked PR with Ban Mao?

April 18, 2022

Going green is important. Microsoft is important. Therefore, Microsoft is going green. How’s that logic for you, gentle reader? The editors at Fast Company followed this line of reasoning and enjoyed a sizzling plate of twice cooked PR with ban mao in “Microsoft’s Hottest New Product Is a Wok.” Yep, a wok for the woke maybe?

The write up states:

The wok is part of Microsoft’s brand new all-electric kitchen at its headquarters outside Seattle, where nearly 50,000 employees are based. The company is adding 3 million square feet of offices and facilities, and the entire project is being designed to be powered by a vast geothermal system and produce zero carbon emissions. A big part of getting there was eliminating fossil fuels from its energy portfolio. And one of the biggest users of fossil fuels were the company’s kitchens.

I wonder if Microsoft and Fast Company looked at the Microsoft Azure server farms and calculated what percentage of the energy these installations consumed and then answered this question: How much of the energy consumed is of the going green, whale saving variety?

No.

No surprise. I would like a century egg too. I wonder if Fast Company has ordered some Microsoft ads to accompany the article.

Stephen E Arnold, April 18, 2022
