Palantir Technologies: Following in the Footsteps of Northern Light and Autonomy
May 4, 2022
What market sector is the one least likely to resonate with race car fans? I would suggest that the third party Chinese vendor TopCharm23232 is an unlikely candidate. Another outlier might be PicRights, a fascinating copyright enforcement outfit relying on ageing technology from Israel.
What do you think about search and content processing vendors?
I spotted this ad in the Murdoch-owned Wall Street Journal which resides behind a very proper paywall.
The full page ad appeared in my Kentucky edition on May 3, 2022. I was interested when Northern Light, a vendor of search systems relying originally on open source technology shaped by Dr. Marc Krellenstein, sponsored a NASCAR vehicle. I wonder how many NASCAR fans were into Northern Light’s approach to content clustering? Some I suppose.
I also noted Autonomy plc’s sponsorship of an F-1 car and the company’s logo on the uniform of the soccer / football club Tottenham Hotspur. (That’s the club logo with a big chicken balancing on a hummingbird egg.)
How did the sponsorships work out? I am not sure about sales and closing deals, but hanging with the race car drivers and team engineers is allegedly a hoot.
Will Palantir’s technology provide the boost necessary to win the remaining F-1 races? I don’t do predictive analytics so, of course, Palantir is a winner. The stock on May 4, 2022, opened at $10.55. For purposes of comparison, Verint which is a company with some similar technology opened at $54.04. Verint does not do race cars from what I have heard.
Stephen E Arnold, May 4, 2022
NCC April Vendor Contracts: How to Be Slick and Lose Customer Trust
April 28, 2022
I read “Build Vs. Buy: Vendor Contract Shenanigans.” The write up is an excellent reminder of the character traits of MBAs and lawyers; that is, you lose if we provide you with a contract you sign without understanding. The article contains a number of examples of legal behavior which might strike some people as fraud. Oh, well, that is a signed contract, and your firm must comply. I love it when the lawyer tells a contracting officer, “Hey, we are sorry. These are standard terms.” Yep, standard for whom?
Let me highlight three of the methods used to deliver maximum gain to the vendor and discomfort to the customer. Please, consult the original write up for the fourth item on the list.
First, the vendor (in this case, the Google) specifies that when the guaranteed level of service fails, the customer must get everyone in the chain to notify one another that the Googley service did not deliver. A failure to complete this notification within 30 days means you forfeit a “service credit.” (I don’t know what a service credit means, but I don’t think it means cash money.)
Second, the vendor collects the money before service begins. If you don’t use what you bought, there is no refund.
Third, sign our deal and our company will use your logo forever.
The MBAs and lawyers involved in deals with these types of clauses have an ideal rationalization: We are just doing our jobs.
Yes, these individuals are. Just following orders. Where have I heard that before?
Stephen E Arnold, April 28, 2022
NCC April Sentiment Like a Humanoid
April 27, 2022
Artificial intelligence algorithms are dumb when it comes to interpreting human emotions. Human emotions are extraordinarily complex, especially when rendered in text or emojis. There is a goldmine of information for organizations to use to their advantage if only sentiment analysis could be perfected. Brandwatch is working on sentiment analysis perfection and discusses its latest endeavors in the blog post: “Interview: The Data Science Behind Brandwatch’s New Sentiment Analysis.”
Brandwatch recently deployed a new sentiment AI model to over one hundred million sources covered in Brandwatch Consumer Research and apps the company powers. The upgrade provides 18% better language accuracy, is multilingual, and adds sentiment analysis to all languages. Sentiment analysis is a key component Brandwatch offers its customers, because it aids in assessing brand health, detects potential crises, identifies advocates/detractors, and discovers positive and negative topics associated with the brand.
Colin Sullivan is a Data Science Manager, who heads different Brandwatch projects involving linguistics and computational linguistics. Sullivan explained that Brandwatch wanted to implement a new way of analyzing sentiment, because the company wanted to use new state-of-the-art developments and simplify the process.
The new model uses transfer learning, which is how a human brain works. The model gains a general understanding of a task, then transfers its new knowledge to a new task. It is an improved model because:
“One of the key advantages of this new approach is that it makes it more robust when dealing with more complex or nuanced language. The new model can see past things like misspellings or slang. Previously, supervised learning models would be restricted to a fixed set of known patterns during training, which did not come close to exhaustively capturing all linguistically plausible ways of expressing a concept. New state-of-the-art models are better able to re-use what it already knows when faced with new or rare patterns. The transfer learning approach means the model will take what it knows to fill in gaps…And it works in almost any language because we are not training for a new language each time. This also means it can handle a wider range of regional dialects and posts where someone switches between languages.”
The new model has a 60-75% accuracy rate when identifying the sentiment in content. If that figure holds up, AI could soon understand sarcasm. It would be helpful if the model could also detect fake reviews from Karens/Kyles or bots.
Whitney Grace, April 27, 2022
The Patching Play
April 25, 2022
I read “Patching Is Security Industry’s ‘Thoughts and Prayers’: Ex-NSA Man Aitel.” The former leader of ImmunitySec asserts that patching delivers a false sense of security. Other industry experts believe that patching has some value. Both are correct. In my opinion, both are missing an important aspect of patching software and systems to keep bad actors at bay.
What’s my view?
Patching — real or pretend — is a launch pad for marketing. A breach occurs and vendors have an opportunity to explain what steps have been taken to protect the software and services, partners, customers, and in some cases the vendors themselves. Wasn’t it Solar something?
Microsoft explained that bad actors marshaled a team of 1,000 programmers. That’s marketing because the bad actors were in that case countries, not disgruntled 40-year-olds in a coffee shop.
The name of the game is cat and mouse. The bad actors find a flaw, exploit it, or sell it. The good actors respond to the issue and issue an alleged patch. The PR machine, which is like Jack Benny’s Maxwell with a transplanted Tesla electric motor, fires up.
Will the wheels fall off? Haven’t they?
Stephen E Arnold, April 25, 2022
Enterprise Search Vendor Buzzword Bonanza!
April 25, 2022
Enterprise search vendors are similar to those two Red Bull-sponsored wizards who wanted to change aircraft—whilst in flight. How did that work out? The pilots survived. That aircraft? Yeah, Liberty, Liberty Mutual as the YouTube ads intone.
Enterprise search vendors want to become something different. Typical repositionings include customer support which entails typing in a word and scanning for matches and business intelligence which often means indexing content, matching words and phrases on a list, and generating alerts. There are other variations which include analyzing content and creating a report which tallies text messages from outraged customers.
Let’s check out reality. “Enterprise search” means finding information. Words and phrases are helpful. Users want these systems to know what is needed and then output it without asking the user to do anything. The challenge becomes assigning a jazzy marketing hook to make enterprise search into something more vital, more compelling, and more zippy.
Navigate to “What Should We Remember?” Bonanza. The diagram is a remarkable array of categories and concepts tailor-made for search marketers. Here’s an example of some of the zingy concepts:
- Zero-risk bias
- Social comparison
- Fundamental attribution
- Barnum effect — Who? The circus person?
Now mix in natural language processing, semantic analysis, entity extraction, artificial intelligence, and — my fave — predictive analytics.
How quickly will outfits in the enterprise search sector gravitate to these more impactful notions? Desperation is a motivating factor. Maybe weeks or months?
Stephen E Arnold, April 25, 2022
Dinging AMP after Years of Unknowing: Timely Marketing Perhaps?
April 22, 2022
In one of my Google monographs, I included a diagram showing Google as a digital walled garden. The idea is that a Google user would access the Google version of the Internet via Google. I documented this by referencing some Google patents which few read or bothered to match to Google’s vision for the really big new thing: The mobile Internet.
The Google rolled out AMP with some magic PR dust explaining that speed was good. I laughed. Yep, speed is good, but the shaping of content and funneling those data into, through, and out of the Google was way better. If you look at the world through wonky Google PR sparkles, good for you.
I read “Why Brave and DuckDuckGo are cracking down on Google’s AMP.” The key point in the write up is that these steps have been taken seven years after the AMP roll out and more than 15 years after I wrote The Google Legacy, Google Version 2.0, and Google: The Digital Gutenberg. Speedy for sure.
The write up states with the attendant “wow, this is such a bold move” prose:
Brave published a blog post saying it’s releasing a new feature called De-AMP that’ll redirect you to the publisher’s original page, instead of an AMP-based link. The feature is available in Nightly and Beta versions of the browser, and will be enabled by default in the upcoming 1.38 Desktop and Android versions. The firm said it’s working on porting these functions to its iOS browser at the moment. A day later, privacy-focused search engine DuckDuckGo posted on Twitter that its apps and extensions will redirect users to publishers’ non-AMP pages when they click on links in search results.
Translation: Avoid the Google version of the Internet. I could offer some examples of how Google reshapes on the fly certain types of content, but I am confident that you, gentle reader, are familiar with this mechanism, right?
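For readers who want to see the idea in miniature: an AMP page is supposed to declare the publisher's original address in a link tag with rel="canonical", so a redirect can be built by reading that tag. What follows is a minimal sketch under that assumption. It is not Brave's or DuckDuckGo's code, and the class name, function name, and example URLs are all made up for illustration.

```python
from html.parser import HTMLParser


class CanonicalLinkFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> tag encountered."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Only <link> tags matter, and only the first canonical one is kept.
        if tag != "link" or self.canonical is not None:
            return
        attr_map = dict(attrs)
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr_map.get("href"):
            self.canonical = attr_map["href"]


def non_amp_url(page_html, fallback_url):
    """Return the declared canonical URL, or fallback_url if none is declared."""
    finder = CanonicalLinkFinder()
    finder.feed(page_html)
    return finder.canonical or fallback_url


# Hypothetical AMP page and addresses, for illustration only.
sample = (
    '<html amp><head>'
    '<link rel="canonical" href="https://publisher.example/story">'
    '</head></html>'
)
print(non_amp_url(sample, "https://cdn.example/amp/story"))
```

A real browser feature has to deal with cached AMP URLs, redirect chains, and pages that declare no canonical link; the sketch simply falls back to the address it was given.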
Google does many interesting things? There is the quaint notion of quality and Google’s view of quality. There is the significance of time metadata and Google’s version of time in general and time metadata in particular. And more? You bet. But everyone knows these mechanisms, right? Absolutely because most people I meet tell me they are search experts.
Net net: This strikes me as marketing.
Stephen E Arnold, April 22, 2022
Enterprise Search Vendors: Sure, Some Are Missing But Does Anyone Know or Care?
April 20, 2022
I came across a site called Software Suggest and its article “Coveo Enterprise Search Alternatives.” Wow. What’s a good word for bad info?
The system generated 29 vendors in addition to Coveo. The options were not in alphabetical order or any pattern I could discern. What outfits are on the list? Here are the enterprise search vendors for February 2022, the most recent incarnation of this list. My comments are included in parentheses for each system. By the way, an alternative is picking from two choices. This is more correctly labeled “options.” Just another indication of hippy dippy information about information retrieval.
AddSearch (Web site search which is not enterprise search)
Algolia (a publicly traded search company hiring to reinvent enterprise search just as Fast Search & Transfer did more than a decade ago)
Bonsai.io (another Elasticsearch repackager)
Coveo (no info, just a plea for comments)
C Searcher (from HNsoft in Portugal. Desktop search last updated in 2018 according to the firm’s Web site)
CTX Search (the expired certificate does not bode well)
Datafari (maybe open source? chat service has no action since May 2021)
Expertrec Search Engine (an eCommerce solution, not an enterprise search system)
Funnelback (the name is now Squiz. The technology is Australian)
Galaktic (a Web site search solution from Taglr, an eCommerce search service)
IBM Watson (yikes)
Inbenta (A Catalan outfit which shapes its message to suit the purchasing climate)
Indica Enterprise Search (based in the Netherlands but the name points to a cannabis plant)
Intrasearch (open source search repackaged with some spicy AI and other buzzwords)
Lateral (the German company with an office in Tasmania offers an interface similar to that of Babel Street and Geospark Analytics for an organization’s content)
Lookeen (desktop search for “all your data”. All?)
OnBase ECM (this is a tricky one. ISYS Search sold to Lexmark. Lexmark sold to Hyland. Hyland appears to be the proud possessor of ISYS Search and has grafted it to an enterprise content management system)
OpenText (the proud owner of many search systems, including Tuxedo and everyone’s fave BRS Search)
Relevancy Platform (three years ago, Searchspring Relevancy Platform was acquired by Scaleworks which looks like a financial outfit)
Sajari (smart site search for eCommerce)
SearchBox Search (Elasticsearch from the cloud)
Searchify (a replacement for Index Tank. who?)
SearchUnify (looks like a smart customer support system, a pitch used by Coveo and others in the sector)
Site Search 360 (not an enterprise search solution in my opinion)
SLI Systems (eCommerce search, not enterprise search, but I could be off base here)
Team Search (TransVault searches Azure Tenancy set ups)
Wescale (mobile eCommerce search)
Wizzy (the name is almost as interesting as the original Purple Yogi system and another eCommerce search system)
Wuha (not as good a name as Purple Yogi. A French NLP search outfit)
X1 Search (from Idea Labs, X1 is into eDiscovery and search)
This is quite an incomplete and inconsistent list from Software Suggest. It is obvious that there is considerable confusion about the meaning of “enterprise search.” I thought I provided a useful definition in my book “The Landscape of Enterprise Search,” published by Panda Press a decade ago. The book, like me, is not too popular or well known. As a result, the blundering around in eCommerce search, Web site search, application specific search, and enterprise search is painful. Who cares? No one at Software Suggest I posit.
My hunch is that this is content marketing for Coveo. Just a guess, however.
Stephen E Arnold, April xx, 2022
Microsoft: Twice Cooked PR with Ban Mao?
April 18, 2022
Going green is important. Microsoft is important. Therefore, Microsoft is going green. How’s that logic for you, gentle reader? The editors at Fast Company followed this line of reasoning and enjoyed a sizzling plate of twice cooked PR with ban mao in “Microsoft’s Hottest New Product Is a Wok.” Yep, a wok for the woke maybe?
The write up states:
The wok is part of Microsoft’s brand new all-electric kitchen at its headquarters outside Seattle, where nearly 50,000 employees are based. The company is adding 3 million square feet of offices and facilities, and the entire project is being designed to be powered by a vast geothermal system and produce zero carbon emissions. A big part of getting there was eliminating fossil fuels from its energy portfolio. And one of the biggest users of fossil fuels were the company’s kitchens.
I wonder if Microsoft and Fast Company looked at the Microsoft Azure server farms and calculated what percentage of the energy these installations consumed and then answered this question: How much of the energy consumed is of the going green, whale saving variety?
No.
No surprise. I would like a century egg too. I wonder if Fast Company has ordered some Microsoft ads to accompany the article.
Stephen E Arnold, April 18, 2022
DuckDuckGo and Filtering
April 18, 2022
I read “DuckDuckGo Removes Pirate Websites from Search Results: No More YouTube-dl?” The main thrust of the story is:
The private search engine, DuckDuckGo, has decided to remove pirate websites from its official search results.
DuckDuckGo is a metasearch engine. These are systems which may do some focused original spidering, but may send a user’s query to partner indexes. Then the results are presented to the user (which may be a human or a software robot). Some metasearch systems like Vivisimo invested some intellectual cycles in de-duplicating the results. (A helpful rule of thumb is to assume a 50 to 70 percent overlap in results from one Web search system to another.) IBM bought Vivisimo, and I have to admit that I have no idea what happened to the de-duplicating technology because … IBM.
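The merge-and-de-duplicate step can be sketched in a few lines. This is an illustration under loud assumptions: it is not Vivisimo's method or DuckDuckGo's, the normalization rule (ignore the scheme, a leading "www.", and a trailing slash) is a crude stand-in for real duplicate detection, and every function name and URL below is hypothetical.

```python
from urllib.parse import urlsplit


def normalize(url):
    """Reduce a URL to a rough comparison key (an assumed, simplistic rule)."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    key = host + parts.path.rstrip("/")
    if parts.query:
        key += "?" + parts.query
    return key


def merge_results(*result_lists):
    """Concatenate partner result lists in order, keeping the first copy of each key."""
    seen = set()
    merged = []
    for results in result_lists:
        for url in results:
            key = normalize(url)
            if key not in seen:
                seen.add(key)
                merged.append(url)
    return merged


def overlap_ratio(a, b):
    """Share of the distinct results in list a that also appear in list b."""
    keys_a = {normalize(u) for u in a}
    keys_b = {normalize(u) for u in b}
    if not keys_a:
        return 0.0
    return len(keys_a & keys_b) / len(keys_a)


# Made-up result lists from two hypothetical partner indexes.
index_a = ["https://www.example.com/pizza/", "https://example.org/menu", "http://example.net/a"]
index_b = ["http://example.com/pizza", "https://example.org/menu", "https://example.net/b"]

merged = merge_results(index_a, index_b)
print(len(merged), "unique results from", len(index_a) + len(index_b), "raw results")
print(f"overlap: {overlap_ratio(index_a, index_b):.0%}")
```

In this toy data two of the three results in the first list also turn up in the second, which lands inside the 50 to 70 percent rule of thumb. Production systems also have to catch near-duplicate pages that live at different addresses, which URL matching alone will miss.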
There are more advanced metasearch systems. One example is Silobreaker, a system influenced by some Swedish wizards. The difference between a DuckDuckGo and an industrial strength system, in my opinion, is significant. Web search is an opaque service. Many behind-the-scenes actions take place, and some of the most important are not publicly disclosed in a way that makes sense to a person looking for pizza.
My questions: “Is DuckDuckGo actively filtering?” And, “Why did this take so long?” And, “Is DuckDuckGo virtue signaling after its privacy misstep, or is the company snagged in a content marketing bramble?”
I don’t know. My thoughts are:
- The editorial policies of metasearch systems should be disclosed; that is, we do this and we do that.
- Metasearch systems should disclose that many results are recycled and the provenance, age, and accuracy of the results are unknown to the metasearch provider.
- Metasearch systems should make clear exactly what the benefits of using the metasearch system are and why the providers of some search results are not as beneficial to the user; for example, which result is an ad (explicit or implicit), sponsored, etc.
Will metasearch systems embrace some of these thoughts? Nah. Those who use “free” Web search systems are in a cloud of unknowing.
Stephen E Arnold, April 18, 2022
A Question about Robot Scientist Methods
April 13, 2022
I read “Robot Scientist Eve Finds That Less Than One Third of Scientific Results Are Reproducible.” The write up makes a big deal that Eve (he, her, it, them) examined in a semi automated way 12,000 research papers. From that set 74 were “found” to be super special. Now of the 74, 22 were “found” to be reproducible. I think I am supposed to say, “Wow, that’s amazing.”
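The arithmetic behind the headline is easy to check. The three counts below come from the write up as summarized above; the variable names are mine, and this is only a back-of-the-envelope sketch.

```python
# Counts as reported in the write up summarized above.
papers_screened = 12_000
papers_selected = 74
papers_reproduced = 22

# Share of the selected papers that were reproducible.
reproducible_share = papers_reproduced / papers_selected
# Share of the screened papers that made the selected set at all.
selected_share = papers_selected / papers_screened

print(f"{reproducible_share:.1%} of the selected papers were reproducible")
print(f"{selected_share:.2%} of the screened papers were selected")
```

So “less than one third” is about 29.7 percent of 74 papers, and those 74 are roughly 0.62 percent of the 12,000 screened. The reproducibility figure says nothing about the other 11,926.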
I am not ready to be amazed because one question arose:
Can Eve’s (her, her, it, them) results be replicated? What about papers about Shakespeare, what about high energy physics, and what about SAIL Snorkel papers?
Answers, anyone?
I have zero doubt that peer reviewed, often wild and crazy research results were from one of these categories:
- Statistics 101 filtered through the sampling, analytic, and shaping methods embraced by the researcher or researchers.
- A blend of some real life data with synthetic data generated by a method prized at a prestigious research university.
- A collection of disparate data smoothed until suitable for a senior researcher to output a useful research finding.
Why are data from researchers off the track? I believe the cause is the quest for grants, tenure, pay back to advisors, or just a desire to be famous at a conference attended by people who are into the arcane research field for which the studies are generated.
I want to point out that one third being sort of reproducible is a much better score than the data output from blue chip and mid tier consulting firms about mobile phone usage, cyber crime systems, and the number of computers sold in the last three month period. Much of that information is from the University of the Imagination. My hunch is that quite a few super duper scholars have a degree in marketing or maybe an MBA.
Stephen E Arnold, April 13, 2022