Palantir Technologies: A Problem for Intelware Competitors?

September 24, 2020

The Palantir Technologies initial public offering is looming. Pundits are excited; for example, “Palantir Has A Long Uphill Battle Towards Customer Acquisition, But Benefits From Stickiness And Contract Expansion” makes clear that the journey to profitability may be like the Beatles observed: A long and winding road. Others are focused on churn; for example, “5 things to Know about Palantir’s Upcoming IPO.” DarkCyber’s response: “Just five?”

The issue is intelware. Many companies have tried to convert selling to law enforcement, intelligence agencies, and regulators into a billion dollar software and services business. There are some success stories; for example, Booz Allen fits the bill. The company sells time. The company has its own software, not much, but it exists. The company cheerleads, which is a nice way to say that for money “experts” will talk about promising products from the competitive marketplace.

Palantir is more like Autonomy than a blue-chip consulting firm. Autonomy played the “secret black box” chip with its neuro-linguistic programming. It worked until it did not. The firm licensed its black box to BAE Systems in the 1990s. The Autonomy marketing machine then generated revenue slowly and steadily. Then Autonomy acquired companies and cranked up its sales machine. At “peak Autonomy,” the well managed outfit Hewlett Packard grabbed a brass ring with Autonomy engraved on it. The cost was north of $10 billion and years of legal bills. Autonomy was a publicly traded company, and it had a revenue track record dating from 1996. The HP deal was completed in October 2011. That means that the FY2010 data give us an idea about how much revenue a secret black box can generate with “advanced” software, great marketing, and demanding management. The revenue for Autonomy after 15 years was in the neighborhood of $870 million.

7 graph A

One of Palantir Gotham’s innovations: A right mouse click displays a wheel of choices. The interface is definitely jazzier than that of Analyst’s Notebook, now owned by IBM.

Palantir Technologies opened for business in 2003. The company has been in business for 17 years. Yep, that’s two years longer than Autonomy. And what is Palantir’s alleged revenue for the last fiscal year? $742 million. The company’s advantages were the support of Peter Thiel (a Silicon Valley Thor), secrecy, a method for importing ANB files (if you don’t know what this is, well, what can I tell you in a free blog post?), and okay sales and so-so marketing. (One of Palantir’s innovations was a wheel of choices, not Bayesian methods wrapped in mystery.)

If my math is correct, Autonomy generated $128 million more revenue than Palantir. If one uses 2011 dollars, not the Rona roiled 2020 dollars, the difference is more like $400 million, give or take $20 million or so. Yep, Autonomy appears to have outperformed Palantir: Less time, more revenue.
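For readers who want to check the nominal arithmetic, here is a minimal sketch. The revenue figures and year counts are the ones cited in this post; the function name and the "per year in business" ratio are illustrative assumptions, and no inflation adjustment is attempted.

```python
# Revenue figures in millions of US dollars, as cited in this post.
AUTONOMY_REVENUE_M = 870   # approximate FY2010 revenue, after 15 years
PALANTIR_REVENUE_M = 742   # alleged revenue for the last fiscal year, after 17 years
AUTONOMY_YEARS = 15
PALANTIR_YEARS = 17


def revenue_gap(a_m: float, b_m: float) -> float:
    """Return how much larger a_m is than b_m, in millions of nominal dollars."""
    return a_m - b_m


gap_m = revenue_gap(AUTONOMY_REVENUE_M, PALANTIR_REVENUE_M)

# Rough yardstick (an assumption, not a standard metric):
# final-year revenue divided by years in business.
autonomy_per_year = AUTONOMY_REVENUE_M / AUTONOMY_YEARS
palantir_per_year = PALANTIR_REVENUE_M / PALANTIR_YEARS

print(f"Nominal gap: ${gap_m:.0f} million")
print(f"Autonomy: ${autonomy_per_year:.1f} million per year in business")
print(f"Palantir: ${palantir_per_year:.1f} million per year in business")
```

The sketch covers only the nominal gap. The constant-dollar comparison depends on which price index one picks, so it is left out.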

What?

Why?

Who?

How?

Let’s take each question.

First, what? The lackluster performance of Palantir Technologies illustrates the difficulty intelware companies, even ones with great advantages like the aforementioned ANB filter, have making really big money quickly. Remember. To generate less revenue than Autonomy, Palantir required $2.6 billion in funding. DarkCyber thinks that patient investors may be nervous about their investment which could melt away like a real snowflake. You can work out the math. Take 17 years of losses, subtract the revenue generated over 17 years, add in some interest just for spice, and mix into a pressurized container containing the fumes of burning a big cash pile.

Dark Patterns: Is the Future of Free Video Editing Software Duplicity, Carelessness, and Indifference?

August 31, 2020

One of the DarkCyber team suggested a run down of three free video editing software solutions. We had just finished a couple of our for-fee write ups about technology related to warfighting, and I concluded that the group wanted a break from million watt beam weapons.

I said, “Okay, just use a machine we don’t rely on for real work.” Stephanie was thrilled when Ben said he would help. The three “free” software solutions these two set about installing were:

DaVinci Resolve, allegedly “the standard for high end post production and finishing on more Hollywood feature films, television shows and commercials than any other software.” You can get a free copy at this link. (There is a $300 version too.)

HitFilm Express, allegedly “a free video editing software with professional-grade VFX tools and everything you need to make awesome content, films or gaming videos.” You can get a free copy at this link.

Shotcut, a free, open source, cross platform video editor. You can get a copy at this link.

We never got to the review. We were trapped in what sure looks like the FXHome / HitFilm Express dark pattern. It was a swamp populated by creatures dependent on auto reply email, bizarre instructions, and names like “Dibs” and “Joe.” So wholesome, yet so frustrating despite the friendly monikers.

This blog post is about dark patterns, not the video editing software. Sorry, Stephanie (the team member who cooked up the idea for the story.) Read on to find out why DarkCyber cares about a single firm and its enthusiastic pursuit of dark patterns.

The illustration below is a depiction of Dante’s Inferno. About eight layers down is the Dark Pattern of FXHome. That’s better than spending every day, all day with Beelzebub and the gang.

What’s a dark pattern?

The phrase means, according to the ever reliable Wikipedia, “A user interface that has been carefully crafted to trick users into doing things, such as buying insurance with their purchase or signing up for recurring bills.”

Stephanie tried to install the software and was greeted with a Web page presenting her with options to upgrade the free software by purchasing $25 to $50 bundles of macros and pre-sets. Puzzled, she retrieved the details for the accounts we use to purchase software, pay for subscriptions, and buy crap from Amazon.

I ignored her grumbling, but I noticed when two of my engineers were standing behind her staring at the screen and getting that weird look in their eyes when something does not compute. I walked over to the group and said, “When will you finish your reviews of these three tools?”

Stephanie said, “I am running behind. I spent yesterday and today trying to get the software to work. Apparently someone installed a version of HitFilm Express last year, and now FXHome took the money, sent a series of steps, and nothing works.”

I said, “Okay, write the company. Explain what happened and get help to install the software.”

My two engineers nodded and walked away. This, in my experience, meant that the HitFilm Express software was something that presented numerous challenges. Researching and analyzing EMP technology was more appealing than not-so-free software.

I told Stephanie to give me the user name and password she used to buy the software. I happily logged in from a different machine, created a user name and password, saw the same difficult to evade plea to buy add-in packs, and I bought a $39 pack. The video editor came up but no add in software.

Now I was intrigued. Two installations. Almost $80US down a rat hole and no special add in packs. I told my engineers to log in, get the install information, and see if each could get the software to work.

Nope. FXHome has a system to take money. FXHome does not have a functional, reliable system to deliver what the customer purchased.

Now I am thinking cyber fraud. Call me silly, but I am a suspicious person, and when we write about next generation weapons, what type of customers do we have? Certainly not the Vatican or Green Peace.

I found a customer support email which is managed by “smart” software. The email to which I was directed is support@fxhome.com and along the line of a series of email exchanges over the span of nine days a human included his/her name. That individual identified himself/herself as Dibs McCallum.

The dark patterns we believe the user interface implements for the free software include these elements:

  1. Blandishments to purchase upgrades before allowing downloads
  2. Instructions for installing software which do not install software
  3. Customer service interfaces intended to frustrate those seeking information; for example, the FXHome system strips attachments even though people or bots like Dibs McCallum request them and yours truly attaches them. Even more dutifully I resend the attachments and receive zero acknowledgement or information about the failure.

Where am I? Well, definitely there is no review of FXHome. It is tough to write about software which does not function. The upside is that I have an anecdote for my next cyber crime lecture. As we were editing this story, PayPal reported a refund of $39. FXHome still has $39 and we have no functioning software.

When I step back, I see a series of events involving three of my team and the ever helpful Dibs McCallum, who insisted that attachments showing the unhelpful error messages HitFilm Express displayed did not arrive.

Then there was this email:

Allow me to explain. You buy from us. If you want a refund within 14 days you get one.
That is why I have refunded both your order 0000000000000 for $39 that you made by credit card under the email seaky2000@yahoo.com and also your order 0000000000000 for $39 that you made via PayPal that you made under the email 00@arnoldit.com. Both amounts will appear in your prospective credit card and PayPal statements within the next 5-10 working days. Though most likely far sooner. This does mean your software packs will no longer work of course. Those effects will be deactivated and you are left with the free HitFilm Express without the extra content. It is always best to remember what email you use for purchases as it can be confusing if you habitually use more than one email. We are always dealing with this confusion with customers. Very common.
Best Regards, Joe Gould, Business Coordinator

Notice the phrase “We are always dealing with this confusion”.

Yeah, Joe said, “Always.” What’s that old saw about doing the same thing over and over? Was it ground hog day or one of Dante’s circles of Hell?

The dark pattern is apparently accidental. A situation exists which creates an “always” situation. Why not figure out changes to the system to eliminate an “always” problem? Why not think through making the interface work with a customer, not against the customer? Why not skip the “buy more add in packs”? Just charge people money.

What’s free mean? Upsells, confusing purchase options, and a “system” designed to make the craziness of Microsoft customer support for non-installable $0.99 HEVC codecs look like a paragon of lucidity.

One answer is that it earned this write up in Beyond Search and DarkCyber. It has converted sweet Stephanie into a termagant and HitFilm Express hater. (Good work that.)

Observations:

  • Generating sustainable revenue is difficult. If a product is “good,” people will pay for it. If a product is not so good, carelessness, indifference, or laziness generates “buy this, then that” solutions. Helpful? Not so much. Suggestion for FXHome: Less weird orange color and more begging for dollar options like Indiegogo or Patreon, among others?
  • Competing against Adobe, Apple, Magix, and other for-fee video editing programs is difficult. Yes, DarkCyber understands that FXHome needs revenue. Suggestion: Why not sell a subscription to upgrades?
  • Relying on an interface and the people who conceived it may not be a winning tactic. Staff changes and additional inputs may provide the creative spark that moves beyond what sure look like dark patterns. Suggestion: Skip the hear, speak, and see no evil approach to your current upgrade interface. Listen and fix the problem. “Always”. Wow, that’s an endorsement of clear thinking.

Is DarkCyber suspicious? Yep. FXHome could be a YouTube video titled UXMoan.

Stephen E Arnold, August 31, 2020

Google: Human Data Generators

July 29, 2020

DarkCyber spotted this interesting article, which may or may not be true. But it is fascinating. The story is “Google Working on Smart Tattoos That Turn Skin into Living Touchpad.” The write up states:

Google is working on smart tattoos that, when applied to skin, will transform the human body into a living touchpad via embedded sensors. Part of Google Research, the wearable project is called “SkinMarks” that uses rub-on tattoos. The project is an effort to create the next generation of wearable technology devices…

DarkCyber believes that the research project makes it clear that Google is indeed intent on collecting personal data. Where will the tattoo be applied? Forehead in Central America street gang fashion?

image

Russian prisoner style with appropriate Google iconography?

image

A tasteful tramp stamp approach?

image

The possibilities are plentiful if the report is accurate.

Stephen E Arnold, July 29, 2020

Digital Fire hoses: Destructive and Must Be Controlled by Gatekeepers

July 16, 2020

Let’s see how many individualistic thinkers I have offended with my headline. I apologize, but I am thinking about the blast of stories about the most recent Twitter “glitch”: “Apple, Biden, Musk and Other High-Profile Twitter Accounts Hacked in Crypto Scam.”

Are you among the individuals whom I am offending in this essay?

First, we have the individuals who did not believe my observations made in my ASIS Eagleton Lecture 40 years ago. Flows of digital information are destructive. The flows erode structures like societal norms, logical constructs, and organizational systems. Yep, these are things. Unfettered flows of information cut them down, efficiently and steadily. In some cases, the datum can set up something like this:

image

Those nuclear reactions are energetic in some cases.

Second, individuals who want to do any darn thing they want. These individuals form a cohort—either real or virtual—and have at it. I have characterized this behavior in my metaphor of the high school science club. The idea is that anyone “smart” thinks that his or her approach to a problem is an intelligent one. Sufficiently intelligent individuals will recognize the wisdom of the idea and jump aboard. High school science clubs can be a useful metaphor for understanding the cute and orthogonal behavior of some high technology firms. It also describes the behavior of a group of high school students who use social media to poke fun or “frame” a target. Some nation states direct their energies at buttons which will ignite social unrest or create confusion. Thus, successful small science clubs can grow larger and be governed — if that’s the right word — by high school science club management methods. That’s why students at MIT put weird objects on buildings or perform cool pranks. Really cool, right?

Third, individuals who do not want gatekeepers. I use the phrase “adulting” to refer to individuals able to act in an informed, responsible, and ethical manner when deciding what content becomes widely available and what does not. I used to work for an outfit which published newspapers, ran TV stations, and built commercial databases. The company at that time had the “adulting” approach well in hand. Individuals who decry informed human controls. It is time to put thumbs in digital dikes.

Search: Contentious and Increasingly Horrible

May 25, 2020

I dropped enterprise search, commercial search, and vertical search to the bottom of my “Favorite Topics” list years ago.

Why?

The individuals popping up and off at conferences were disconnected from the realities of looking for information under stressful circumstances.

image

Hey, big rocks, how did you move from that quarry kilometers away and get yourselves smoothed down? Just like modern online search systems, you won’t get an answer. Finding information relevant to a query is as difficult as getting megalithic stones to become Chatty Kathies.

The thumb typing crowd, some of whom are now in their mid forties, ASSUME that search has to think for the stupid user.

The techniques include smart software which skews results in ways an experienced researcher finds stupid. For those search experts concerned with making their information or their name appear number one on a results list, good search was anything that produced a top spot in a result list even if that result was stupid, irrelevant, or shameless ego jockeying. Then there are the chipper, super confident experts who emerged from an educational system which awarded those who showed up and sort of behaved a blue ribbon. Yep, everything that group does is just wonderful. Yeah, right.

You can see the consequences of two forces colliding when you read Science Magazine’s “They Redesigned PubMed, a Beloved Website. It Hasn’t Gone Over Well.”

You can work through the examples in the source article. The pain points range from appearance to search functionality.

Why did this happen?

The change is a result of people who do not have the experience of performing search under stressful conditions. No, I don’t mean locating the Cuba Libre restaurant in Washington, DC, on a Google Map. I mean looking up technical information to complete a lab test, perform a diagnosis, locate a procedure, or some similar action. There is a pandemic going on, isn’t there?

The complaints indicate that the “new” PubMed is not perceived as a home run.

Go read the original.

I want to offer several observations:

  1. Those who do research with intent need predictability; that is, when a Boolean query is entered, the results should reflect that logic. Modern systems think Boolean is stupid. There you go, a value judgment from those with “Also Participated” ribbons in high school.
  2. Interfaces should allow the user to select an approach. There are some users who like a blinking dot or a question mark. Enter the commands and get a text output. Others like the Endeca style training wheels, although I doubt if any of the modern “helper” interfaces know what Endeca offered. Others may want some other type of interface like a PhD approach; that is, push here, dummy. The point is: Why not allow the user to select the interface?
  3. Change is introduced for dark purposes. Catalina has many points of friction so that Apple can extend its span of control. Annoying? Sure is. Why doesn’t Apple tell the truth about these friction points? What? Tell the truth, are you crazy? Apple, like Facebook and Google, is doing what it can to protect its hegemony, and the user is the victim. Tough. The same logic applies to PubMed. Dollars to donuts there is a “reason” for the change, and it may be due to whimsy, money, or the need to demonstrate the team is actually doing something instead of just having meetings with contractors.
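What "predictable" Boolean behavior means in the first observation can be shown with a minimal sketch. The three-document corpus and the helper names are invented for illustration, and this is a toy inverted index, not how PubMed or any other production system is built.

```python
from typing import Dict, Set

# Hypothetical three-document corpus, for illustration only.
DOCS: Dict[int, str] = {
    1: "aspirin dosage for adults",
    2: "aspirin overdose treatment in children",
    3: "ibuprofen dosage for children",
}


def build_index(docs: Dict[int, str]) -> Dict[str, Set[int]]:
    """Map each lowercased term to the set of document ids containing it."""
    index: Dict[str, Set[int]] = {}
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index.setdefault(term, set()).add(doc_id)
    return index


INDEX = build_index(DOCS)


def postings(term: str) -> Set[int]:
    """Return the documents containing the term; empty set if the term is unknown."""
    return INDEX.get(term.lower(), set())


# Boolean operators map directly onto set operations, with no guessing.
and_hits = postings("aspirin") & postings("children")   # aspirin AND children
or_hits = postings("aspirin") | postings("ibuprofen")   # aspirin OR ibuprofen
not_hits = postings("dosage") - postings("children")    # dosage NOT children

print(sorted(and_hits), sorted(or_hits), sorted(not_hits))
```

The same query against the same index returns the same set every time. That is the property the researcher doing a lab test or a diagnosis is asking for.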

Net net: As I wrote for Barbara Quint in the now departed magazine Searcher, search is dead. Each day the hope for a better, more appropriate way to locate online information becomes lost in the mists of time. Getting relevant information from PubMed or any modern system is like trying to get the stone of Ollantaytambo to explain how the rocks moved eons ago.

Finding information today is more difficult than at any other time in my professional career. That’s a big problem.

Stephen E Arnold, May 24, 2020

The Bulldozer and Bray: Amazon and Its People Policies in Action

May 4, 2020

I read “Bye, Amazon.” The author is Tim Bray. Some may remember him as one of the spark plugs of Open Text. He did some nifty visualization work. He did the Google thing until 2014. From 2014 until a couple of days ago he worked at Amazon, the Bezos bulldozer, the online bookstore, and all-around economic engine of Covid America.

The write up states:

I quit in dismay at Amazon firing whistleblowers who were making noise about warehouse employees frightened of Covid-19.

When Amazon terminated with prejudice the protesting Amazonians, Mr. Bray’s reaction was

Snap!

Mr. Bray was upset, went through Amazon channels, and resigned.

He states about the warehouse worker action:

It’s not just workers who are upset. Here are Attorneys-general from 14 states speaking out. Here’s the New York State Attorney-general with more detailed complaints. Here’s Amazon losing in French courts, twice.

On the other hand, he points out:

Amazon Web Services (the “Cloud Computing” arm of the company), where I worked, is a different story. It treats its workers humanely, strives for work/life balance, struggles to move the diversity needle (and mostly fails, but so does everyone else), and is by and large an ethical organization. I genuinely admire its leadership.

In his penultimate paragraph he offers:

At the end of the day, it’s all about power balances. The warehouse workers are weak and getting weaker, what with mass unemployment and (in the US) job-linked health insurance. So they’re gonna get treated like crap, because capitalism. Any plausible solution has to start with increasing their collective strength.

Several observations:

  • Mr. Bray has a moral compass. DarkCyber finds that of value.
  • Amazon’s “power” has been largely unchecked since the mid 1990s, and only now are actions building like storm clouds on the horizon.
  • Mr. Bray was able to continue working for the Google but he could not continue working at Amazon. That’s interesting in itself.

Net net: Will Amazon take steps to deal with what seems to be the Tim Bray situation? Do Prime customers get orders delivered on time? Not if warehouse employees put sand in the Bezos bulldozer’s differential.

Stephen E Arnold, May 4, 2020

Google: The Laser That Threatened James Bond Creeps Closer to the Private Parts of the GOOG

April 23, 2020

Update: I omitted the link to the actual Googler blog post. Too excited thinking about “integrity.” My bad.

Goldfinger was an interesting film. In 1965, lasers were advanced. Some thought they were death rays. The Hollywood people, sunning around the pool with Technicolor drinks, thought the laser was the ideal way to burn James Bond’s private parts. Goldfinger was the bad actor. Now Google’s integrity weapon may be threatening Alphabet’s private parts. Odd job indeed.

image

The laser posed a risk to the fictional James Bond’s private parts. The Google integrity verification is a similar risk with one difference: Googlers are steering the destructive beam of actual data toward Alphabet’s secret places.

Flash forward to 2020, “Google to Require All Advertisers to Pass Identity Verification Process.” The word “all” is probably not warranted, but it sounds good. Talking heads enjoy glittering generalities and categorical affirmatives.

Nevertheless, the news story, if accurate, reveals some interesting quasi-factoids. Here’s one example:

Google began requiring political advertisers wanting to run election ads on its platform to verify their identity back in 2018. Now, that program is being extended to all advertisers, the company wrote in a blog post this morning from John Canfield, its director of product management for ads integrity.  The change will allow consumers to see who’s running an ad and which country they’re located in when they click “Why this ad?” on a placement.

Advertisers have to “prove” something other than having a mechanism to put funds into a Google advertising account. Second, Google has a job description which includes these words: “Management” and “integrity.” Plus, the information will not help Google. Nope, the winners in knowing who allegedly buys ads are “consumers.”

Google’s integrity person allegedly said:

“This change will make it easier for people to understand who the advertiser is behind the ads they see from Google and help them make more informed decisions when using our advertising Controls,” John Canfield, Google’s director of product management for ads integrity, said in the post. “It will also help support the health of the digital advertising ecosystem by detecting bad actors and limiting their attempts to misrepresent themselves.”

How does one become verified by Google’s integrity people?

Organizations are required to submit personal legal information (like a W9 or IRS document showing the organization’s name, address and employer identification number). An individual from the organization also needs to provide legal identification on the organization’s behalf. Individuals have to show government-issued photo ID like a passport or ID card. Google said it previously had collected basic information about the advertiser but didn’t require documentation to verify.

How effective are Google’s efforts to filter, screen, and verify? We know that human traffickers and others in this line of business have infiltrated videos on YouTube. We know that one can run a query for “Photoshop crakz”:

image

Apparently Google’s system cannot block listings for stolen commercial software. In fact, the listing for this illegal offering was updated three days ago. DarkCyber knows that some legitimate sites’ content has not been updated for longer periods of time. Notice how Google’s smart autocorrect changed “crakz” into “cracked.” Helpful smart software. Why does Google display the result? Why doesn’t Adobe email Google’s search wizards to have these links with illegal intent filtered? One reason may be that Adobe has emailed Google customer support and is, like many others with questions for the Google, waiting for a response from an informed Googler?

Google, Ad Transparency, and Query Relaxation: Should Advertisers Care? Probably

April 20, 2020

You need information about Banjo, a low profile outfit in Utah. Navigate to Google and enter the query Banjo law enforcement. No quotes for this query. Banjo has a Web site, and the phrase law enforcement is reasonably common and specific. (It is what is known as a bound phrase like White House or stock market; that is, the two words go together in US English.)

Here’s what the system displayed to me on April 20, 2020, at 0918 am US Eastern time:

image

The search results are okay. The ads do not match the query or the user’s intent: Law enforcement is not even close to a $1,000 musical instrument in a retail store.

Notice that the first result is to a Salt Lake Tribune article in March 2020 about Banjo’s allegedly “massive surveillance system.” The second result is from the same newspaper which reports a few days later that the Salt Lake City police won’t share data with Banjo. So far so good. Google is delivering timely, relevant results.

But look at the ads. The query Banjo law enforcement displays to a person wanting information about a policeware company the following for fee, pay to be seen ads in front of a buyer with an interest in Banjo:

image

These advertisers are betting money that Google can get them relevant clicks when a person searches for a banjo. Maybe? But when someone searches for the policeware company Banjo, the advertiser is going to be “surprised.” Do advertisers like surprises?

Here are the advertisers whose for fee ads were displayed to a person interested in law enforcement software (policeware), a Google user with a vanishingly low probability of purchasing a stringed instrument whilst researching a specialist software vendor selling almost exclusively to police and screened quangos (quasi non governmental organizations):

  • Banjo Ben Clark
  • Deering Banjo Company
  • Banjo.com (note that our Banjo is Banjo.co)
  • Banjo Studio
  • Instrument Alley
  • Sweetwater
  • Guitar Center

These companies paid for ads as a result of query relaxation. Google’s system does not differentiate the Banjo policeware outfit from the music products.

image

Are there parallels with games in which a person can win money by guessing which cup hides the ball? These games of chance are often confidence operations. In this context confidence means trickery, not trust.

Why? There are url distinctions; that is, Banjo.co versus Banjo.com; there are disambiguation clues in Banjo.co’s Web page; there is the metadata itself with the keyword surveillance a likely index term.
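The difference between relaxed matching and context-aware matching can be shown in a toy sketch. This is not a description of how Google's ad system works. The keyword and context sets are invented for illustration; only the advertiser names and the query come from this post.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Ad:
    advertiser: str
    keywords: Set[str]                               # terms the advertiser bids on (assumed)
    context: Set[str] = field(default_factory=set)   # hypothetical expected neighbor terms


def relaxed_match(query: str, ad: Ad) -> bool:
    """Relaxed: any single bid keyword in the query triggers the ad."""
    return bool(set(query.lower().split()) & ad.keywords)


def strict_match(query: str, ad: Ad) -> bool:
    """Strict: a bid keyword is present and every query term fits the ad's context."""
    terms = set(query.lower().split())
    return bool(terms & ad.keywords) and terms <= (ad.keywords | ad.context)


ADS: List[Ad] = [
    Ad("Deering Banjo Company", {"banjo"}, {"buy", "strings", "bluegrass"}),
    Ad("Guitar Center", {"banjo", "guitar"}, {"buy", "strings", "store"}),
]

QUERY = "Banjo law enforcement"
relaxed_hits = [ad.advertiser for ad in ADS if relaxed_match(QUERY, ad)]
strict_hits = [ad.advertiser for ad in ADS if strict_match(QUERY, ad)]

print("Relaxed:", relaxed_hits)
print("Strict:", strict_hits)
```

Under the relaxed rule both advertisers pay to appear in front of the policeware researcher. Under the strict rule neither does, because "law" and "enforcement" fall outside the musical context.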

WFH WTF: A Reality Check for Newbies

March 30, 2020

On Sunday, my son who provides specialized services to the US government and I were talking about WFH or work from home. WFH is now the principal way many people earn money. My son asked me, “When did you start working from home?” He should have remembered, since he was a much younger version of his present technology consulting self.

The year was 1991 (nearly three decades, 29 years to be exact and I am now 76), and I had just avoided corporate RIFFing after an investment bank purchased the firm at which I served as a reasonably high ranking officer. I pitched a multi year consulting deal with the new owners (money people), and I decided that commuting among my home in Kentucky, the Big Apple, and Plastic Fantastic (Silicon Valley) was not for me.

I figured I had a few years of guaranteed income so I would avoid running out and leasing an office. No one who hires me cares whether they ever see me. I do special work; I don’t go to meetings; I don’t hang out at the squash club or golf course; and I don’t want people around me every day. In Plastic Fantastic, I requested an inside office. The company moved the fax machine, photocopier, and supply cabinet to my outside office with lots of windows. I took the dark, stuffy, and inhospitable inside office. Perfect it was.

image

The seven deadly sins of working from home: [1] Waiting for the phone to ring or email to arrive, [2] eating, [3] laziness, [4] anger, [5]  envy, [6] philandering online or IRL, [7] greed. For the modern world I would add social media, online diversions, and fiddling with gizmos.

Why is this important for the WFH crowd?

The Internet is stuffed with articles like these:

The WFH articles I scanned — reading them was alternately amusing and painful — shared a common thread. None of them told the truth about WFH.

My son suggested, “Why not write up what’s really needed to make WFH pay off?” Okay, Erik, here’s the scoop. (By the way, he has implemented most of these behaviors as his technology consulting business has surged and his entrepreneurial ventures flourished. That’s what’s called “living proof” or it used to be before Plastic Fantastic speech took over discourse.)

Discipline. Discipline. Then Discipline Again

The idea is that one has to establish goals, work routines, and priorities. The effort is entirely mental. For nearly 30 years, I have followed a disciplined routine. I am at my desk (hidden in a dark, damp basement) working on tasks. Yep, seven days a week, 10 hours a day (unless I am sick, on a much loathed business trip, or in a meeting somewhere, not in my home office). Sound like fun? For me, it is, and discipline is not something to talk about in marketing oriented click bait articles. Discipline is what one manifests.

Semantic Sci-Fi: Search Is Great

March 23, 2020

I read “Keyword Search is DEAD; Semantic Search Is Smart.” I assume the folks at Medium consider each article, weigh its value, and then release only the highest value content.

comic

Semantic search is better than any other type of search in the galaxy.

Let’s assume that the write up is correct and keyword search is dead. Further, we shall ignore the syntax of SQL queries, the dependence of policeware and intelware systems on users’ looking for named entities, and overlook the interaction of people using an automobile’s navigation service by saying, “Home.” These are examples of keyword search, and I decided to give a few examples, skipping how keyword search functions in desktop search, chemical structure systems, medical research, and good old, bandwidth trimming YouTube.

Okay, what’s the write up say beyond “keyword search is dead”?

Here are some points I extracted as I worked my way through the write up. I required more than three minutes (the Medium estimate) because my blood pressure was spiking, and I was hyper ventilating.

Factoid 1 from the write up:

If you do semantic search, you can get all information as per your intent.

What’s with this “all”? Content domains, no matter what the clueless believe, are incomplete. There is no “all” when it comes to online information which is indexed.

Factoid 2 from the write up:

semantic search seeks to understand natural language the way a human would.

Yep, natural language queries are possible within certain types of content domains. However, the systems I have worked with and have had an opportunity to use in controlled situations exhibit a number of persistent problems. These start with computational constraints: One system could support four simultaneous users on a corpus of fewer than 100,000 text documents. Others simply output “good enough” results. Not surprisingly when a physician needs an antitoxin to save a child’s life, keywords work better than “good enough” in my experience. NLP has been getting better, but the idea that systems can integrate widely different data which may be incomplete, incorrect, or stale and return a useful output is a big hurdle. So far no one has gotten over it on a consistent, affordable basis. Short cuts to reduce index look ups can be packaged as semantics and NLP but mostly these are clever ways to improve “efficiency.” Understanding sometimes. Precision and recall? Not yet.
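For readers who have not met the two measures, here is a minimal sketch of how precision and recall are usually computed for a single query. The document ids are made up for illustration.

```python
from typing import Iterable, Tuple


def precision_recall(retrieved: Iterable[str], relevant: Iterable[str]) -> Tuple[float, float]:
    """Precision = relevant retrieved / all retrieved; recall = relevant retrieved / all relevant."""
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    hits = retrieved_set & relevant_set
    precision = len(hits) / len(retrieved_set) if retrieved_set else 0.0
    recall = len(hits) / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical result list and relevance judgments for one query.
retrieved_docs = ["d1", "d2", "d3", "d4"]
relevant_docs = ["d2", "d4", "d7"]

p, r = precision_recall(retrieved_docs, relevant_docs)
print(f"precision={p:.2f} recall={r:.2f}")
```

Here half of what came back is relevant, and one of the three relevant documents never surfaced. A "good enough" system can look fine on the first number and still miss the document that matters.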
