Visual Search Engines Provide Different POV Than the Google

July 15, 2020

Google image search is the standard visual search tool people use. It does not, however, provide the extra kick needed for deeper dives, especially with all the Pinterest results. Tech Funnel explains how visual search engines give businesses an advantage and points out nine great ones in “Popular 9 Visual Search Engines To Know.”

There are many benefits to using visual search. For one, it connects with younger generations because they engage with images when they use social media and apps. They are far more likely to purchase an item through these platforms than through a Web site. Visual search also allows people to connect more emotionally with a brand than standard text does, and it boosts revenue because it will be the next way people search for items, along with voice search.

Popular visual search engines include Pinterest Lens, which allows users to take photos of items and then find, save, or shop for them. Fashion retailers are already using it, so Pinterest users can find the clothing their models wear. Google Lens is similar to Pinterest Lens, except its applications are more diverse. It can be used for translation and for searching for items, places, people, etc.

Amazon Rekognition, Instagram Shopping, Snapchat Camera Search, and eBay, powered by the Cassini search engine, all have visual search engines dedicated to searching and locating items from photos. They each have different aspects, but all perform the same function. Bing appears to be different:

“From the viewpoint of a user, the experience gotten from Bing Visual Search is similar to other various visual search platforms. However, its feature of an extensive developer platform makes it preferable by a lot of developers.

With Bing Visual Search, developers are enabled to instruct the search engine on the particular data people can get from a specific photo. This means that if Bing Visual Search directs an individual to a certain product on your website, the developer has the ability to determine what information should be provided to the visitor.”

CamFind and EasyJet are the most original engines, because they are associated with neither shopping nor Google. CamFind is the first successful mobile visual engine that uses image detection. EasyJet allows people to book flights based on photos, so now you can finally discover where your screen wallpaper is located.

Whitney Grace, July 15, 2020

Search History: Mostly Forgotten and Definitely of Zero Interest to the Smart Software Crowd

July 10, 2020

There’s an interesting, if selective, write up about online information search and retrieval. Navigate to “The Bourne Collection: Online Search Is Older Than You Think.”

An interesting statement appears in the write up:

Founder Roger Summit had been part of Lockheed Missiles and Space Corporation’s mid-1960s Information Sciences Laboratory (1964). He had built his ideas about iterative search—a “dialog” between the user and the computer—into a separate online search division for Lockheed. (This was very different from the “take your best shot” approach of modern search engines, where you generally need to run a new search to refine irrelevant results). Dialog licensed access to leading databases in a variety of fields, which you could search with its powerful tools. While the overall amount of information was far smaller than on the modern web, it was far, far more relevant and better organized.

For the modern online experts, such a quaint, irrelevant, and inefficient concept.

Stephen E Arnold, July 10, 2020

The Myth of Data Federation: Not a New Problem, Not One Easily Solved

July 8, 2020

I read “A Plan to Make Police Data Open Source Started on Reddit.” The main point of this particular article is:

The Police Data Accessibility Project aims to request, download, clean, and standardize public records that right now are overly difficult to find.

Interesting, but I interpreted the Silicon Valley centric write up differently. If you are a marketer of systems which purport to normalize disparate types of data, aggregate them, federate indexes, and make the data accessible, analyzable, retrievable, and bang on dead simple — stop reading now. I don’t want to deal with squeals from vendors about their superior systems.

For the individual reading this sentence, a word of advice. Fasten your seat belt.

Some points to consider when reading the article cited above, listening to a Vimeo “insider” sales pitch, or just doing techno babble with your Spin class pals:

  1. Dealing with disparate data requires time and money as well as NOT ONE but multiple software tools.
  2. Even with a well resourced and technologically adept staff, exceptions require attention. A failure to deal with the stuff in the Exceptions folder can skew the outputs of some Fancy Dan analytic systems. Example: How about that Detroit facial recognition system? Nifty, eh?
  3. The flows of real time data are a big problem — are you ready for this — a challenge to the Facebooks, Googles, and Microsofts of the world. The reason is that the volume of data and CHANGES TO THOSE ALREADY PROCESSED ITEMS OF INFORMATION is a very, very tough problem. No, faster processors, bigger pipes, and zippy SSDs won’t do the job. The trouble lies within, the intradevice and intra software module flow. The fix is to sample, and sampling increases the risk of inaccuracies. Example: Remember Detroit’s facial recognition accuracy. The arrested individual may share some impressions with you.
  4. The baloney about “all” data or “any” type is crazy talk. When one deals with more than 18,000 police forces in the US, outputs from surveillance devices from different vendors, and the geodumps of individuals and their ad tracking beacons — this is going to be mashed up and made usable. Noble idea. There are many noble ideas.
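
Point 2 can be made concrete with a minimal sketch. It assumes two made-up agencies that export incident dates in different formats. The field names, the date formats, and the sample rows are all invented for illustration; none of this comes from the Police Data Accessibility Project.

```python
from datetime import datetime

# Hypothetical formats; real sources will use more, and stranger, ones.
KNOWN_FORMATS = ["%Y-%m-%d", "%m/%d/%Y", "%Y%m%d"]


def normalize_date(raw):
    """Return an ISO 8601 date string, or None if no known format matches."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def normalize_records(records):
    """Split records into cleaned rows and an exceptions list."""
    cleaned, exceptions = [], []
    for record in records:
        iso_date = normalize_date(record.get("date", ""))
        if iso_date is None:
            # Route to a human instead of silently dropping the row.
            exceptions.append(record)
        else:
            cleaned.append({**record, "date": iso_date})
    return cleaned, exceptions


# Invented sample rows from two made-up sources.
sample = [
    {"agency": "A", "date": "2020-07-08"},
    {"agency": "B", "date": "07/08/2020"},
    {"agency": "B", "date": "20200708"},
    {"agency": "A", "date": "sometime in July"},
]

cleaned, exceptions = normalize_records(sample)
print(len(cleaned), "cleaned;", len(exceptions), "in the Exceptions folder")
```

Even in this toy case, one row in four lands in the exceptions list. Ignore that list and every count downstream is off by whatever is sitting in it.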

Why am I taking the time to repeat what anyone with experience in large scale data normalization and analysis knows?

Baloney can be thinly sliced, smeared with gochujang, and served on Delft plates. Know what? Still baloney.

Gobble this:

Still, data is an important piece of understanding what law enforcement looks like in the US now, and what it could look like in the future. And making that information more accessible, and the stories people tell about policing more transparent, is a first step.

But the killer assumption is that the humans involved don’t make errors, systems remain online, and file formats are forever.

That baloney. It really is incredible. Just not what you think.

Stephen E Arnold, July 8, 2020

Search for Shopping: Still Room for Improvement

July 7, 2020

Targeted advertising is not the only way retailers can leverage all that personal data users have been forking over. Retail Times reports, “Findlogic Announces the Launch of AI-Powered Virtual Shopping Assistant, Lisa.” Lisa, huh? I guess Findlogic pays no heed to concerns around “female” virtual assistants. That tangent aside, the AI-powered tool is meant to reduce frustration for online shoppers and, in turn, facilitate more completed sales. Writer Fiona Briggs tells us:

“Lisa returns on-site search based on an individual shopper’s buying intent signals in the context of a broad set of learnt user behaviors. This allows the solution to personalize results for each shopper, delivering more accurate search returns that connect customers to a desired product faster, moving them along the sales funnel and increasing conversion rates. By intelligently applying understanding to on-site search, Lisa helps shoppers better navigate product category or brand searches, which means that, rather than returning hundreds of options, the solution uses skills to refine results to bring shoppers to the exact product they are looking for quicker. Lisa also incorporates machine learning capabilities which allow it to learn and understand a shopper’s preferences and apply them to search, offering up personalized recommendations, which ranks the products the shopper is most likely to choose at the top of the list of results. Lisa also offers up intelligent ways to refine searches for generic keywords, using the application of a skill that then asks the user a set of questions to progress their search based on their individual requirements.”

Findlogic’s UK director emphasizes that companies that put effort into getting shoppers to their websites in the first place are let down by traditional, keyword-based search systems that frustrate some 41% of potential customers. The company is betting this AI that can understand “intent” will change that. Based in Salzburg, Austria, Findlogic was founded in 2008.

Cynthia Murrell, July 7, 2020

Stupid Enterprise Search Promotions

July 6, 2020

Check out these incredibly silly pitches for the same market study about enterprise search:

image

This is an example of search engine optimization gaming the Google Alert system. Ridiculous SEO play and a ridiculous report.

The offending company appears to be:

Advance Market Analytics

Shameful.

Stephen E Arnold, July 6, 2020

Algolia Pricing

July 3, 2020

Years ago I listened to a wizard from Verity explain that a query should cost the user per cell. Now that struck me as a really stupid idea. Data sets were getting larger. The larger the data set, the more cells even extremely well crafted narrow queries would “touch.” In a world of real time queries and stream processing, the result of the per cell model would be more than just interesting; it would be a deal breaker.

Pricing digital anything has been difficult. In the good old days of the late 1970s and early 1980s, one paid in many different ways — within the same system. The best example of this was the AT&T/British Telecom approach to online data.

Here’s what was involved. I am 77 and working from memory:

  1. Installation, set up, or preparation fee. This was dependent on factors such as location, distance from a node, etc.
  2. Base rate; that is, what one paid simply to be connected. This could be an upfront fee or calculated on some measurement which was intentionally almost impossible to audit or verify.
  3. Service required. Today this would be called bandwidth or connect time. The definition was slippery, but it was a way for the telcos of that era to add a fee.

If a connection went to a data center housing data, then other fees would kick in; for example:

  1. Hourly fee billed fractionally for the connect time to the database
  2. Per item fee when extracting data from the database
  3. A “print” or “type” fee which applied to the format of the data extracted
  4. A “report” fee because reports required cost recovery for the pre-coded template, query time, formatting, and outputting.

There were other fees, but the most fascinating one was the “threshold fee.” The idea is that one paid for 60 minutes of connect time. When the 61st minute was required, the threshold was crossed, and the billing could go up, often by factors of 2X or more. No warning, of course. And the mechanism for calculating threshold fees was not disclosed to the normal customer. (After I became a contractor to Bell Communications Research, I learned that the threshold fees were determined based on “outside” or exogenous factors. In Bell Head speak this seemed to mean, “This is where we make even more money.”)
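
To see how these pieces stack up, here is a rough sketch of an invoice calculation. Every rate in it is hypothetical, and so is the assumption that the threshold multiplier applies only to the minutes past the 60 already paid for. The real mechanism, as noted, was not disclosed.

```python
def connect_charge(minutes, hourly_rate, included_minutes=60,
                   threshold_multiplier=2.0):
    """Bill connect time fractionally; minutes past the threshold cost more.

    Assumption: the multiplier applies only to minutes beyond the threshold.
    """
    per_minute = hourly_rate / 60
    base_minutes = min(minutes, included_minutes)
    over_minutes = max(minutes - included_minutes, 0)
    return (base_minutes * per_minute
            + over_minutes * per_minute * threshold_multiplier)


def invoice(minutes, items, hourly_rate=90.0, per_item_fee=0.25,
            report_fee=0.0):
    """Add the hypothetical connect, per item, and report fees together."""
    return (connect_charge(minutes, hourly_rate)
            + items * per_item_fee
            + report_fee)


# Compare a session that stays inside the threshold with one that crosses it.
for minutes in (60, 61, 90):
    print(minutes, "minutes:", round(invoice(minutes, items=40), 2))
```

The per minute price doubles the moment the 61st minute starts, and nothing on the customer's screen says so. If the multiplier had instead applied to the whole session, the jump would be far larger.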

To sum up, online pricing was a remarkable swamp. Little wonder that outsiders would be baffled at the online invoices generated by the online providers. Exciting, yes. Happy customers, nah. No one at the AT&T/British Telecom type outfits cared about non Bell Heads. No Young Pioneer T shirt? Ho, ho, ho. Pay your bill or we kill your account. Ho ho ho.

Algolia announced a new pricing plan. You can read about it here. The idea is to reduce confusion and be more “customer friendly.” What’s interesting to me is the string of comments on the Hacker News site. You can read these comments at this link.

There’s some back and forth with Algolia participating.

Some of the comments underscore the type of “surprise” that certain types of pricing models spark; for example, from alooPotato:

We (Streak) are in the same boat. Looks like we’d be paying approx half a million dollars a month on their new pricing which would be ~100x more than we are paying now. Haven’t heard from our enterprise rep but starting to get nervous… Sounds like the new pricing is for their ecommerce customers given how much value they provide them, doesn’t seem to make sense anymore for SaaS use cases.

ysavir takes a balanced view; that is, some good, some bad:

Not the GP, but I figure their point is as follows: If I’m running an e-commerce website, I don’t mind pay-per-search since those searches may turn into sales, so the cost is justified. My income scales with search count, and the Algolia price is part of user acquisition costs. If I’m running a SaaS business, the search is a feature for customers who have already paid, so I don’t see any further returns from the search being used. The more a client uses search, the less I’m profiting from having them as a client. They could potentially even cost me money to service them!
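
The argument from ysavir can be put in numbers. This is a sketch under invented assumptions: the per search price, the conversion rate, the margin per order, and the flat subscription fee are all hypothetical and are not Algolia's actual rates.

```python
PRICE_PER_1000_SEARCHES = 1.00  # hypothetical, not Algolia's actual rate


def search_cost(searches, price_per_1000=PRICE_PER_1000_SEARCHES):
    """Pay-per-search bill for one month."""
    return searches / 1000 * price_per_1000


def ecommerce_margin(searches, conversion_rate=0.02, margin_per_order=5.00):
    """Income scales with search count, so the search bill scales with it."""
    income = searches * conversion_rate * margin_per_order
    return income - search_cost(searches)


def saas_margin(searches, flat_subscription=50.00):
    """The customer has already paid; every extra search only adds cost."""
    return flat_subscription - search_cost(searches)


for searches in (10_000, 100_000):
    print(searches,
          round(ecommerce_margin(searches), 2),
          round(saas_margin(searches), 2))
```

With these made-up figures, the e-commerce margin grows along with search volume, while the flat fee customer goes from profitable to money losing as usage climbs. Same price list, opposite outcomes.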

The point is that any pricing model — whether the AT&T/British Telecom type pricing “simplification” or a made-up, wacko approach like the IBM J1, J2, J3, etc. approach — is not going to meet the requirements of every customer.

The modern approach to pricing is to obfuscate and generate opaque variable prices. You can see this model in action by navigating to Amazon and running a query for “mens golf shirt” and then zipping over to AWS and checking out the prices for Sagemaker models to drive Athena. Got the difference, gentle reader?

The nifty world of enterprise search has been a wonderland of pricing methods. I flipped through the pricing data files for the three editions of the Enterprise Search Report which I began writing in 2002. Here are some highlights:

  • Base fee plus engineering services. Upgrades priced individually.
  • Base fee plus fixed price over a period of time.
  • Variable elements like the crazy “per cell” idea from the guy who is now the head of Google Search (Oh, yeah!)
  • Free if the customer (the US government) licensed other software
  • One time charge. Upgrades are easy. Buy another license.
  • Free. The vendor is in the business of selling engineering support, training, and custom widgets to make the search system sort of work.
  • Whatever can be billed. This is extremely popular because the negotiation process reveals the allocated funds and the search system vendor angles to get as much of the allocated cash as humanly possible.
  • Free for the first budget cycle. Then when funds become available, prices are negotiated.
  • Custom quote only. NDA required.

Today, life is easier. One can download a free and open source search system, hit the local university for some “interns”, and let ‘er rip. Another alternative is to look for a hosted search service. Blossom.com maybe?

Net net: Pricing has one goal: Generate revenue and lock in for the vendor. That’s one reason why vendors of what I call search centric services are so darned lovable.

Stephen E Arnold, July 3, 2020

Neeva: To the Rescue?

July 2, 2020

After the 2017 scandal involving YouTube ads, Google’s head of advertising left the company. However, Sridhar Ramaswamy was not finished with search; he promised then to find another way that did not depend on ads. Now we learn subscription service Neeva is that promised approach from Ars Technica’s article, “Search Engine Startup Asks Users to Be the Customer, not the Product.” Not only does paying to search through Neeva allow one to avoid ads, the platform vows to respect user privacy, as well.

There are just a couple of fundamental problems. First, will enough users actually pay to search when they are used to Googling for free? Critics suspect most users will opt to accept ads over paying a fee. As for the privacy promise, we already have (ad-supported) privacy-centric search platforms DuckDuckGo and Startpage. Besides, though Neeva’s “Digital Bill of Rights” that dominates the company’s About page sounds nice, the official Privacy Policy linked in the site’s footers prompts doubt. Reporter Jim Salter writes:

“Neeva opens that section by saying it does not share, disclose, or sell your personal information with third parties ‘outside of the necessary cases below’—but those necessary cases include ‘Affiliates,’ with the very brusque statement that Neeva ‘may share personal information with our affiliated companies.’ Although the subsections on both Service Providers and Advertising Partners are hedged with usage limitations, there are no such limits given for data shared with ‘Affiliates.’ The document also provides no concrete definition of who the term ‘Affiliates’ might refer to, or in what context.

We noted:

“More security-conscious users should also be aware of Neeva’s Data Retention policy, which simply states ‘we store the personal information we receive as described in this Privacy Policy for as long as you use our Services or as necessary to fulfill the purposes for which it was collected… [including pursuit of] legitimate business purposes.’ Given that the data collection may include direct connection to a user’s primary Google or Microsoft email account, this might amount to a truly unsettling volume of personal data—data that is now vulnerable to compromise of Neeva’s services, as well as use or sale (particularly in the case of acquisition or merger) by Neeva itself.”

Neeva is currently in beta testing, but anyone still interested can sign up to be an early tester on the waitlist at the bottom of this blog post. Though Neeva has yet to set a price for its subscription, we’re told it should be under $10 per month.

Cynthia Murrell, July 2, 2020

Google and Winston: Confusing Relationship for Sure

July 2, 2020

Computer glitches happen, even at large companies like Google. The timing of this one, though, looks a little suspicious. The Belfast Telegraph reports, “Google Says Churchill Image Missing Because of Bug in System.” The problem occurred just as the former prime minister’s statue was being walled away to protect it from protesters. Writer Martyn Landi explains:

“Winston Churchill’s image briefly disappeared from Google search results because it was being updated to be more representative of the former prime minister, the tech giant has said. However, that update had been delayed by a bug in Google’s system, the firm said in a statement. It comes after some users complained that Churchill’s image was not appearing in search results for UK prime ministers, although his name was still listed. Culture Secretary Oliver Dowden was among those to express ‘concern’ and said he had spoken to the tech giant over the incident, which occurred during the ongoing debate about Churchill’s statue in Parliament Square, which was boarded up last week.”

Despite the timing, Google insists the snafu had nothing to do with the statue, protestors, or the former prime minister’s alleged racism. The search platform’s Knowledge Graph had been pulling a picture of Churchill from his younger days, which is not the iconic image most of us are familiar with. Googley humans blocked that image from the algorithm, forcing it to choose another one. Between those steps, however, the mysterious “bug” halted the update. Users searching for Churchill received only portrait-free text descriptions. The company stated:

“As a result, Churchill’s entry lacked an image from late April until this weekend, when the issue was brought to our attention and resolved soon after. We apologize again for concerns caused by this issue with Sir Winston Churchill’s Knowledge Graph image. We will be working to address the underlying cause to avoid this type of issue in the future.”

Just what this bug entailed is not revealed. Sounds like a “dog ate my homework” response to us.

Cynthia Murrell, July 2, 2020

Search for the Young in Heart and Mind

June 25, 2020

An expert in search at an outfit called Blue Fountain Media understands search and retrieval. The insights about matching a user query with relevant content almost caused me to weep. See and experience the impact of high quality information yourself by navigating to “Using AI-Powered Visual Search to Enhance CX.” Note: I had to click past warnings about possible malware, complete two captchas about vehicles, and put up with a weird mobile oriented output in order to access this magnificent research paper.

Was the effort worth the intellectual reward? Sure, like a Twinkie for a starving college student addled by a long weekend of social distancing in Florida.

The main point of the write up is:

The old saying “a picture is worth a thousand words” takes on all new relevance when we consider the next big feature coming over the ecommerce horizon: AI-powered visual search.

That’s insightful. Like TV, right?

The write up states that AI has some other benefits:

  • The missing agent has been artificial intelligence which has become increasingly more adept at image recognition.
  • Beyond image recognition, AI makes it possible to learn a person’s taste and style preferences.

And there’s a stunning revelation about “most shoppers”. Sure, this is a generalization, but what search expert wants to deal with verifiable data:

Let’s face it, most shoppers know what they want – when they see it – but find themselves completely dumbfounded when it comes to describing what they’re after using the traditional text-driven search engine.

What about a student wanting a copy of Jacques Ellul’s Technological Bluff? Would a visual search be feasible on a mobile device while social distancing on a Zoom video call?

Of course, of course.

Plus, the search expert reveals some next steps. How does this match up with the wonks trying to improve the precision and recall for stress analysis engineers dealing with the failure of an emergency core cooling system valve?

The key, of course, is to mark up these images appropriately. Ironically, yes, this will still include keywords. But if you follow Googles [sic] best practices for images, you’ll find values such as placement and metadata are increasingly more important than simple text descriptors.

A diagram is okay, but the issue is calculating engineering stresses with a dash of math amongst the photos.

Amazing what one learns about search and retrieval when one consults true experts. And those engineers struggling with the analytic job? Check out Google Images. It’s a best practice, particularly when the valve is part of the reactor on a nuclear submarine. There’s nothing quite like Google metadata. Are search engine optimization professionals who are also search experts up to the task? Sure.

Stephen E Arnold, June 25, 2020

Crazy Enterprise Search Report: Content Marketing Spam Gets Religious

June 23, 2020

DarkCyber noted this content marketing spam dutifully recycled by Jewish Market Reports:

Enterprise Search Software Market Emerging Industry Trends and Dynamics with Prominent Players as Algolia, Amazon, Coveo Solutions, Elasticsearch, IBM, iManage, Lucidworks, Microsoft, SearchUnify, Swiftype

And the author. Maybe a nice Jewish fellow named Sameer Joshi or maybe just a pseudonym?

The story recycles a bit of fluff from the Goodwill of off base data. Goodwill accepts almost any product; the data off shoot is okay with crazed generalizations of mostly off base numbers. That Excel projection function is a darned useful thing too.

The write up covers the 360 view of the market. What’s interesting about these recycled and spam centric reports is not their cost. Think thousands. The fascinating bit is the list of companies fueling the rocket ship of enterprise search in the Rona Era; specifically:

  • Algolia
  • com, Inc. [sic]
  • Coveo Solutions Inc.
  • Elasticsearch B.V.
  • IBM Corporation
  • iManage LLC
  • Lucidworks, Inc.
  • Microsoft Corporation
  • SearchUnify (Grazitti Interactive Inc.)
  • Swiftype, Inc.

A couple of observations. The list is alphabetized, a useful operation. But the nifty parts are com, Inc. [sic] and Grazitti. To be blunt, neither outfit is in the DarkCyber/Beyond Search files.

For a nice Jewish boy or maybe not, the list of leaders makes sense. Where was his grandmother when the author demonstrated an inability to determine what was wheat and what was chaff?

Definitely not paying attention because she was working on an earlier version of the document offered by her company, The Insight Partners. More time with Sameer Joshi, her grandson, would have been well spent I surmise. But the publication? Hmmm.

Stephen E Arnold, June 23, 2020
