eBay and Corrigon: Heading in the Right Direction?

October 14, 2016

I find eBay fascinating. Many things for sale; for example, $3,000 Teddy bears. I wonder what those are.

I read “eBay to Acquire Corrigon Ltd.” Interesting. I learned about Corrigon, an Israel-based image search and analysis outfit, about seven years ago. The company’s technology can “look” at a digital image and recognize objects in the image. Corrigon’s pitch, as I recall it, introduced me to the concept of “dynamic browsing.” I thought most browsing was, by definition, dynamic, but why ask questions which marketers cannot or will not answer? The buzzwords are the intellectual food which gives me Delhi belly.

One application of Corrigon’s technology is to identify objects in a photo and create a link to a shopping site where one can purchase that object. For instance, I am looking at this image:

image

The Corrigon system will, in theory, point me to this type of entry on another Web site:

image

What if I really want the model’s shirt? Well, that may be an issue.

Corrigon has some law enforcement and intelligence applications as well. My hunch is that eBay wants to allow a person to see something, buy something.

The method adds layers and performs image parsing. The method is fine but the approach can add compute cycles. Latency when shopping is a bit of brown bread.

The write up informed me that:

Corrigon’s technology and expertise will contribute to eBay’s efforts with image recognition, classification and image enhancements as part of its structured data initiative. There are three parts to eBay’s structured data initiative: first, collect the data; second, process and enrich the data; and third, create product experiences. Corrigon will support the second and third parts – processing and enriching the data and creating product experiences.

Let’s think about how an eBay user accesses information in the digital flea market now. A person navigates to the site and plugs in keywords. The system then generates a bewildering array of options and some listings. A user then scans and clicks the laundry list of listings. Then the user reads individual listings. Then the user presumably buys the best listing. Heaven help the user who needs to hunt for the link to ask the seller a question. Etc. etc. etc.

eBay’s purchase of Corrigon is going to make eBay into a zippier shopping experience. Well, that’s the theory.

eBay’s challenge is my fave Craigslist and obviously the Bezos beastie. I asked myself, “Perhaps eBay should do some interface work and poke around its core search functionality?”

Stephen E Arnold, October 14, 2016

Definitions of Search to Die For. Maybe With?

October 13, 2016

I read “Search Terminology. Web Search, Enterprise Search, Real Time Search, Semantic Search.” I have included glossaries in some of my books about search. I did not realize that I could pluck out four definitions and present them as a stand alone article. Ah, the wonders of content marketing.

If you want to read the definition with which one can die, either for or with, have at it. May I suggest that you consider these questions prior to your perusing the content marketing write up thing:

Web search

  • What’s the method for password protected sites and encrypted sites which exist under current Web technology?
  • What Web search systems build their own indexes and which send a query to multiple search systems and aggregate the results? Does the approach matter?
  • What is the freshness or staleness of Web indexes? Does it matter that one index may be a few minutes “old” and another index several weeks “old”?

Enterprise search

  • How does an enterprise search system deliver internal content pointers and external content pointers?
  • What is the consequence of an enterprise search user who accesses content which is incomplete or stale?
  • What does the enterprise search system do with third party content such as consultants’ reports which someone in the organization has purchased? Ignore? Re-license? Index the content and worry later?
  • What is the refresh cycle for changed and new content?
  • What is the search function for locating database content or rich media residing on the organization’s systems?

Real time search

  • What is real time? The indexing of content in the millisecond world of Wall Street? Indexing content when machine resources and network bandwidth permit?
  • How does a user determine the latency in the search system because marketers can write “real time” while programmers implement index update options which the search administrator selects?
  • What search system indexes videos in real time? YouTube struggles with 10 minute or longer latency, with some videos requiring hours before the index points to those videos.

Semantic search

  • What is the role of human subject matter experts in semantic search?
  • What is the benefit of human-intermediated systems versus person-machine or automated smart indexing?
  • How does one address concept drift as a system “learns” from its indexing of information?
  • What happens to taxonomies, dictionary lists of entities, and other artifacts of concept indexing?
  • What does a system do when encountering documents, audio, and videos in a language different from the language of the majority of a system’s users?

Get the idea that zippy, brief definitions cannot deliver Gatorade to the college football players studying in the dorm the night before a big game?

Stephen E Arnold, October 13, 2016

Funnelback: October Advertising

October 11, 2016

Interesting note. Funnelback, owned by Squiz, is displaying in line, personalized advertising. Today is October 10, 2016. Funnelback’s ad is:

image

Timely. I think about Valentine’s Day in October. Money well spent?

Stephen E Arnold, October 11, 2016

HonkinNews for October 11, 2016 Now Available

October 10, 2016

The most recent HonkinNews video is now available at this link. Stories include Yahoo’s most recent adventure: A purple light Y-Mart discount of $1 billion on the Verizon purchase offer. Learn how Google Translate handles a Chinese poem about ospreys, not government administration. Included in the seven minute program is information about IBM Watson in the third grade and Bing’s secret to revenue success. These stories and more, like the diffusion of the idea of “good enough” search. Direct from Harrod’s Creek in rural Kentucky… HonkinNews for the week ending October 11, 2016.

Stephen E Arnold, October 10, 2016

Creativity: Implications for Search

October 10, 2016

“Computer Scientists Discover 14 key Components of Creativity” tries to reveal what differentiates the creative person from the average individual. Let’s look at the attributes of creativity:

  1. Active involvement and persistence. Yes, this is two attributes packaged as one.
  2. Dealing with uncertainty
  3. Domain competence
  4. General intellectual ability
  5. Generation of results
  6. Independence and freedom. Another two’fer.
  7. Intention and emotional involvement. The bundling approach seems semi-creative. Maybe a cop out?
  8. Originality
  9. Progression and development. Again!
  10. Social interaction and communication. Obviously a pattern of creating one “new” idea by sticking two things together.
  11. Spontaneity/subconscious processing. Again two to make one.
  12. Thinking and evaluation. Isn’t evaluating a component of thinking?
  13. Value.
  14. Variety, Divergence as well as Experimentation. Now three attributes combine to produce one attribute. Does this overlap with other components?

Reflecting on this list, my hunch is that a bit more creativity in making the attributes clear might be needed. The list is interesting, but it lacks, how shall I phrase it, creativity.

How does this apply to search?

For old school Boolean systems like ip.com, one has to know what one is looking for before the search system is of much help. Thus, the system can only respond to inputs from a human. The more creative the human, the less likely the cut and paste snippets function will be. Other systems with this less than creative approach include other Boolean systems and Lucene.

For modern predictive systems, the creativity shifts from the human to the software. The idea is that the software will look at the user’s history, similar users’ behaviors, GPS coordinates, and other observable information and produce outputs like Alexa or Google mobile search. The human does not have to think. The creativity seems somewhat limited because when one is looking for pizza via a mobile phone, some of the attributes seem less than creative.

Search systems which try to respond to the thoughts and notions of the human user and software delivering results based on rules are elusive.

Creativity may be difficult to generate and deal with. Perhaps that is why the list of 14 attributes includes multiple word descriptions which try to get at a single notion.

Cleverness is not on the list. Why not? I find clever approaches to search more interesting than creative searches. Clever?

Stephen E Arnold, October 10, 2016

More about Good Enough Search

October 10, 2016

I have concluded that finding information is entering a mini Dark Ages. The evidence I have gathered suggests that young folks will speak to their mobile devices to get pizza and information for their PhD research projects. I have a folder of examples of applications of smart software which produces remarkable marketing assertions and black box outputs.

I have added to my collection the write up “Postgres Full Text Search Is Good Enough.” I learned that “good enough” search has these features:

  • Stemming
  • Ranking / Boost Support
  • Multiple languages
  • Fuzzy search for misspelling
  • Accent support

I assume, which is risky, that keywords are part of the basic feature set. But in a world of “good enough”, who knows?

The write up provides code snippets for and details regarding the implementation of Postgres’ search function. The explanation of Postgres’ internal methods may require that you keep some Postgres manuals handy and have a browser pointed at Bing or Google to chase down some of the jargon; for instance:

A tsvector value is a sorted list of distinct lexemes which are words that have been normalized to make different variants of the same word look alike. For example, normalization almost always includes folding upper-case letters to lower-case and often involves removal of suffixes (such as ‘s’, ‘es’ or ‘ing’ in English). This allows searches to find variant forms of the same word without tediously entering all the possible variants.
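The normalization described in that passage can be illustrated with a toy sketch. This is not how Postgres does it internally; Postgres relies on configurable dictionaries and stemmers. The function names `to_lexemes` and `matches` are hypothetical, and the crude suffix list is an assumption lifted from the examples in the quotation (“s”, “es”, “ing”).

```python
import re

# Suffixes taken from the quoted passage; a real stemmer handles far more cases.
SUFFIXES = ("ing", "es", "s")


def to_lexemes(text):
    """Return a sorted list of distinct, crudely normalized words."""
    lexemes = set()
    # Fold to lower case, then pull out runs of letters as candidate words.
    for word in re.findall(r"[a-z]+", text.lower()):
        for suffix in SUFFIXES:
            # Strip at most one suffix, and keep at least three letters of stem.
            if word.endswith(suffix) and len(word) - len(suffix) >= 3:
                word = word[: -len(suffix)]
                break
        lexemes.add(word)
    return sorted(lexemes)


def matches(query, document):
    """True when every normalized query word appears in the document."""
    return set(to_lexemes(query)) <= set(to_lexemes(document))
```

With this sketch, a query for “index” matches a document containing “Indexing” or “indexes,” which is the convenience the quoted passage describes: the searcher does not have to type every variant.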

The section on optimization and indexation provides some useful guidelines. Trouble may result from mismatching one’s data with the types of indices Postgres offers.

If you are using Postgres and interested in “good enough” search, you will find the write up helpful. If you are an entrepreneur and want to tap into an underserved market for a graphical administrative interface for Postgres “good enough” search, you will find that the write up provides a checklist for you to follow.

For me in rural Kentucky, I marvel at the happy acceptance of “good enough” search. Once “good enough” takes hold, where does one find the impetus to deliver outstanding search? Do I look to dtSearch? IBM OmniFind (aka, Watson)? A whizzy cloud service like Amazon’s?

I suppose I can ask Siri or Cortana. Good enough search and good enough answers. Except when the answers are off point or just incorrect. A “C” is good enough for today’s business and technical approaches.

Stephen E Arnold, October 10, 2016

Bing Finally Turned a Profit

October 7, 2016

Bing is the redheaded stepchild of search engines, but according to the Motley Fool the Microsoft owned search engine started to earn a profit during its last fiscal year.  The Motley Fool shares the story in “Bing Became Profitable Last Year.  Can It Keep Up?” Bing’s search advertising generated $5.5 billion in estimated revenue, which is more than what Twitter and Tencent earned.  Into 2016, Bing continues to turn a profit.

Bing’s revenue grew in the last quarter of Microsoft’s fiscal year, and in June 40% of the search revenue came from Windows 10 devices.  The free Windows 10 upgrade ends soon, and with it that source of growth, as Bing will no longer see such a high adoption rate.  Even so, Microsoft will continue to grow Bing, and profit is predicted to continue to rise:

One important factor is that Microsoft outsourced its display advertising business at the beginning of fiscal 2016. That has allowed the company to focus its sales team on its search advertisements, which generally carry higher prices and margins than display ads. That makes the sales team more cost-efficient for Microsoft to run while it collects high-margin revenue from outsourcing its display ads.

This means Microsoft will raise its ad prices and will focus on selling more ads to appear with search results.  Bing will never compete with Google’s massive revenue, but it has proven that it is less of a copycat and more of a stable, profit generating search engine.

Whitney Grace, October 7, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Search: The Dead Cat Bounces

October 6, 2016

I read two articles about the future of search. The first was a series of remarks in a podcast by Christopher Isaac Stone, aka Biz. In a nutshell, one finds information by asking people.

The other write up was “If I Ran Google (Why the Future of Search Will Diverge from Its Present and Past).” The author of this article is a “multi time bestselling author.” The “A” was capitalized.

The two views of the future of search underscore the perception that keyword search is dead. Text is uninteresting. Search systems are bouncing like a dead cat; that is, typing words in a search box and looking for germane information is not where the world of users wants to go. Hence, search is going to change.

image

Left: clay tablet from the 4th millennium BCE. Right: clay tablets from 2016. Not much change, it seems.

Here’s a statement which hints at the future of search. The quote comes from the multi time bestselling “Author”:

A lot of younger people don’t use Google as much as we might expect. They find things on YouTube (an Alphabet company), Snapchat, Facebook, Instagram or the like.

I agree. Pizza, cat videos, and even information about the future of search by many people will be sought and found using something other than the digital equivalent of a library card catalog. Thump. That’s the sound of the dead cat bouncing or hitting the pavement.

The thump means Google, the game changer, is going to have the game changed for itself.

The future is actionable intelligence. Ask a question and get an answer. Then order a pizza or watch a living cat video. Dead cats are not interesting.

Several thoughts:

First, there are numerous ways to look for information needed to answer a question. There are search boxes when one presumably is working on a research paper or maybe an article destined for publication. That is the old fashioned work which requires attention, note taking, and thinking about a topic and how to answer questions for which there is no single journal article or reliable data set. This type of research will not appeal to some people.

Second, there is the convenience of asking others for information. This is a useful type of information collection. Sometimes it works, and other times it forces the questioner to drag himself or herself back to the old fashioned method described in item one above.

Third, there is smart software which looks at behaviors and makes a best guess about what the person needs to know. When I drive to the airport, I want my GPS to show me which parking garage has an open space. No typing and no talking, please. Just the map with the answer.

In each of these broad categories of access — typing keywords, asking via text or voice, or smart software making best guesses — useful information can be located.

Most of the folks with whom I interact are not happy with search, a broad term used to describe a remarkable range of information access systems.

The problem with the future is that it is not going to bounce like the dead cat of the present.

If I have learned one thing in my years in the information access sector it is:

Information access methods do not die. Options become available.

Regardless of the future, some reading is necessary. Some talking to humans is necessary. Some smart software inputs are necessary.

Heck, here in Harrod’s Creek, people still use clay tablets to communicate. The message about the future is that “good enough” information access is more important than old fashioned checkpoints like precision, recall, provenance, and understanding.

Stephen E Arnold, October 6, 2016

Reverse Image Searching Is Easier Than You Think

October 6, 2016

One of the newest forms of search is using actual images.  All search engines from Google to Bing to DuckDuckGo have an image search option, where using keywords you can find an image to your specifications.  It seemed to be a thing of the future to use an actual image to power a search, but it has actually been around for a while.  The only problem was that reverse image searching sucked and returned poor results.

Now the technology has improved, but very few people actually know how to use it.  ZDNet explains how to use this search feature in the article, “Reverse Image Searching Made Easy…”. It explains that Google and TinEye are the best ways to begin a reverse image search. Google has the larger image database, but TinEye has the better photo experts.  TinEye is better because:

TinEye’s results often show a variety of closely related images, because some versions have been edited or adapted. Sometimes you find your searched-for picture is a small part of a larger image, which is very useful: you can switch to searching for the whole thing. TinEye is also good at finding versions of images that haven’t had logos added, which is another step closer to the original.

TinEye does have its disadvantages, such as outdated results which point to images that can no longer be found on the Web.  In some cases Google is the better choice as one can search by usage rights.  Browser extensions for image searching are another option.  Lastly, if you are a Reddit user, Karma Decay is a useful image search tool and users often post comments on the image’s origin.

The future of image searching is now.

Whitney Grace, October 6, 2016

HonkinNews for October 4, 2016 Available

October 4, 2016

This week’s HonkinNews is available at this link. The feature story explores Palantir Technologies’ love-less love relationship with the US Army. Palantir’s approach to keeping its government customers happy is innovative. We also comment about Google’s blurring of cow faces in StreetView. Learn why SearchBlox is giving vendors of expensive, proprietary enterprise search systems cramps in their calves. Microsoft continues to pay users to access the Internet via Edge and use Bing to search for information. How much does the US government spend for operations and maintenance of its systems? The figure is surprising, if not shocking. This and more in HonkinNews for October 4, 2016.

Kenny Toth, October 4, 2016
