Elasticsearch: A Useful Overview

August 1, 2015

Want to shake free of the proprietary search and retrieval systems? I don’t blame you. Irregular and slow bug fixes and licensing handcuffs are two good reasons. Remember: The cost of search is not the licensing fee. The cost is a collection of fees, purchases, and expenses that burden every search system with which I am familiar.

Elasticsearch is the go-to solution at this time, in my opinion. If you want a useful overview of Elasticsearch, check out the Slideshare presentation “Introduction to ElasticSearch.” You may have to “join” LinkedIn / Slideshare to do anything useful, however.

The deck was prepared / delivered in the spring of 2015 by Roy Russo who is affiliated with or is “DevNexus.” The information is jargon free, an approach which the whiz kids at LucidWorks (Really?) may want to imitate. The presentation does contain a couple of buzzwords like NGram, but no MBA speak.

Stephen E Arnold, August 1, 2015

Endeca: Facets of Novelty

August 1, 2015

I am no specialist in the arcane art of legal eagle spotting. I did notice some references to a dust up between an outfit called Speedtrack and licensees of Endeca’s ageing search technology.

The Speedtrack outfit seems to have rights to an invention called “Method for Accessing Computer Files and Data, Using Linked Categories Assigned to Each Data File Record on Entry of the Data File Record.” This is explained brilliantly in US5544360, filed in February 1995.

Here’s a diagram showing how the user can click on categories to locate information. No typing required.

image

Compare this to Endeca’s invention, “Hierarchical Data Driven Navigation System and Method for Information Retrieval.” This is US7062483, filed in 2001. You may find US7035864 and US7325201 interesting as well.

image
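For readers who want to see the click-on-categories idea in concrete terms, here is a minimal sketch. It is a generic illustration of narrowing a record set by selected categories, not the method claimed in either patent. The records, the category names, and the functions `refine` and `remaining_facets` are all hypothetical.

```python
from collections import Counter

# Hypothetical records; each carries the categories assigned when it was entered.
RECORDS = [
    {"title": "Merlot 2012", "categories": {"wine", "red", "france"}},
    {"title": "Chianti 2013", "categories": {"wine", "red", "italy"}},
    {"title": "Chablis 2014", "categories": {"wine", "white", "france"}},
    {"title": "Pilsner", "categories": {"beer", "czech"}},
]


def refine(records, selected):
    """Keep only the records that carry every selected category."""
    return [r for r in records if selected <= r["categories"]]


def remaining_facets(records, selected):
    """Count the categories a user could click next to narrow the results."""
    counts = Counter()
    for record in refine(records, selected):
        counts.update(record["categories"] - selected)
    return counts


# Simulate two clicks: first "wine", then "red". No typing required.
picked = {"wine", "red"}
hits = refine(RECORDS, picked)
print([r["title"] for r in hits])
print(remaining_facets(RECORDS, picked))
```

Each click adds one category to the selected set, and the counts tell the interface which further clicks will still return something.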

“Federal Circuit Reaffirms Kessler Doctrine As A Patent Infringement Defense For Customers” explains that the Speedtrack infringement case pivots on the Kessler doctrine. Here’s the explanation from the JDSupra.com article:

First, unlike res judicata, which is a defense that is personal to the parties in a prior litigation, the Kessler Doctrine “attaches to the [accused] product itself” and precludes a patentee from reasserting the same patent against the same (or “essentially the same”) product in a subsequent action.

The article then noted:

Second, the Federal Circuit ruled that the Kessler doctrine may be raised by customers as well as the product manufacturer or supplier.

What I found fascinating was this infringement related statement attributed to the presiding legal eagle:

Third, the Federal Circuit held that the Kessler doctrine applied to Speedtrack’s claim even though the Endeca software allegedly infringed only when combined with the customer’s own computer hardware.

I recall that Endeca’s faceted navigation burst upon the scene in the late 1990s. Who knew that Jerzy Lewak (co founder of Speedtrack), Slawek Grzechnik, and Jon Matousek seemed to be trying to figure out a way around the problem of keyword search before Endeca?

I wonder if Oracle was surprised too. I have a hunch Speedtrack was.

Stephen E Arnold, August 1, 2015

Bing Is Very Important, I Mean VERY Important

July 31, 2015

The online magazine eWeek published, “What The Bing Search Engine Brings To Microsoft’s Web Strategy” and it explains how Bing spurs a lot of debate:

“Some who don’t like the direction in which Google is going say that Bing is the search engine they prefer, especially since Microsoft has honed Bing’s ability to deliver relevant results. Others, however, look at Bing as one of many products from Microsoft, which is still seen as the “Evil Empire” in some quarters and a search platform that’s incapable of delivering the results that compare favorably with Google. Bing, introduced six years ago in 2009, is still a remarkably controversial product in Microsoft’s lineup. But it’s one that plays an important role in so many of the company’s Internet services.”

Microsoft is ramping up Bing to become a valuable part of its software services. It continues its partnerships with Yahoo and Apple, and it will also power AOL’s web advertising and search. Bing is becoming a more respected search engine, but what does it have to offer?

Bing has many features it is using to entice people to stop using Google. When you search for a person’s name, the results display a bio of the person (only if they are affluent, however). Bing has a loyalty program, seriously, called Bing Rewards: the more you search on Bing, the more points you earn, and the points are redeemable for gift cards, movie rentals, and other items.

Bing is already a big component in Microsoft software, including Windows 10 and Office 365. It serves as the backbone not only for system search but also for searching the entire Internet. Think Apple’s Spotlight, except for Windows. It also supports a bevy of useful applications, and do not forget about Cortana, Microsoft’s answer to Siri.

Bing is very important to Microsoft because of the ad revenue.  It is just a guess, but you can always ask Cortana for the answer.

Whitney Grace, July 31, 2015
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Watson: The PR Blitz Continues

July 28, 2015

I know that IBM is trying to reverse 13 quarters of revenue decline. I know that most of the firm’s business units are struggling to hit their numbers. I know that IBM’s loyal employees are doing their best to belt out the IBM song “Ever Onward” in perfect harmony.

image

If you are not familiar with the lyrics, you can read the words at this link on the IBM Web site, which, unlike the dev ops pages, is still online:

EVER ONWARD — EVER ONWARD!
That’s the spirit that has brought us fame!
We’re big, but bigger we will be
We can’t fail for all can see
That to serve humanity has been our aim!
Our products now are known, in every zone,
Our reputation sparkles like a gem!
We’ve fought our way through — and new
Fields we’re sure to conquer too
For the EVER ONWARD I.B.M.

Goodness, I am tapping my foot just reading the phrase “Our reputation sparkles like a gem!”

And I don’t count the grinches who complain at EndicottAlliance.org like this:

Comment 07/27/15:
Job Title: IT Specialist
Location: Rochester MN
CustAcct: Various
BusUnit: Cloud
Message: I was forced out/bullied out through bad PBC rating/threats of PIP. I left voluntarily a few months back, rather than waiting for the inevitable layoff (since my 2014 rating was a 3, I would have probably been let go with no package). Once I got my appraisal in January, I started looking around and found another job that pays about the same as my band 10 IBM salary – and I am evaluating several other offers as we speak. I truly feel for the victims of yet another round of layoffs. But I don’t quite understand why some find it “shocking” and “unexpected” that IBM gets rid of them. Your CEO has publicly declared that many of you – especially those in the services organizations – are nothing more than “empty calories.” She went on record with those words. What do you expect? Either you organize or you better start looking for something else.

I pay attention to the “3 Lessons IBM’s Watson Can Teach Us about Our Brains’ Biases.” The write up explains:

Cognitive computing is transforming the way we work.


Image Match: Wave Fingerprints and Search

July 28, 2015

Navigate to “Deep Neural Network Can Match Infrared Facial Images to Those Taken Naturally.” The write up explains that an infrared snap of a person’s face can be matched (mapped) to a normal picture of a human’s face. The idea is that there are wave signatures. I find this interesting. The write up states:

To use such a system for correlating infrared images with natural light counterparts, then, would require a large dataset of both types of images of the same people. The duo discovered that such a dataset existed as part of other research being done at the University Notre Dame. After being given access to it, they “taught” their system to pick out natural light images of people based on half of the infrared images in the dataset they were given. The other half was used to test how well the system worked. The results were not perfect, by any means—the system was able to make correct matches 80 percent of the time (which dropped to just 55 percent when it had only one photo to use), but marks a dramatic improvement in the technology.

The approach has a number of search-related applications. Worth monitoring.

Stephen E Arnold, July 28, 2015

Google’s Chauvinistic Job Advertising Delivery

July 28, 2015

I thought we were working to get more women into the tech industry, not fewer. That’s why it was so disappointing to read, “Google Found to Specifically Target Men Over Women When It Comes to High-Paid Job Adverts” at IBTimes. It was a tool dubbed AdFisher, developed by some curious folks at Carnegie Mellon and the International Computer Science Institute, that confirmed the disparity. Knowing that internet-usage tracking determines what ads each of us sees, the researchers wondered whether such “tailored ad experiences” were limiting employment opportunities for half the population. Reporter Alistair Charlton writes:

“AdFisher works by acting as thousands of web users, each taking a carefully chosen route across the internet in such a way that an ad-targeting network like Google Ads will infer certain interests and characteristics from them. The programme then records which adverts are displayed when it later visits a news website that uses Google’s ad network. It can be set to act as a man or woman, then flag any differences in the adverts it is shown.

“Anupam Datta, an associate professor at Carnegie Mellon University, said in the MIT Technology Review: ‘I think our findings suggest that there are parts of the ad ecosystem where kinds of discrimination are beginning to emerge and there is a lack of transparency. This is concerning from a societal standpoint.’”

Indeed it is, good sir. The team has now turned AdFisher’s attention to Microsoft’s Bing; will that search platform prove to be just as chauvinistic? For Google’s part, they say they’re looking into the study’s methodology to “understand its findings.” It remains to be seen what sort of parent the search giant will be; will it simply defend its algorithmic offspring, or demand it mend its ways?

Cynthia Murrell, July 28, 2015


PageRank: Viewed through the Linear Algebra Sunglasses

July 27, 2015

I urge you to read and work through the examples in “The $25,000,000,000,000 Eigenvector: The Linear Algebra behind Google.” The write up was a tour de force when it became available in 2006. It explains Google PageRank’s ability to display “the good stuff.” The idea is that the system and method set forth in the PageRank patent finds the most “relevant” Web pages matching a query.
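If you prefer code to equations, here is a minimal sketch of the textbook power iteration for a PageRank-style score on a toy link graph. The four-page “web,” the function name `pagerank`, and the 0.85 damping factor are assumptions for illustration. This is the classroom version, not Google’s production system or the wrappers layered on top of it.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=1000):
    """Approximate the dominant eigenvector of the link matrix by power iteration.

    links maps each page to the list of pages it links to. Every link
    target must also appear as a key.
    """
    pages = sorted(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from a uniform guess
    for _ in range(max_iter):
        # Every page gets a small baseline share (the "random jump").
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            # A page with no outlinks spreads its score over all pages.
            targets = outlinks if outlinks else pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        delta = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if delta < tol:  # stop once the scores have settled
            break
    return rank


# Hypothetical toy web: A links to B and C, B to C, C to A, D to C.
web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
scores = pagerank(web)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 4))
```

Page C collects links from three of the four pages and lands on top. Page D, which nobody links to, keeps only the baseline share.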

I won’t trouble you with references to some work by Dr. Jon Kleinberg for the CLEVER system. I won’t peel back any of the wrappers which have been layered around the PageRank system after the company went public in 2004.

I will call your attention to a Wall Street Journal article which strikes me as reasonably accurate. The story is “How Google Skewed Search Results.” If you cannot access this document, check out the also acceptable article “Google May Be Hurting Users by Manipulating Search Results, Says Study.”

Fancy math is great in the classroom. Is it possible that good old human intervention works in some situations? And if manipulation does achieve a desired goal, what’s the point of belaboring the fancy math, precedent work, or the subsequent wrapper code?

Stephen E Arnold, July 27, 2015

Instagram’s Search Feature Is A Vast Improvement

July 27, 2015

Instagram apparently knows more about your life than you or your friends.  The new search overhaul comes with new features that reveal more information than you ever expected to get from Instagram. VentureBeat reviews the new search feature and explains how it works: “Hands-On: Instagram’s New Search And Explore Features Are A Massive Improvement.”

Many of the features are self-explanatory, but have improved interactivity and increased the amount of eye candy.

  • Users can Explore Posts, which are random photos from all over Instagram and they can be viewed as a list or thumbnails.
  • The Discover People feature suggests possible people for users to follow. According to the article, it dives deep into your personal social network and suggests people you never thought Instagram knew about.
  • Curated Collections offer content based on pre-selected categories that pull photos from users’ uploads.

Trending tags is another new feature:

“Trending Tags is Instagram’s attempt at gauging the platform’s pulse. If you’ve ever wondered what most people on Instagram are posting about, trending tags has the answer. These seemed very random and oddly insightful.”

Instagram is quickly becoming a more popular social media platform than Facebook and Twitter for some people. Its new search feature makes it more appealing to users and increases information discovery. Rest assured, you will be spending hours on it.

Whitney Grace, July 27, 2015


SharePoint 2013 Enterprise Search Configuration

July 25, 2015

In just 14 easy steps, you too can configure “SharePoint 2013 for a SharePoint 2013” site. Now this is not enterprise search, but when it comes to Microsoft and information access, trivialities just don’t matter.

The screenshots show what options to select. There is no explanation in Step 4 for what to do if you click “Basic Search Center” instead of “Enterprise Search Center.” A real MSFT lover will know the difference between “basic” and “enterprise” for a SharePoint site.

Follow the clicks to Step 9. Note that under the category search one selects “Search Settings”, not “Search and offline availability.” Again the clarity is astounding.

Cut and paste your way to Step 13 where you configure search navigation. Just click “everything” and presumably the URL, the description, and the link will be locked and loaded. And if not? Well, there will be no errors, gentle reader.

The coup de grace is Step 14. Here’s the instruction which is crystal clear:

Just go and check “Use the same results page settings as my parent” is selected from the subsite search site settings.

You are good to go, directly to a consulting firm specializing in installing a third-party search system into your SharePoint solution. Sorry, but that approach usually works. The Fast Search thing from the late 1990s? Not exactly flawless in my experience. Configuration files are still nestled deep in the innards, but the graphical interface may not get you where you need to be.

Stephen E Arnold, July 25, 2015

Yahoo: A Return to Web Search?

July 24, 2015

I have only a hazy recollection of a conversation with Dave Filo, one of the founders of Yahoo. That was a long time ago. Chris Kitze and I had started The Point, which was a curated list of G-rated Web sites. The telephone call was to discuss what we were doing and what Yahoo was doing. We were doing essentially the same thing, which was okay. We aimed at doing the Good Housekeeping Seal of Approval thing with our Top 5% of the Internet. The Yahooligans were creating a general directory of Internet sites. Our approaches were complementary. We sold to Lycos (CMGI) and Yahoo did its Yahoo thing until today.

I thought about the manually assembled Web directory and the look-at-the-listings approach of Yahoo. We had a lousy search engine along with categories for The Point. I never thought of Yahoo as being a Web search engine. That came later when Yahoo experimented, licensed, bought Inktomi, and ended up with a deal to get a Web search thing from Microsoft.

Imagine how the headline “Yahoo Wants to Return to Its Roots as a Search Engine” created some associative dissonance for me. Yahoo was a list. A manually constructed list of links. Yahoo was a directory first. Search came later and, in my opinion, never arrived. The write up states:

Yahoo wants to be a search giant once more.

Even the azure chip consultants are struggling with this Xoogler vision. I highlighted this gem from the ground level of consulting insight:

However, Gartner analyst Mike McGuire tells Quartz he thinks Yahoo’s renewed focus on search is “a bit quixotic,” questioning its ability to execute and capture market share.

Okay. Yahoo is a weird 1990s thing which is, I suppose, the last portal standing. Search is a bridge too far for many companies. Maybe that’s why there are just a couple of Web search engines that get the bulk of the traffic and an information highway with some smaller outfits which the high speed drivers zoom right by. When was the last time you stopped at Qwant.com or Unbubble.eu?

I understand the enthusiasm for writing something, anything, that seems new and fresh. But Yahoo does not have roots in search. Consequently it, like many other companies, has disappointed with its approach to information access. Nevertheless, the article goes its merry way just like Yahoo. Sympathetic harmonics at work.

Stephen E Arnold, July 24, 2015
