More to Search than Relevancy, Accuracy, Precision and Recall?

October 25, 2013

There is a good chance that we may hear someone cry heresy after reading the Moving Fulcrum post: “The Growing Irrelevance of Google Search.” The author presents his case as a focus group of one who no longer uses Google as his primary search engine. Instead, the author finds information using the following sources: Twitter, Stack Overflow, Wikipedia and Yelp.

The author admits that his searches are focused around himself as opposed to Web sites of information. Whether or not this has always been the case, many media are now available to address the needs of every individual using the Internet.

The post states:

Google excels at searching for the long tail of information. That was true a few years ago, when an individual’s opinion could only be expressed on either a blog post or a forum post, which Google could index/rank like nobody’s business.

But in a world with Twitter, and the silos of information that are now sites like Stack Overflow and Wikipedia, Google Search is becoming more and more irrelevant.

While Google is all about relevance, accuracy, precision and recall, we have to ask the question “is that what people want?” For example, the recent New York Times article “It’s Not Just Political Districts. Our News Is Gerrymandered, too” suggests people might not want to search or see much more than their own reflection.
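For readers who want the two metrics in the headline pinned down, here is a minimal sketch of precision and recall for a single query. The document IDs and the function name are made up for illustration; nothing here comes from the Moving Fulcrum post or from Google.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant results returned / all results returned
    recall    = relevant results returned / all relevant documents
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: the engine returns four documents, three of which
# are relevant, while six relevant documents exist in the collection.
p, r = precision_recall(
    ["d1", "d2", "d3", "d4"],
    ["d1", "d2", "d3", "d7", "d8", "d9"],
)
print(p, r)  # 0.75 0.5
```

A system can score well on both numbers and still miss what the searcher in the post seems to want, which is the point of the question above.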

Megan Feil, October 25, 2013

Sponsored by ArnoldIT.com, developer of Beyond Search

Personal Search Engine Enters the Scene

October 25, 2013

So you read an article and you know you will want to explore that topic further in the future. What do you do? Bookmark it. Moments later, you find yourself on a great new social analytics search engine. That gets bookmarked too – in addition to many other Web sites and articles that you might want to remember to revisit one day. When that day arrives, it can often be troublesome to locate the specific link you wanted to find amongst all the others you have bookmarked. This is why services like PSE (Personal Search Engine) are incredibly useful. We recently read a helpful write-up on PSE in particular: “PSE Is A Personal Search Engine, Makes Browser Bookmarks Useful Again.”

PSE does not require a download. It is a bookmarklet and it works in any browser except for Internet Explorer. It appears that mobile usage is a potential future update that the developers are exploring.

The article tells us:

“The service is great for articles, but it’s especially good for research or other snippets of information that you know you’ll need later not based on the page title or where you found it, but the actual content of the page you were reading. If you stumble on a site with a great recipe, for example, highlight the recipe and add it to your database. Then later you can search for ‘garlic’ and find it, instead of trying to remember that the recipe was on ‘easycheapweeknightdinners.com.’”
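The idea in the quote, searching by what the saved passage says rather than by page title, can be sketched with a tiny inverted index. This is an illustration only, not how PSE is built; the `SnippetStore` class and its methods are hypothetical names, and the recipe text is invented.

```python
import re
from collections import defaultdict


class SnippetStore:
    """Toy store that indexes the words of each saved snippet."""

    def __init__(self):
        self._snippets = []             # list of (url, text)
        self._index = defaultdict(set)  # word -> set of snippet ids

    @staticmethod
    def _words(text):
        return re.findall(r"[a-z0-9]+", text.lower())

    def add(self, url, text):
        snippet_id = len(self._snippets)
        self._snippets.append((url, text))
        for word in self._words(text):
            self._index[word].add(snippet_id)

    def search(self, query):
        """Return URLs of snippets that contain every query word."""
        words = self._words(query)
        if not words:
            return []
        ids = set.intersection(*(self._index.get(w, set()) for w in words))
        return [self._snippets[i][0] for i in sorted(ids)]


store = SnippetStore()
store.add("easycheapweeknightdinners.com", "Roast chicken with garlic and lemon")
store.add("example.org/notes", "Notes on social analytics search engines")
print(store.search("garlic"))  # ['easycheapweeknightdinners.com']
```

The page title and the domain never enter the index, so the searcher does not need to remember either one.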

While this service definitely seems like a step in the right direction towards supplementing our invariably fallible memories, it may not be the be-all, end-all of search. What about a personal search engine that truly allows a user to search every folder, email, bookmark across multiple applications, devices and more?

Megan Feil, October 25, 2013


Search Vendor Management: Frazzled and Scared

October 24, 2013

Generalizations are terrible. Generalizations can be useful. I read “Why Being a Thinker Means Pocketing Your Smartphone.” The story appeared on the CNN Web site. I find this amusing, since CNN is associated in my mind with content delivery for those with some sort of dependence on TV filtered information. The key point in the write up struck me as:

“You can’t make headway without thinking about a problem for a long time, in collaboration with smart researchers from different fields, as well as reading a lot,” says epidemiologist Caroline Buckee, one of CNN’s 10 Thinkers for 2013. “But sometimes that hard work reaches fruition or comes together at a random time once you have let thoughts settle down.” We know this — as surely as that 20th-century magnate knew it — and yet we regularly ignore the advice. We surf the Web; we scan news on our phones; we keep our minds digitally occupied in a million ways. When we have a few minutes of down time now, we pull out our mobile devices instead of daydreaming.

The statement is only partially correct. Let me narrow the focus to behavior influenced by uncertainty about what actions to take and the insecurity generated by not having a product or service that people want to pay for.

Think about your last interaction with a vendor of search, content processing, and analytics. How did the interaction flow? I have noticed since the summer vacations ended and management of search vendors focused on making money that two words characterize many behaviors of the senior management of search and content processing companies. The two words?

Frazzled and Scared

What do I mean?

Here are some recent examples:

  1. Information promised on a specific date has not been provided six weeks later. The fact that the information was needed for a potential investor adds to the spice of the incident.
  2. A statement “We will meet at the X conference” became three weeks later, “We are traveling outside the United States.”
  3. An assurance that customer support would provide an activation key to a search system generated four additional assurances. But no key arrived.

At a recent conference, I noticed:

  1. A vendor who beamed when a colleague and I approached the booth. The vendor launched into a series of questions about budget, decision time, and internal staffing capabilities. When I pointed out that I did analyses for my clients, the vendor turned off the charm and moved to another “fish.”
  2. Four vendors in four consecutive presentations said, “We do real time content processing of all information.”
  3. One company president had beads of perspiration on his forehead as he talked on his mobile phone in a corner of the booth. He looked fearful.

So what?

Based on the information in our Overflight system, a number of search and content processing vendors are no longer updating their blogs with regular posts of a substantive nature. The flow of emails about free webinars and new products is on the rise. I received a half dozen on Wednesday, October 23, 2013. For example:

Might you have a few minutes for a call with Mike Schmitt, Senior Director of Product Management for Astute Networks, to discuss the paper and its findings?  It is interesting how even today, smart IT executives are still thinking about storage cost only in terms of the device, vs. the extended consequence it has across performance and productivity, as well as business flexibility and agility.

The “paper” is one of those azure chip, toot toot things. Sigh.

I also am inundated with messages about the “crisis” in search, the lack of traffic to search vendors’ Web sites, and the death of “leads”.

Perhaps the search and content processing companies should step back, take a deep breath, and consider the impact of wild and crazy statements, odd duck behavior at trade shows, and a panhandler’s approach to revenue generation.

Delphes: A Linguistic System That Went Away

October 22, 2013

I have posted a profile of the now offline enterprise search vendor Delphes. You can access the write up at www.xenky.com/vendor-profiles.

Delphes is an illustration of what happens when academic research becomes a commercial search system. From the notion of “soul” to the mind boggling complexity of a Swiss Army knife system, Delphes draws together the threads of the late 1980s and early 1990s best ideas in search. The problem was that selling, supporting, and making the many functions work on time and budget were difficult.

How many other vendors have followed in the trail blazed by Delphes? Quite a few. Some have largely been forgotten like DR Link. Others are still with us, but subsumed into even more complex, over arching systems like Hewlett Packard’s blend of print management and Autonomy.

Reviewing a draft of my analyses of Delphes, several points struck me:

First, Delphes was one of the first search systems to combine the almost mystical with the nuts and bolts of finding information in an organization.

Second, Delphes included a number of languages, but it was French language centric. Many search systems are English centric. So the approach of Delphes makes some of the linguistic issues clear.

Third, Delphes’ explanation and diagrams are quite fresh. I have seen similar diagrams in the marketing hoo-hah of many 2013 vendors.

Keep in mind that these profiles will not be updated or maintained. I am providing the information because some students may find the explanations, diagrams, and comments of interest. The information is provided on an “as is” basis. If you want to use this for commercial purposes, please, contact me at seaky2000 at yahoo dot com.

Remember. I am almost 70 years old and some of the final versions of these profiles commanded hefty fees. IDC, for example, charges $3,500 for some of the profiles I have created. Are my views worth this lofty price? In my view, that is an irrelevant question since some vendors in Massachusetts just sell the stuff, keep all the money, and leave the addled goose floating in the pond.

A reader reminded me that some big outfits have taken my work and reused it, sometimes with permission and sometimes not. Well, these are for your personal use. As for the big firms, those managers are just so darned skilled any action they take is admirable. Don’t you agree?

Enjoy the tale of Delphes.

Stephen E Arnold, October 22, 2013

Take the Coke and Pepsi Wars and Insert Search

October 20, 2013

Basic economics tells us that brand rivalry prevents a complete monopoly in a free market. The quintessential examples are Pepsi and Coke, but let us make the metaphor more modern with a comparison between Bing and Google. EWeek takes a look at Bing’s new claim that its search is more popular than Google in “Is The Bing It On Challenge A Little Off?” The “Bing It On Challenge” supposedly compared Google and Bing search results side-by-side and stripped of their branding. It showed that users preferred Bing 2:1. Yale professor Ian Ayres found these results questionable, because he found the results to be mostly identical.

When Microsoft was asked to share their data sets, they refused to release their results. Ayres got even more peeved when he found out how they collected their information and decided to run his own test:

“Admitting to being “slightly annoyed” in discovering that the claim was based on a study of a mere 1,000 participants, he said that he enlisted Yale law students to run an experiment using a similar sample size and the BingItOn.com Website. “We found that, to the contrary of Microsoft’s claim, 53 percent of subjects preferred Google and 41 percent Bing (6 percent of results were ‘ties’),” reported Ayres. Secondary tests, which involved randomly assigned participants and a mix of popular, Bing-suggested and self-suggested search terms, failed to come close to Bing’s 2:1 advantage.”

Then the claim comes in that the results were not shared because they are not tracked and that results in the challenge were slanted in Bing’s favor. Microsoft burned itself on those two. Basic scientific method research would toss this test in the kindling pile immediately. No results or favoritism at all? One fact about marketing is that advertisements cannot make claims without proof. Oh boy! We are back on the Coke and Pepsi blind-fold taste test. Which search results belong to which search engine?
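To see how far the Ayres numbers sit from a 2:1 split, a back-of-the-envelope margin of error helps. The sketch below assumes exactly 1,000 respondents (the quote only says “a similar sample size”), a simple random sample, and the usual normal approximation; the function name is ours. Because both shares come from the same respondents, the variance of the gap uses the multinomial form rather than treating the two groups as independent.

```python
import math


def gap_margin(p_a, p_b, n, z=1.96):
    """Approximate 95% margin of error for (p_a - p_b) when both
    shares are measured on the same n respondents."""
    variance = (p_a + p_b - (p_a - p_b) ** 2) / n
    return z * math.sqrt(variance)


n = 1000                     # assumption: the quote says "similar sample size"
google, bing = 0.53, 0.41    # shares reported by Ayres

gap = google - bing
margin = gap_margin(google, bing, n)
print(f"Google lead: {gap:.2f} +/- {margin:.2f}")

# A 2:1 preference for Bing, ignoring ties, would be about 67% to 33%,
# a gap of roughly -0.33 from Google's point of view.
two_to_one_gap = 1 / 3 - 2 / 3
print(f"Gap implied by a 2:1 Bing win: {two_to_one_gap:.2f}")
```

Under these assumptions the Google lead is about 12 points, give or take 6, which is nowhere near the 33 point deficit a 2:1 Bing preference would imply.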

Whitney Grace, October 20, 2013

Sponsored by ArnoldIT.com, developer of Augmentext

Super Search Cooks Dinner and Other Practical Skills

October 19, 2013

The next time you go to a restaurant and ask to speak with the chef to give him your compliments, you just might be requesting to speak with IBM’s Watson. According to the MIT Technology Review in “New Answer From IBM’s Watson: Recipe For Swiss-Thai Fusion Quiche,” Watson can now cook. Marked as one of the “light” functions that Watson can perform, inventing recipes is one of the new ways devised to help people with search and answer discovery.

IBM may have invented the next, best toy robot, but after winning Jeopardy they wanted to put Watson’s AI to more practical uses. While Watson has also been testing its skills in medical applications, the AI has trouble deciphering individual writing styles.

The best way to fix this problem is:

“Watson and other analytic technologies will get better if such records are formatted in clearer ways–with distinct fields for patient symptoms, actions taken, and outcomes, he said.  With this in mind, IBM has been trying to customize business software to be Watson-ready (see “Watson’s New Job: IBM Salesman”).  A larger point was articulated by Thomas Malone, director of the MIT Center for Collective Intelligence. The future, he said, lies in building systems that can best leverage the capabilities of humans and computers.  A growing body of research is finding that answers gleaned from a combination of humans and computers are more accurate than those generated by either group alone, he said.”

Right now, the best way to make Watson learn is to ask it questions based on a series of search parameters, such as the recipes. The results may be strange, such as the papaya-cayenne-orange custard it developed, but they are oddly delectable.

Whitney Grace, October 19, 2013


Yandex App Service: More Evidence Search Is Not Enough

October 17, 2013

I have been sifting through decades of reports about search vendors. As my team and I review versions of reports we have sold to azure chip outfits, slick MBAs, and the generally confused management teams—one stark “truth” emerged. You can follow our free analyses of search vendors on Xenky at www.xenky.com/vendor-profiles.

Search is not enough. Not enough magnetism to generate revenue sufficient to maintain the most complex service in the world. Not enough profit to satisfy even the most patient and deep pocketed mother or venture firm. Not enough sizzle to keep the folks turning up for uninspected chicken. Just not enough. Chasing big money with search may be one of those quests some undertake without broader awareness of the difficulty ahead.


The search entrepreneur can triumph over search. A happy quack to http://www.donquijote.cc/.

I read “Search Engine Giant Yandex Launches Cocaine, A Cloud Service To Compete With Google App Engine” and realized that Yandex is more evidence supporting the “truth.” I assume that the story is accurate, but in the world of search and analytics, reality is—shall I say—malleable.

The point of the write up is:

Russian search giant Yandex has launched an open-source platform as a service (PaaS) … that the company says allows developers to build out their own app engines. Yandex, in its documentation, describes [the platform] … as an open-source PaaS system for creating custom cloud-hosting apps that are similar to Google App Engine or Heroku. It supports C++, Python and JavaScript. It is now developing support for Java and Racket.

Observations:

  1. Search—enterprise flavor and public Web flavor–needs help. The help is apps and advertising.
  2. Search by itself is not a driver of growth. If it were, why get into the “more than search” business? Apps are not search. Search is no longer precision and recall. It is a weird fantasy land for some dreamers.
  3. Search does not produce big money. If the big money were available, certain decisions about Yandex’s home country building its own search system and countering Google’s ambitions in Russia would not be a problem. If search were big money, Google would not be an advertising system. Search is just an enabler of other, more lucrative activities.

I am not concerned about Yandex. What interests me is the obviousness of the “truth” that search revenue is a windmill. Language is too slippery. I watched a video on the MIT Video site of a learned panel of search experts. You can give it a whirl at http://goo.gl/tZm0j7. I had to be on my toes. There were some folks charging at search windmills at full gallop.

What does this Yandex move mean for companies pitching search as the source of limitless revenue?

Interesting question. Some search experts will be saddling up and heading off to attack a windmill. Are there many left in Kentucky? I wager there are some in the Boston area, Silicon Valley, and Moscow.

Stephen E Arnold, October 17, 2013

Blippex for a Different Kind of Search

October 17, 2013

Since Google came to dominate the internet search landscape, many rivals have launched. Some have found varying degrees of success, but none have come close to overtaking the master. Now, blogger Christopher Mims believes he may have found a contender in Blippex; “This Is the First Interesting Search Engine Since Google,” Quartz declares. We also found Blippex interesting.

Mims notes that, unlike most competitors, Blippex is not trying to reinvent the Googly wheel. Its approach is different. Instead of indexing the web in general, Blippex looks only at pages its users have visited. The article explains:

“Blippex’s algorithm, called DwellRank, decides relevance based on how long users spend on a site and how many times Blippex users have visited it. Researchers at the University of Massachusetts Amherst have, independently of the Blippex team, established that the amount of time someone spends on a web page or document is, not surprisingly, a pretty good measure of how important and relevant it is (pdf). Blippex gets this information by having you download a plugin for your web browser. This plugin measures how long you spend on each site and sends the information to Blippex, anonymized—that is, stripped of any information that could identify you.”
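The quote names two signals, dwell time and visit count, but not how Blippex combines them. As a rough illustration only, the sketch below multiplies average dwell time by a logarithm of the visit count. That formula, the `DwellIndex` class, and the example URLs are assumptions made for this post, not the DwellRank algorithm.

```python
import math
from collections import defaultdict


class DwellIndex:
    """Toy ranker driven by anonymized (url, seconds) reports."""

    def __init__(self):
        self._dwell = defaultdict(list)  # url -> list of dwell times in seconds

    def record_visit(self, url, seconds):
        # Only the URL and the time spent are stored; no user identifier.
        self._dwell[url].append(seconds)

    def score(self, url):
        visits = self._dwell.get(url, [])
        if not visits:
            return 0.0
        mean_dwell = sum(visits) / len(visits)
        # Assumed combination: long reads count, and repeat visits add
        # weight with diminishing returns.
        return mean_dwell * math.log1p(len(visits))

    def rank(self, urls):
        return sorted(urls, key=self.score, reverse=True)


index = DwellIndex()
for seconds in (120, 110, 130):
    index.record_visit("a.example/howto", seconds)
index.record_visit("b.example/longread", 200)

print(index.rank(["c.example/unseen", "b.example/longread", "a.example/howto"]))
# ['a.example/howto', 'b.example/longread', 'c.example/unseen']
```

Note what the toy version shares with the real service as described: a page nobody using the plugin has visited scores zero, which is why results should improve as more people install it.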

Isn’t this approach a bit limiting? For now, yes, but the makers of Blippex liken the young site to Wikipedia, which became much more effective as users contributed information. Currently, says Mims, the site’s user base is mostly geeky early adopters, so it is a good place to go for programming questions. It is also adequate for recent events, he writes, but is not the place for more obscure searches. With the limitations, why bother? Well, Blippex’s “fanatical” commitment to privacy is one reason; like DuckDuckGo, the site does not track its users. They even made their browser plugin open source, so folks can verify that it is not collecting private information. And, of course, the results will get better as more people install that plugin.

There remains one question—how will Blippex make money on this ad-free site? If co-founders Max Kossatz and Gerald Bäck have figured that out yet, they don’t seem to be sharing the answer. The company, based in Austria, launched last July.

Cynthia Murrell, October 17, 2013


Search Engine Patent

October 16, 2013

SearchYourCloud called my attention to the US patent “Search Engine.” The number is US849573. You can snag a copy at the USPTO via its search engine. Be sure to refresh yourself about the USPTO syntax. Simon Bain, the inventor, is now a senior manager at SearchYourCloud. For those who want to keep pace with new methods germane to search, I found the explanation of the query expansion and deduplication processes of interest. You can get more information about SearchYourCloud at this link. Worth a look.

Stephen E Arnold, October 16, 2013

Xenky Search Vendor Profile: Entopia

October 15, 2013

I have posted a profile of the now offline enterprise search vendor Entopia. You can access the write up at www.xenky.com/vendor-profiles.

Entopia is an interesting case. The company, like Endeca and Fast Search & Transfer, had embraced the idea that information access was the DNA of an organization. With access to information and metadata, a manager could make better decisions. The marketers jumped on the bandwagon and rolled out some fancy buzzwords to surround the incredibly complex Entopia system.

The Entopia approach is, in my opinion, one that took the SAP R/3 massive reengineering of work processes and applied the notion to information. Entopia included Tacit type tracking to identify people who were centers of influence in a company, search, concepts, automatic indexing, semantics, etc.

The only problem was that the cost of implementing the system once a client had been found was high. In 2006, the company wound down. The firm is still offline, but its very ambitious explanations of what information could do inspired many other vendors.

Like Convera, Entopia described a wonderful world of information access. The problem was and still is delivering in a way that meets users’ expectations and delivers a visible, easily documented payoff to the organization buying the dream and the software.

The profiles will not be updated or maintained. I am providing the information because some students may find the explanations, diagrams, and comments of interest. The information is provided on an “as is” basis. If you want to use this for commercial purposes, please, contact me at seaky2000 at yahoo dot com.

Remember. I am almost 70 years old and some of the final versions of these profiles commanded hefty fees. A reader reminded me that some big outfits have taken my work and reused it, sometimes with permission and sometimes not. Well, these are for your personal use.

Stephen E Arnold, October 15, 2013
