The Future of Enterprise and Web Search: Worrying about a Very Frail Goose

May 28, 2015

For a moment, I thought search was undergoing a renascence. But I was wrong. I noted a chart which purports to illustrate that the future is not keyword search. You can find the illustration (for now) at this Twitter location. The idea is that keyword search is less and less effective as the volume of data goes up. I don’t want to be a spoilsport, but for certain queries key words and good old Boolean may be the only way to retrieve certain types of information. Don’t believe me? Log on to your organization’s network or to Google. Now look for the telephone number of a specific person whose name you know or a tire company located in a specific city with a specific name which you know. Would you prefer to browse a directory, a word cloud, a list of suggestions? I want to zing directly to the specific fact. Yep, key word search. The old reliable.
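For what it is worth, the known-item lookup described above is simple enough to sketch in a few lines of Python. This is a toy inverted index with a Boolean AND, not a description of how Google or any enterprise system is built. The directory entries, company names, and phone number are made up for illustration.

```python
from collections import defaultdict

# Hypothetical directory entries, invented for this sketch.
DOCS = {
    1: "Acme Tire Company Louisville 502-555-0100",
    2: "Acme Bakery Louisville",
    3: "Summit Tire Company Lexington",
}

def build_index(docs):
    """Map each lowercased term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def boolean_and(index, query):
    """Return ids of documents that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

index = build_index(DOCS)
print(boolean_and(index, "acme tire louisville"))  # {1}
```

Three known words in, one record out. No word cloud required.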

But the chart points out that the future is composed of three “webs”: The Social Web, the Semantic Web, and the Intelligent Web. The date for the Intelligent Web appears to be 2018 (the diagram at which I am looking is fuzzy). We are now perched halfway through 2015. In 30 months, the Intelligent Web will arrive with these characteristics:

  • Web scale reasoning (Don’t we have Watson? Oh, right. I forgot.)
  • Intelligent agents (Why not tap Connotate? Agents ready to roll.)
  • Natural language search (Yep, talk to your phone. How is that working out on a noisy subway train?)
  • Semantics. (Embrace the OWL. Now.)

Now these benchmarks will arrive in the next 30 months, which implies a gradual emergence of Web 4.0.

The hitch in the git along, like most futuristic predictions about information access, is that reality behaves in some unpredictable ways. The assumption behind this graph is “Semantic technology help to regain productivity in the face of overwhelming information growth.”

Hijacking Semantics for Search Engine Optimization

May 26, 2015

I am just too old and cranky to get with the search engine optimization program. If a person cannot find your content, too bad. SEO has caused some of the erosion of relevance across public Web search engines.

The reason is that pages with lousy content are marketed as having other, more valuable content. The result is queries like this:

image

I want information about methods of digital reasoning. What I get is a company profile.

How do I get information for my specific requirement? I have to know how to work around the problems SEO puts in my face every day, over and over again.

This query works on Bing, Google, and Yandex: artificial intelligence decision procedures.

image

The results do not point to a small company in Tennessee, but to substantive documents from which other, pointed queries can be launched for individuals, industry associations, and methods.

When I read “Semantic Search Strategies That Work,” I became agitated. The notions of “forgetting about content” and “focusing on quality” miss the mark. Advice such as “spend time on engagement” amounts to a collection of unrelated assertions.

The goal of semantics for SEO is to generate traffic. The search systems suck in shaped content and persist in directing people to topics that may have little or nothing to do with the information a person needs to solve his or her problem.

In short, the bastardization of semantics in the name of SEO is ensuring that some users will define the world from the point of view of marketing, not objective information.

What’s the fix?

Here’s the shocker: There is no fix. As individuals abrogate their responsibility to demand high value, on point results, schlock becomes the order of the day.

So much for clear thinking. Semantic strategies that erode relevance do not “work” from my point of view. This type of semantics thickens the cloud of unknowing.

Stephen E Arnold, May 26, 2015

Semantic Search: A Return to Hieroglyphics

May 20, 2015

I am so out of date, lost in time, and dumb that I experienced a touch of nausea when I read “Feeligo Expands Semantic Search for Branded Online Stickers.” Feeligo, I learned, is “a leading provider of branded stickers for online conversations.”

image

Source: http://blog.feeligo.com/wp-content/uploads/2014/06/vick_export_04.png

The leap from a sticker to semantic search is dazzling. According to the write up, Feeligo has 500 million users. These folks are doing semantic search. How does this sticker-semantic marriage work? The article says:

Feeligo has developed a platform that capitalizes on the growing awareness of marketing to online and mobile users through social conversations, including comment forums and user forums. Feeligo offers clients a plug-and-play solution for all messaging services, complete with generic and branded stickers, which are installed on client sites. Through Feeligo’s semantic recommendation algorithms, direct matches between words and phrases in users’ text conversations and stickers are made, enabling users to quickly find the appropriate sticker for a user’s message.

I have watched enterprise search vendors distort language in their remarkable attempts to generate sales. I have watched the search engine optimization crowd trash relevance and then embrace the jargon of RDF and OWL. I have now seen how purveyors of digital stickers have tapped semantic technology to make hieroglyphics a brand message technique.

Does anyone notice that a digital sticker is a cartoon empowered to generate three views every second? Do these sticker consumers consider dipping into William James or Charles Dickens? Nah, no stickers for that irrelevant material.

Stephen E Arnold, May 20, 2015

Developing an NLP Semantic Search

May 15, 2015

Can you imagine a natural language processing semantic search engine? It would be a lovely tool to use in your daily routines and would make research a bit easier. If you are working on such a project and are making progress, keep at that startup because this is a lucrative field at the moment. Over at Stack Overflow, an entrepreneurial spirit is trying to develop a “Semantic Search With NLP And Elasticsearch”:

“I am experimenting with Elasticsearch as a search server and my task is to build a “semantic” search functionality. From a short text phrase like “I have a burst pipe” the system should infer that the user is searching for a plumber and return all plumbers indexed in Elasticsearch.

Can that be done directly in a search server like Elasticsearch or do I have to use a natural language processing (NLP) tool like e.g. Maui Indexer. What is the exact terminology for my task at hand, text classification? Though the given text is very short as it is a search phrase.”
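The task in the question, mapping a short phrase to a category before any search runs, can be sketched roughly as follows. This is a minimal illustration that scores a phrase by its overlap with hand-picked cue words. The trades and cue words are assumptions made up for this example, and a real system would more likely learn them from labeled data with a proper text classifier.

```python
import re

# Hypothetical cue words per trade, chosen by hand for this sketch.
CUES = {
    "plumber": {"pipe", "leak", "burst", "drain", "faucet"},
    "electrician": {"outlet", "wiring", "fuse", "breaker", "socket"},
    "locksmith": {"lock", "locked", "key", "deadbolt"},
}

def classify(phrase):
    """Return the trade whose cue words overlap the phrase most, or None."""
    tokens = set(re.findall(r"[a-z]+", phrase.lower()))
    scores = {trade: len(tokens & cues) for trade, cues in CUES.items()}
    best = max(scores, key=scores.get)
    # No cue word matched at all, so decline to guess.
    return best if scores[best] > 0 else None

print(classify("I have a burst pipe"))  # plumber
```

The returned label could then drive an ordinary keyword or filter query against the index, which keeps the inference step separate from the search server itself.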

Given that this question was asked about three years ago, a lot has been done not only with Elasticsearch, but also NLP.  Search is moving towards a more organic experience, but accuracy is often muddled by different factors.  These include the quality of the technology, classification, taxonomies, ads in results, and even keywords (still!).

NLP semantic search is closer now than it was three years ago, but technology companies would invest a lot of money in a startup that can bridge the gap between natural language and machine learning.

Whitney Grace, May 15, 2015

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

The Philosophy of Semantic Search

May 13, 2015

The article “Taking Advantage of Semantic Search NOW: Understanding Semiotics, Signs, & Schema” on Lunametrics delves into semantics on a philosophical and linguistic level as well as in regard to business. The author goes through the emergence of semantic search, beginning with Ray Kurzweil’s interest in machines learning meaning as opposed to simpler keyword search. To help readers grasp this concept, the author provides a brief refresher on Saussure’s semiotics.

“a Sign is comprised of a signifier, or the name of a thing, and the signified, what that thing represents… Say you sell iPad accessories. “iPad case” is your signifier, or keyword in search marketing speak. We’ve abused the signifier to the utmost over the years, stuffing it onto pages, calculating its density with text tools, jamming it into title tags, in part because we were speaking to robot who read at a 3-year-old level.”

In order to create meaning, we must go beyond even the addition of a price tag and picture to create a sign. The article suggests the need for schema, that is, the addition of some indication of who and what the thing is for. The author, Michael Bartholow, has a background in linguistics, marketing, and search engine optimization. His article ends with the question of when linguists, philosophers, and humanists will be invited into the conversation with businesses, perhaps making him a true visionary in a field populated by data engineers with tunnel vision.

Chelsea Kerwin, May 13, 2015

RichRelevance Promises Complete Omnichannel Personalization

May 7, 2015

The article on MarketWatch titled “RichRelevance Extends Its Partner Ecosystem to Support True Omnichannel Personalization” predicts the consequences of San Francisco-based RichRelevance’s recent announcement that it will be amping up partner support in order to improve the continuity of the customer experience across “web, mobile, call center and store.” The article explains what is meant by omnichannel personalization and why it is so important:

“Personalization has emerged as the most important strategic imperative for global businesses,” said Eduardo Sanchez, CEO of RichRelevance. “Our partner ecosystem provides our customers with a unique resource to support the implementation of different components of the Relevance Cloud in their business, as well as customize personalization according to the highly specific demands of their own businesses and consumer base.” Gartner predicts that 89% of companies plan to compete primarily on the basis of the customer experience by 2016…”

The Relevance Cloud is available for RichRelevance partners and includes such core capabilities as pre-built personalization apps for recommendations and search, the Open Innovation Platform for Build, and Relevance in Store for the reported 90% of sales that occur in-store. The announcement ensures that the collaboration RichRelevance emphasizes with its partners will span all areas of customer engagement.

Chelsea Kerwin, May 7, 2015

Cerebrant Discovery Platform from Content Analyst

May 6, 2015

A new content analysis platform boasts the ability to find “non-obvious” relationships within unstructured data, we learn from a write-up hosted at PRWeb, “Content Analyst Announces Cerebrant, a Revolutionary SaaS Discovery Platform to Provide Rapid Insight into Big Content.” The press release explains what makes Cerebrant special:

“Users can identify and select disparate collections of public and premium unstructured content such as scientific research papers, industry reports, syndicated research, news, Wikipedia and other internal and external repositories.

“Unlike alternative solutions, Cerebrant is not dependent upon Boolean search strings, exhaustive taxonomies, or word libraries since it leverages the power of the company’s proprietary Latent Semantic Indexing (LSI)-based learning engine. Users simply take a selection of text ranging from a short phrase, sentence, paragraph, or entire document and Cerebrant identifies and ranks the most conceptually related documents, articles and terms across the selected content sets ranging from tens of thousands to millions of text items.”

We’re told that Cerebrant is based on the company’s prominent CAAT machine learning engine. The write-up also notes that the platform is cloud-based, making it easy to implement and use. Content Analyst launched in 2004, and is based in Reston, Virginia, near Washington, DC. They also happen to be hiring, in case anyone here is interested.

Cynthia Murrell, May 6, 2015

Continued Growth and Success at Syl Semantics

May 5, 2015

The article on Yahoo New Zealand titled Syl Semantics Raises New Capital and Appoints New Directors begins by naming the two freshly-minted non-executive directors, Murray Nash and Gene Turner. This is the result of successful capital raising to the tune of a million dollars for the Wellington-based company. Syl Semantics will continue to focus on growing the company with the assistance of the new directors. The article explains,

“Murray Nash is Managing Director of Zusammen, an advisory firm specialising in strategy, finance and capital markets, risk management, and public policy. In 2013 Murray was manager of the Establishment Unit and subsequently the acting Chief Executive of Callaghan Innovation. Murray has been a senior manager in three financial risk management start-ups in New York – supplying technology solutions to global leaders in banking, insurance, asset management and prudential supervision. He has a MComm (Finance) from the University of Auckland.”

Gene Turner’s background is in law and banking. Syl Semantics was created in 2008 and has grown steadily since then, releasing Syl Search in 2011 with great success. Syl Semantics is focused on what they term “Information Intelligence” or the “ability to access and extract value, meaning and learning from information.” James Fowler, the Director of Sales and Marketing, spoke to the ambition and perseverance of the company, which hopes to gain more of a foothold in New Zealand and Australian markets.

Chelsea Kerwin, May 5, 2015

Semantic Search and Dolphins

May 2, 2015

Do you remember Dolphin Search? I have some information in my Overflight archive. One Dolphin was a commercial search system. I made a note, “Based on an analysis of dolphin sounds.” I never followed up. I have another reference to an open source search system. Here’s a screenshot I snagged:

Figure 1: The Dolphin search tool allows you to find files and folders by name.

I located one of the gosling’s notes. The main point, “Flakey.”

Dolphins surfaced again (heh heh heh) in a write up with the tsunami of a title “How Dolphins Saved Semantic Search.”

This emergence of dolphins from the pool of information access is fascinating. Here’s the lead paragraph:

“We don’t talk about trust and identity much, it’ll be a good subject to discuss,” Teodora Petkova explained to me over an email. We were discussing my participation at the Sofia SEO Conference 2015 organized by Ognian Mladenov and my keynote opening speech on semantic search.

Yes, directly designed to hook me with a tasty morsel of bait or snare me in a mile long drag net.

I had to wait until the sixth paragraph to get to the dolphin and semantic search bobber.

I’d read about the Irrawaddy river dolphins while doing some research on human cooperation behavior. The Irrawaddy river is in Myanmar (former Burma) and the local fishermen had managed to find a way to communicate with the river dolphins. More than that the two of them had managed to create a shared language. The dolphins could, through their behavior, tell the fishermen just how big the catch they were bringing in was. The Fishermen would call out to the dolphins as necessary. This is not just cooperative behavior, it is a mutualistic relationship  where both parties work together, sharing the workload so that the burden involved becomes less and the rewards for each, become greater. It is, in other words, contact between two intelligences. Search is similar.

Well, there’s that.

The write up drifts from “semantic” into a haze much like the mist that rises on some mornings from the south facing inlet on Guarujá. Where the write up drifts is the conflation of humanness and semantic search. The metaphor which drifts to mind is the plastic trash and detritus that disfigure beaches.

Well, there’s that.

Toss the chum of search engine optimization into the murky water and what emerges is the dolphin.

I think that this makes perfect sense to a person less wise than Hemingway’s Old Man in the novella foisted on clueless seafarers in rural Illinois. The logic meshes with warnings like “red sky at morning, sailors take warning” or something similar.

Dolphins seem not to notice the weather. Dolphins do notice fish dangled in front of their noses at Sea World.

With semantic search playing the key part in this fish tale, the question arises, “What the heck does this have to do with information retrieval?”

Empty net. For sure.

Stephen E Arnold, May 2, 2015

 
