Semantic Search: A Return to Hieroglyphics

May 20, 2015

I am so out of date, lost in time, and dumb that I experienced a touch of nausea when I read “Feeligo Expands Semantic Search for Branded Online Stickers.” Feeligo, I learned, is “a leading provider of branded stickers for online conversations.”

image

Source: http://blog.feeligo.com/wp-content/uploads/2014/06/vick_export_04.png

The leap from a sticker to semantic search is dazzling. According to the write up, Feeligo has 500 million users. These folks are doing semantic search. How does this sticker-semantic marriage work? The article says:

Feeligo has developed a platform that capitalizes on the growing awareness of marketing to online and mobile users through social conversations, including comment forums and user forums. Feeligo offers clients a plug-and-play solution for all messaging services, complete with generic and branded stickers, which are installed on client sites. Through Feeligo’s semantic recommendation algorithms, direct matches between words and phrases in users’ text conversations and stickers are made, enabling users to quickly find the appropriate sticker for a user’s message.

I have watched enterprise search vendors distort language in their remarkable attempts to generate sales. I have watched the search engine optimization crowd trash relevance and then embrace the jargon of RDF and OWL. I have now seen how purveyors of digital stickers have tapped semantic technology to make hieroglyphics a brand message technique.

Does anyone notice that a digital sticker is a cartoon empowered to generate three views every second? Do these sticker consumers consider dipping into William James or Charles Dickens? Nah, no stickers for that irrelevant material.

Stephen E Arnold, May 20, 2015

Developing an NLP Semantic Search

May 15, 2015

Can you imagine a natural language processing semantic search engine? It would be a lovely tool to use in your daily routines and would make research a bit easier. If you are working on such a project and are making progress, keep at that startup because this is a lucrative field at the moment. Over at Stack Overflow, an enterprising spirit is trying to develop a “Semantic Search With NLP And Elasticsearch”:

“I am experimenting with Elasticsearch as a search server and my task is to build a “semantic” search functionality. From a short text phrase like “I have a burst pipe” the system should infer that the user is searching for a plumber and return all plumbers indexed in Elasticsearch.

Can that be done directly in a search server like Elasticsearch or do I have to use a natural language processing (NLP) tool like e.g. Maui Indexer. What is the exact terminology for my task at hand, text classification? Though the given text is very short as it is a search phrase.”
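One way to read the question is as short-text classification: map the phrase to a trade first, then hand the search server an ordinary filter. The sketch below is a minimal illustration of that idea, not the poster's solution. The cue-word lists and the `profession` field name are assumptions made up for the example, and a real system would learn the cues from labeled queries rather than hard-code them.

```python
import re

# Hypothetical cue words per trade (assumed for illustration only).
TRADE_CUES = {
    "plumber": {"pipe", "leak", "faucet", "drain", "toilet"},
    "electrician": {"outlet", "wiring", "fuse", "breaker", "socket"},
    "locksmith": {"lock", "key", "locked", "deadbolt"},
}


def classify_query(text):
    """Return the trade whose cue words overlap the query most, or None."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    scores = {trade: len(tokens & cues) for trade, cues in TRADE_CUES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def build_es_query(trade):
    """Build a term-query body; the 'profession' field is a hypothetical mapping."""
    return {"query": {"term": {"profession": trade}}}


trade = classify_query("I have a burst pipe")
if trade is not None:
    print(build_es_query(trade))
```

The classification step happens outside the search server here. Whether it belongs inside or outside is exactly the design choice the poster is asking about.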

Given that this question was asked about three years ago, a lot has been done not only with Elasticsearch, but also NLP.  Search is moving towards a more organic experience, but accuracy is often muddled by different factors.  These include the quality of the technology, classification, taxonomies, ads in results, and even keywords (still!).

NLP semantic search is closer now than it was three years ago. Even so, technology companies would invest a lot of money in a startup that can bridge the gap between natural language and machine learning.

Whitney Grace, May 15, 2015

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

The Philosophy of Semantic Search

May 13, 2015

The article Taking Advantage of Semantic Search NOW: Understanding Semiotics, Signs, & Schema on Lunametrics delves into semantics on a philosophical and linguistic level as well as with regard to business. The author traces the emergence of semantic search, beginning with Ray Kurzweil’s interest in machines learning meaning as opposed to simpler keyword search. To help readers grasp the concept, the author provides a brief refresher on Saussure’s semiotics.

“a Sign is comprised of a signifier, or the name of a thing, and the signified, what that thing represents… Say you sell iPad accessories. “iPad case” is your signifier, or keyword in search marketing speak. We’ve abused the signifier to the utmost over the years, stuffing it onto pages, calculating its density with text tools, jamming it into title tags, in part because we were speaking to robot who read at a 3-year-old level.”

In order to create meaning, we must go beyond even the addition of a price tag and a picture to create a sign. The article suggests the need for schema: some added indication of whom and what the thing is for. The author, Michael Bartholow, has a background in linguistics, marketing, and search engine optimization. His article ends with the question of when linguists, philosophers, and humanists will be invited into the conversation with businesses, perhaps making him a true visionary in a field populated by data engineers with tunnel vision.

Chelsea Kerwin, May 13, 2015

RichRelevance Promises Complete Omnichannel Personalization

May 7, 2015

The article on MarketWatch titled RichRelevance Extends Its Partner Ecosystem to Support True Omnichannel Personalization considers the consequences of San Francisco-based RichRelevance’s recent announcement that it will be amping up partner support in order to improve the continuity of the customer experience across “web, mobile, call center and store.” The article explains what is meant by omnichannel personalization and why it is so important:

“Personalization has emerged as the most important strategic imperative for global businesses,” said Eduardo Sanchez, CEO of RichRelevance. “Our partner ecosystem provides our customers with a unique resource to support the implementation of different components of the Relevance Cloud in their business, as well as customize personalization according to the highly specific demands of their own businesses and consumer base.” Gartner predicts that 89% of companies plan to compete primarily on the basis of the customer experience by 2016…”

The Relevance Cloud is available to RichRelevance partners and includes such core capabilities as pre-built personalization apps for recommendations and search, the Open Innovation Platform for Build, and Relevance in Store for the reported 90% of sales that occur in-store. The announcement ensures that the collaboration RichRelevance emphasizes with its partners will span all areas of customer engagement.

Chelsea Kerwin, May 7, 2015

Cerebrant Discovery Platform from Content Analyst

May 6, 2015

A new content analysis platform boasts the ability to find “non-obvious” relationships within unstructured data, we learn from a write-up hosted at PRWeb, “Content Analyst Announces Cerebrant, a Revolutionary SaaS Discovery Platform to Provide Rapid Insight into Big Content.” The press release explains what makes Cerebrant special:

“Users can identify and select disparate collections of public and premium unstructured content such as scientific research papers, industry reports, syndicated research, news, Wikipedia and other internal and external repositories.

“Unlike alternative solutions, Cerebrant is not dependent upon Boolean search strings, exhaustive taxonomies, or word libraries since it leverages the power of the company’s proprietary Latent Semantic Indexing (LSI)-based learning engine. Users simply take a selection of text ranging from a short phrase, sentence, paragraph, or entire document and Cerebrant identifies and ranks the most conceptually related documents, articles and terms across the selected content sets ranging from tens of thousands to millions of text items.”
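For readers who have not bumped into Latent Semantic Indexing, here is a toy sketch of the general idea: documents and queries are projected into a low-dimensional space derived from term co-occurrence, so a query can match a document with which it shares no words. This is not Content Analyst's CAAT engine. The five-document corpus is invented, and a plain power iteration stands in for a proper SVD routine so the example needs only the standard library.

```python
import math
import re

# Invented toy corpus: "car" and "automobile" never share a document,
# but both co-occur with "engine".
DOCS = {
    "d1": "car engine",
    "d2": "automobile engine",
    "d3": "apple fruit",
    "d4": "banana fruit",
    "d5": "fruit salad",
}


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    return dot(a, b) / (na * nb) if na > 0 and nb > 0 else 0.0


def top_eigenpairs(gram, k, iters=300):
    """Power iteration with deflation on a small symmetric matrix."""
    g = [row[:] for row in gram]
    n = len(g)
    pairs = []
    for j in range(k):
        v = [1.0 / (i + j + 1) for i in range(n)]  # deterministic start vector
        for _ in range(iters):
            w = [dot(row, v) for row in g]
            length = math.sqrt(dot(w, w))
            v = [x / length for x in w]
        lam = dot(v, [dot(row, v) for row in g])
        pairs.append((lam, v))
        for a in range(n):  # deflate: subtract the component just found
            for b in range(n):
                g[a][b] -= lam * v[a] * v[b]
    return pairs


def build_lsi(docs, k=2):
    vocab = sorted({t for text in docs.values() for t in tokenize(text)})
    index = {t: i for i, t in enumerate(vocab)}

    def to_vector(text):
        vec = [0.0] * len(vocab)
        for t in tokenize(text):
            if t in index:
                vec[index[t]] += 1.0
        return vec

    doc_ids = list(docs)
    doc_vecs = [to_vector(docs[d]) for d in doc_ids]
    gram = [[dot(a, b) for b in doc_vecs] for a in doc_vecs]
    # Term-space singular vectors, recovered from eigenvectors of the Gram matrix.
    basis = []
    for lam, v in top_eigenpairs(gram, k):
        sigma = math.sqrt(lam)
        basis.append([sum(v[i] * doc_vecs[i][t] for i in range(len(doc_ids))) / sigma
                      for t in range(len(vocab))])

    def project(text):
        vec = to_vector(text)
        return [dot(vec, u) for u in basis]

    return doc_ids, project


def lsi_scores(query, docs, k=2):
    """Cosine similarity between the query and each document in latent space."""
    doc_ids, project = build_lsi(docs, k)
    q = project(query)
    return {d: cosine(q, project(docs[d])) for d in doc_ids}


scores = lsi_scores("car", DOCS)
for doc_id in sorted(scores, key=scores.get, reverse=True):
    print(doc_id, round(scores[doc_id], 3))
```

In this toy setup the query "car" scores high against "automobile engine" even though the two share no terms, while the fruit documents score near zero. Production systems work on far larger matrices with weighting schemes and tuned dimensionality, none of which is shown here.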

We’re told that Cerebrant is based on the company’s prominent CAAT machine learning engine. The write-up also notes that the platform is cloud-based, making it easy to implement and use. Content Analyst launched in 2004, and is based in Reston, Virginia, near Washington, DC. They also happen to be hiring, in case anyone here is interested.

Cynthia Murrell, May 6, 2015

Continued Growth and Success at Syl Semantics

May 5, 2015

The article on Yahoo New Zealand titled Syl Semantics Raises New Capital and Appoints New Directors begins by naming the two freshly-minted non-executive directors, Murray Nash and Gene Turner. This is the result of successful capital raising to the tune of a million dollars for the Wellington-based company. Syl Semantics will continue to focus on growing the company with the assistance of the new directors. The article explains,

“Murray Nash is Managing Director of Zusammen, an advisory firm specialising in strategy, finance and capital markets, risk management, and public policy. In 2013 Murray was manager of the Establishment Unit and subsequently the acting Chief Executive of Callaghan Innovation. Murray has been a senior manager in three financial risk management start-ups in New York – supplying technology solutions to global leaders in banking, insurance, asset management and prudential supervision. He has a MComm (Finance) from the University of Auckland.”

Gene Turner’s background is in law and banking. Syl Semantics was created in 2008 and has grown steadily since then, releasing Syl Search in 2011 with great success. Syl Semantics is focused on what they term “Information Intelligence” or the “ability to access and extract value, meaning and learning from information.” James Fowler, the Director of Sales and Marketing, spoke to the ambition and perseverance of the company, which hopes to gain more of a foothold in New Zealand and Australian markets.

Chelsea Kerwin, May 5, 2015

Semantic Search and Dolphins

May 2, 2015

Do you remember Dolphin Search? I have some information in my Overflight archive. One Dolphin was a commercial search system. I made a note, “Based on an analysis of dolphin sounds.” I never followed up. I have another reference to an open source search system. Here’s a screenshot I snagged:

Figure 1: The Dolphin search tool allows you to find files and folders by name.

I located one of the gosling’s notes. The main point, “Flakey.”

Dolphins surfaced again (heh heh heh) in a write up with the tsunami of a title “How Dolphins Saved Semantic Search.”

This emergence of dolphins from the pool of information access is fascinating. Here’s the lead paragraph:

“We don’t talk about trust and identity much, it’ll be a good subject to discuss,” Teodora Petkova explained to me over an email. We were discussing my participation at the Sofia SEO Conference 2015 organized by Ognian Mladenov and my keynote opening speech on semantic search.

Yes, directly designed to hook me with a tasty morsel of bait or snare me in a mile long drag net.

I had to wait until the sixth paragraph to get to the dolphin and semantic search bobber.

I’d read about the Irrawaddy river dolphins while doing some research on human cooperation behavior. The Irrawaddy river is in Myanmar (former Burma) and the local fishermen had managed to find a way to communicate with the river dolphins. More than that the two of them had managed to create a shared language. The dolphins could, through their behavior, tell the fishermen just how big the catch they were bringing in was. The Fishermen would call out to the dolphins as necessary. This is not just cooperative behavior, it is a mutualistic relationship  where both parties work together, sharing the workload so that the burden involved becomes less and the rewards for each, become greater. It is, in other words, contact between two intelligences. Search is similar.

Well, there’s that.

The write up drifts from “semantic” into a haze much like the mist that rises on some mornings from the south facing inlet on Guarujá. Where the write up drifts is the conflation of humanness and semantic search. The metaphor which drifts to mind is the plastic trash and detritus that disfigure beaches.

Well, there’s that.

Toss the chum of search engine optimization into the murky water and what emerges is the dolphin.

I think that this makes perfect sense to a person less wise than Hemingway’s Old Man in the novella foisted on clueless seafarers in rural Illinois. The logic meshes with warnings like “red sky at morning, sailors take warning” or something similar.

Dolphins seem not to notice the weather. Dolphins do notice fish dangled in front of their noses at Sea World.

With semantic search playing the key part in this fish tale, the question arises, “What the heck does this have to do with information retrieval?”

Empty net. For sure.

Stephen E Arnold, May 2, 2015

LexisNexis: Riding the Patent Pony

April 25, 2015

Need patent information? Lots of folks believed that making sense of the public documents available from the USPTO was the road to riches. Before I kicked back to enjoy the sylvan life in rural Kentucky, I did some work on Fancy Dan patent systems. There was a brush with the IBM Intelligent Patent Miner system. For those who do not recall their search history, you can find a chunk of information in “Information Mining with the IBM Intelligent Miner Family.” Keep in mind that the write up is about 20 years old. (Please, notice that the LexisNexis system discussed below uses many of the same, time worn techniques.)

image

Patented dog coat.

Then there was the Manning & Napier “smart” patent analysis system with analyses’ output displayed in three-D visualizations. I bumped into Derwent (now Intellectual Property & Science) and other Thomson Corp. solutions as well. And, of course, there was my work for an unnamed, mostly clueless multi billion dollar outfit related to Google’s patent documents. I summarized the results of this analysis in my Google Version 2.0 monograph, portions of which were published by Bear Stearns before it met its thrilling end seven years ago. (Was my boss the fellow carrying a box out of the Midtown Bear Stearns building?)

Why the history?

Well, patents are expensive to litigate. For some companies, intellectual property is a revenue stream.

There is a knot in the headphone cable. Law firms are not the go go business they were 15 or 20 years ago. Law school grads are running gyms; some are Uber drivers. Like many modern post Reagan businesses, concentration is the name of the game. For the big firms with the big buck clients, money is no object.

The problem in the legal information business is that smaller shops, including the one and two person outfits operating in Dixie Highway type of real estate, do not want to pay the $200 and up per search that commercial online services charge. Even when I was working for some high rollers, the notion of a five or six figure online charge elicited what I would diplomatically describe as gentle push back.

I read “LexisNexis TotalPatent Keeps Patent Research out of the Black Box with Improved Version of Semantic Search.” For those out of touch with online history, I worked for a company in the 1980s which provided commercial databases to LexisNexis. I knew one of the founders (Don Wilson). I even had reasonably functional working relationships with Dan Prickett and people named “Jim” and “Sharon.” In one bizarre incident, a big wheel from LexisNexis wanted to meet with me in the Cherry Hill Mall’s parking lot across from the old Bell Labs’ facility where I was a consultant at the time. Err, no thanks. I was okay with the wonky environs of Bell Labs. I was not okay with the lash up of a Dutch and British company.

image

Snippet of code from a Ramanathan Guha invention. Guha used to be at IBM Almaden and he is a bright fellow. See US7593939 B2.

What does LexisNexis TotalPatent deliver for a fee? According to the write up:

TotalPatent, a web-based patent research, retrieval and analysis solution powered by the world’s biggest assortment of searchable full-text and bibliographic patent authorities, allows researchers to enter as much as 32,000 characters (comparable to more than 10 pages of text)—much over along a whole patent abstract—into its search industry. The newly enhanced semantic brains, pioneered by LexisNexis during 2009 and continually improved upon utilizing contextual information supplied by the useful patent data offered to the machine, current results in the form of a user-adjustable term cloud, where the weighting and positioning of terms may be managed for lots more precise results. And countless full-text patent documents, TotalPatent in addition utilizes systematic, technical also non-patent literature to go back the deepest, most comprehensive serp’s.
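The “user-adjustable term cloud” boils down to a familiar trick: let the searcher change term weights and re-rank. The sketch below shows weighted term scoring over three made-up patent abstracts. It is a hedged illustration of the general technique, not the TotalPatent implementation, and every name and weight in it is an assumption.

```python
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


def score(doc_text, term_weights):
    """Sum of (user-assigned weight x term frequency) over the weighted terms."""
    counts = Counter(tokenize(doc_text))
    return sum(weight * counts[term] for term, weight in term_weights.items())


def rank(docs, term_weights):
    """Return document ids ordered from highest to lowest score."""
    return sorted(docs, key=lambda doc_id: score(docs[doc_id], term_weights), reverse=True)


# Made-up abstracts, for illustration only.
PATENTS = {
    "A": "dog coat with insulated lining and adjustable strap",
    "B": "adjustable strap for a pet harness with a strap buckle",
    "C": "insulated coat for cold weather",
}

initial = {"coat": 1.0, "strap": 1.0, "dog": 1.0}
adjusted = {"coat": 3.0, "strap": 0.2, "dog": 1.0}  # the user boosts "coat", mutes "strap"

print(rank(PATENTS, initial))
print(rank(PATENTS, adjusted))
```

Raising the weight on “coat” and lowering it on “strap” moves the coat abstract ahead of the harness abstract. The mechanism is the time worn one: weights in, re-sorted list out.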

Ontotext Pursues Visibility

April 23, 2015

Do you know Ontotext? The company is making an effort to become more visible. Navigate to “Vassil Momtchev talks Insights with the Bloor Group.” The interview provides a snapshot of the company’s history which dates from 2001. After 14 years, the interview reports that Ontotext “keeps its original company spirit.”

Other points from the write up:

  • The company’s technology makes use of semantic and ontology modeling
  • A knowledge base represents complex information and makes asking questions better
  • Semantic applications can deliver complete applications.

For more information about Ontotext and its “ontological” approach, visit the company’s Web site at www.ontotext.com.

Stephen E Arnold, April 23, 2015
