How Often Do You Use Vocal Search

February 8, 2016

Vocal search sounds like an idea from the future: you give a computer a spoken query and it returns relevant information. However, vocal search has become an actual “thing” with mobile assistants like Siri and Cortana and the built-in NLP engines on newer technology. I enjoy using vocal search because it saves me from having to type my query on a tiny keyboard, but when I’m in a public place I don’t use it for privacy reasons. Search Engine Watch asks the question, “What Do You Need To Know About Voice Search?” and provides answers to more questions about vocal search.

Northstar Research conducted a study that discovered 55% of US teens use vocal search, while only 41% of US adults do. An even funnier fact is that 56% of US adults use the search function only because it makes them feel tech-savvy.

Vocal search is extremely popular in Asia due to the different alphabets. Asian languages are harder to type on a smaller keyboard. It is also a pain on Roman alphabet keyboards!

Tech companies are currently working on innovations in vocal search. The article highlights how Google is trying to understand the semantic context behind queries to improve intent detection and accuracy.

“Superlatives, ordered items, points in time and complex combinations can now be understood to serve you more relevant answers to your questions…These ‘direct answers’ provided by Google will theoretically better match the more natural way that people ask questions in speech rather than when typing something into a search bar, where keywords can still dominate our search behaviour.”

It translates to a quicker way to access information and answer common questions without having to type on a keyboard.  Now it would be a lot easier if you did not have to press a button to activate the vocal search.

Whitney Grace, February 8, 2016
Sponsored by, publisher of the CyberOSINT monograph

Semantic Mitosis: Attensity Splits Apart

February 7, 2016

Attensity is now two outfits, according to “Attensity Europe Breaks from US Parent Company.” The write up does not address the loss of synergy between the US and European sides of the semantic coin.

I learned:

The parties involved have agreed to not disclose any information on the purchase price or further terms of the transaction.

Not too helpful.

The news release points out that the European version of Attensity, which is named Attensity Europe GmbH, will focus on the customer support line of business; specifically:

the [Attensity Europe] company will focus on the growth segment of omni-channel customer service. Attensity Europe’s core product is the market-leading solution “Respond”, a multilingual and omni-channel response management software, which was designed by the German team of developers in Saarbrücken and has been systematically developed into the market-leading enterprise solution for omni-channel customer service over recent years.

I assume that the US Attensity does not have a market leading product; otherwise, why not mention it? Omni-channel gets quite a bit of play. But I am not sure what “omni channel” means.

As an aside, Saarbrücken divorced itself from Germany: Once in 1925 and again in 1947. Might the water in the Saar be a factor in the split ups?

Many questions percolate through my discount coffee pot brain, but these are the questions I routinely ask when reverse mergers, no investment acquisitions, and de-synergies are at work.

My hunch is that US Attensity may have been perceived as slowing down the speeding bullet of Attensity Europe. Worth monitoring the situation.

Stephen E Arnold, February 7, 2016

A How to Create Jargon Tutorial

January 21, 2016

I read “Controversial Concepts: How to Tackle Defining and Naming Them.” The write up explains how to whip up jargon and get it into circulation. I thought, “Just what I need. I want to make ideas more confusing and more difficult to discuss. Hooray.”

Here’s the method:

  1. Name controversial concepts with proxy names such as “Greg”, “Mike” or “John” (or whatever name you prefer) to get potentially misleading names and their implicit connotations out of the way of progress.
  2. Draw a concept diagram showing those concepts as well as important semantic relationships among them.
  3. Formulate intensional definitions for each concept – still using the proxy names. Ensure that those definitions are consistent with the relationships shown on the concept diagram.
  4. Identify one or more communities that “baptize” those concepts by giving them better names.

Not as clear as Lotus 1-2-3 because the “intensional definitions” threw me. After a bit of thinking, I realized that I could create really useful, clear, high impact words and phrases like:

  • artificial intelligence
  • Big Data
  • cognitive computing
  • concept search
  • data lake
  • metadata
  • natural language

That is outstanding.

Stephen E Arnold, January 21, 2016

Semantic Machines: A Voice Search Revolution?

January 19, 2016

I read “Newton Startup Scoops Up Talent As It Works to Perfect Artificial Intelligence.” The write up takes an enthusiastic approach to the efforts of a smart software company in the Boston area. I like these types of articles. They remind me of the days when Route 128 was the cat’s pajamas.

I learned that when I talk to my phone, the system is not “smart enough.” I know. Background noise, speaking too quickly, or mumbling are issues with the voice to search thing. Then there is the output. Our test involves asking for the phone number of a person with a Russian name like Kolmogorov in a bus station or a convertible going 40 miles per hour.

The write up points out:

Semantic Machines is currently working on artificial intelligence technology that could do a better job than Siri or other platforms as they interact with users.

There is big money involved; for example, $20 million from the Bainies and other illuminati.

Here’s the angle:

…The idea behind the startup is to develop a “new paradigm” in a field known as conversational computing — essentially improving the way you interact with your phone or computer, whether via voice or text — “much, much closer to the conversational style in the way people talk…”

Worth noting.

Stephen E Arnold, January 19, 2016

Search Is Marketing and Lots of Other Stuff Like Semantics

January 12, 2016

I spoke with a person who asked me, “Have you seen the 2013 Dave Amerland video?” The video in question is “Google Semantic Search and its Impact on Business.”

I hadn’t. I watched the five-minute video and formed some impressions / opinions about the information presented. Now I wish I had not invested five minutes in serial content processing.

First, the premise that search is marketing does not match up with my view of search. In short, search is more than marketing, although some view search as essential to making a sale.

Second, the video generates buzzwords. There’s knowledge graph, semantic, reputation, Big Data, and more. If one accepts the premise that search is about sales, I am not sure what these buzzwords contribute. The message is that when a user looks for something, the system should display a message that causes a sale. Objectivity does not have much to do with this, nor do buzzwords.

Third, the presentation of the information was difficult for me to understand. My attention was undermined by the wild and wonderful assertions about the buzzwords. I struggled with “from strings to things, from Web sites to people.” What?

The video is ostensibly about the use of “semantics” in content. I am okay with semantic processes. I understand that keeping words and metaphors consistent is helpful to a human and to a Web indexing system.

But the premise. I have a tough time buying in. I want search to return high value, on point content. I want those who create content to include helpful information, details about sources, and markers that make it possible for a reader to figure out what’s sort of accurate and what’s opinion.

I fear that the semantics practiced in this video shriek, “Hire me.” I also note that the video is a commercial for a book which presumably amplifies the viewpoint expressed in the video. That means the video vocalizes, “Buy my book.”

Heck, I am happy if I can get an on point result set when I run a query. No shrieking. No vocalization. No buzzwords. Will objective search be possible?

Stephen E Arnold, January 12, 2016

Need an Open Source Semantic Web Crawler?

December 17, 2015

If you do, the beleaguered Yahoo has some open source goodies for you. Navigate to “Yahoo Open Sources Anthelion Web Crawler for Parsing Structured Data on HTML Pages.” The software, states the write up, is “designed for parsing structured data from HTML pages under an open source license.”

There is a statement I found darned interesting:

“To the best of our knowledge, we are first to introduce the idea of a crawler focusing on semantic data, embedded in HTML pages using markup languages as microdata, microformats or RDFa,” wrote authors Peter Mika and Roi Blanco of Yahoo Labs and Robert Meusel of Germany’s University of Mannheim.
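For readers who have not seen microdata, here is a minimal sketch of what “semantic data embedded in HTML pages” looks like and how a program might pull it out. This is not Anthelion and not Yahoo’s method; it uses only Python’s standard `html.parser` module, the `ItempropCollector` class is a hypothetical name, and the sample page is invented for illustration. It assumes flat, non-nested `itemprop` elements.

```python
from html.parser import HTMLParser


class ItempropCollector(HTMLParser):
    """Collect (itemprop, text) pairs from flat, non-nested microdata markup."""

    def __init__(self):
        super().__init__()
        self.items = []       # collected (property name, text) pairs
        self._current = None  # itemprop currently being read, if any
        self._buffer = []     # text fragments seen inside that element

    def handle_starttag(self, tag, attrs):
        prop = dict(attrs).get("itemprop")
        if prop is not None:
            self._current = prop
            self._buffer = []

    def handle_data(self, data):
        if self._current is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        # Assumption: the element carrying itemprop has no child elements,
        # so the first end tag seen closes it.
        if self._current is not None:
            self.items.append((self._current, "".join(self._buffer).strip()))
            self._current = None


# Invented sample page, for illustration only.
SAMPLE_HTML = """
<div itemscope itemtype="http://schema.org/Person">
  <span itemprop="name">Ada Lovelace</span>
  <span itemprop="jobTitle">Mathematician</span>
</div>
"""

collector = ItempropCollector()
collector.feed(SAMPLE_HTML)
for prop, value in collector.items:
    print(prop, "=", value)
```

A real crawler would also have to handle nested items, `content` attributes, microformats, and RDFa, which this sketch ignores.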

My immediate thought was, “Why don’t these folks take a look at the 2007 patent documents penned by Ramanathan Guha?” Those documents present a rather thorough description of a semantic component which hooks into the Google crawlers. Now the Google has not open sourced these systems and methods.

My reaction is, “Yahoo may want to ask the former Yahooligans who are now working at Yahoo how novel the Yahoo approach really is.”

Failing that, Yahoo may want to poke around in the literature, including patent documents, to see which outfits have trundled down the semantic crawling Web thing before. Would it have been more economical and efficient to license the Siderean Software crawler and build on that?

Stephen E Arnold, December 17, 2015

Semantics and the Web: The Bacon Has Been Delivered

November 12, 2015

I read and viewed “What Happened to the Semantic Web.” For one thing, search engine optimization has snagged the idea in order to build interest in search result rankings. The other thing I know is that most people are blissfully unaware of what semantics are supposed to be and how semantics impacts their lives. Many folks are thrilled when their mobile phone points them to a pizza joint or out of an unfamiliar part of town.

The write up explains that for the last 15 years there has been quite a bit of the old rah rah for semantics on the Web. Well, the semantics are there. The big boys like Google and Microsoft are making this happen. If you are interested in triples, POST, and RDF, you can work through the acronyms and get to the main points of the article.

The bulk of the write up is a series of comparative screen shots. I looked at these and tried to replicate a couple of them. I was not able to derive the same level of thrillness which the article expresses. Your mileage may vary.

Here’s the passage I highlighted in a definitely pale shade of green:

As you can see, there is no question that the Web already has a population of HTML documents that include semantically-enriched islands of structured data. This new generation of documents creates a new Web dimension in which links are no longer seen solely as document addresses, but can function as unambiguous names for anything, while also enabling the construction of controlled natural language sentences for encoding and decoding information [data in context] — comprehensible by both humans and machines (bots). The fundamental goal of the Semantic Web Project has already been achieved. Like the initial introduction of the Web, there wasn’t an official release date — it just happened!

I surmise this is the semantic heaven described by Ramanathan Guha and his series of inventions, now almost a decade old. What’s left out is a small point: The semantic technology allows Google and some other folks to create a very interesting suite of databases. Good or bad? I will leave it to you to revel in this semantic fait accompli.

Stephen E Arnold, November 12, 2015

Another Semantic Search Play

November 6, 2015

The University of Washington has been search central for a number of years. Some interesting methods have emerged. From Jeff Dean to Alon Halevy, the UW crowd has been having an impact.

Now another search engine with ties to UW wants to make waves with a semantic search engine. Navigate to “Artificial-Intelligence Institute Launches Free Science Search Engine.” The wizard behind the system is Dr. Oren Etzioni. The money comes from Paul Allen, a co founder of Microsoft.

Dr. Etzioni has been tending vines in the search vineyard for many years. His semantic approach is described this way:

But a search engine unveiled on 2 November by the non-profit Allen Institute for Artificial Intelligence (AI2) in Seattle, Washington, is working towards providing something different for its users: an understanding of a paper’s content. “We’re trying to get deep into the papers and be fast and clean and usable,” says Oren Etzioni, chief executive officer of AI2.

Sound familiar: Understanding what a sci-tech paper means?

According to the write up:

Semantic Scholar offers a few innovative features, including picking out the most important keywords and phrases from the text without relying on an author or publisher to key them in. “It’s surprisingly difficult for a system to do this,” says Etzioni. The search engine uses similar ‘machine reading’ techniques to determine which papers are overviews of a topic. The system can also identify which of a paper’s cited references were truly influential, rather than being included incidentally for background or as a comparison.

Does anyone remember Gene Garfield? I did not think so. There is a nod to Expert System, an outfit which has been slogging semantic technology in an often baffling suite of software since 1989. (Yep, that works out to more than a quarter of a century.) Hey, few doubt that semantic hoohah has been a go to buzzword for decades.

There are references to the Microsoft specialist search and some general hand waving. The fact that different search systems must be used for different types of content should raise some questions about the “tuning” required to deliver what the vendor can describe as relevant results. Does anyone remember what Gene Garfield said when he accepted the lifetime achievement award in online? Right, did not think so. The gist was that citation analysis worked. Additional bells and whistles could be helpful. But humans referencing substantive sci-tech antecedents was a very useful indicator of the importance of a paper.

I interpreted Dr. Garfield’s comment as suggesting that semantics could add value if the computational time and costs could be constrained. But in an era of proliferating sci-tech publications, bells and whistles were like chrome trim on a 59 Oldsmobile 98. Lots of flash. Little substance.

My view is that Paul Allen dabbled in semantics with Evri. How did that work out? Ask someone from the Washington Post who was involved with the system.

Worth testing the system in comparative searches against commercial databases like Compendex, ChemAbs, and similar high value commercial databases.

Stephen E Arnold, November 5, 2015

Data Lake and Semantics: Swimming in Waste Water?

November 6, 2015

I read a darned fascinating write up called “Use Semantics to Keep Your Data Lake Clear.” There is a touch of fantasy in the idea of importing heterogeneous “data” into a giant data lake. The result is, in my experience, more like waste water in a pre-treatment plant in Saranda, Albania. Trust me. Distasteful.

Looks really nice, right?

The write up invokes a mid tier consultant and then tosses in the fuzzy term “governance.” We are now on semi solid ground, right? I do like the image of a data swamp, which contrasts nicely with the images from On Golden Pond.

I noted this passage:

Using a semantic data model, you represent the meaning of a data string as binary objects – typically in triplicates made up of two objects and an action. For example, to describe a dog that is playing with a ball, your objects are DOG and BALL, and their relationship is PLAY. In order for the data tool to understand what is happening between these three bits of information, the data model is organized in a linear fashion, with the active object first – in this case, DOG. If the data were structured as BALL, DOG, and PLAY, the assumption would be that the ball was playing with the dog. This simple structure can express very complex ideas and makes it easy to organize information in a data lake and then integrate additional large data stores.
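The dog and ball example in the passage can be sketched in a few lines. This is a minimal illustration, not the method of any particular data lake product. It assumes the conventional subject, predicate, object ordering for a triple, and the names `Triple`, `describe`, and `who_does` are hypothetical.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # the active object, listed first
    predicate: str  # the relationship or action
    obj: str        # the object acted upon


def describe(t: Triple) -> str:
    """Render a triple as a readable arrow expression."""
    return f"{t.subject} --{t.predicate}--> {t.obj}"


def who_does(store, predicate, obj):
    """Return the sorted subjects that perform `predicate` on `obj`."""
    return sorted(
        t.subject for t in store if t.predicate == predicate and t.obj == obj
    )


dog_plays = Triple("DOG", "PLAY", "BALL")   # the dog plays with the ball
ball_plays = Triple("BALL", "PLAY", "DOG")  # same words, opposite meaning

store = {dog_plays}

print(describe(dog_plays))
print(dog_plays == ball_plays)           # False: position carries the meaning
print(who_does(store, "PLAY", "BALL"))
```

The point of the sketch is that the three values alone say nothing; the meaning comes from which slot each value occupies.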


Next I circled:

A semantic data lake is incredibly agile. The architecture quickly adapts to changing business needs, as well as to the frequent addition of new and continually changing data sets. No schemas, lengthy data preparation, or curating is required before analytics work can begin. Data is ingested once and is then usable by any and all analytic applications. Best of all, analysis isn’t impeded by the limitations of pre-selected data sets or pre-formulated questions, which frees users to follow the data trail wherever it may lead them.

Yep, makes perfect sense. But there is one tiny problem. Garbage in, garbage out. Not even modern jargon can solve this decades old computer challenge.

Fantasy is much better than reality.

Stephen E Arnold, November 6, 2015

Whitepaper: Plan for Holiday Sales Now

October 16, 2015

Marketing pros and retailers take note: semantic tech firm ntent offers a free whitepaper to help you make the most of the upcoming holiday season, titled “Step-By-Step Guide to Holiday Campaign Planning.” All they want in return are your web address, contact info, and the chance to offer you a subscription to their newsletter, blog, and updates. (That checkbox is kindly deselected by default.) The whitepaper’s description states:

“Halloween candy and costumes are already overflowing on retail stores shelves. You know what that means, don’t you? It’s time for savvy marketers to get serious about their online retail planning for the impending holidays, if they haven’t already started. Why is it so important to take the time to coordinate a solid holiday campaign? Because according to the National Retail Federation [PDF] the holiday season can account for more than 20–40% of a retailer’s annual sales. And if that alone isn’t enough to motivate you, Internet Retailer reported that online retail sales this year are predicted to reach $349.06 billion a 14.2% YoY increase—start planning now to get your piece of the pie! Position your business for online success, more sales and more joy as you head into 2016 using these easy-to-follow, actionable tips!”

The paper includes descriptions of tactics and best practices, as well as a monthly to-do list and a planning worksheet. Founded in 2010, ntent leverages their unique semantic search technology to help clients quickly find the information they need. The company currently has several positions open at their Carlsbad, California, office.

Cynthia Murrell, October 16, 2015
