Semantic Search Becomes Search Engine Optimization: That Is Going to Improve Relevance

March 27, 2015

I read “The Rapid Evolution of Semantic Search.” It must be my age or the fact that it is cold in Harrod’s Creek, Kentucky, this morning. The write up purports to deliver “an overview of the history of semantic search and what this means for marketers moving forward.” I like that moving forward stuff. It reminds me of Project Runway’s “fashion forward.”

The write up includes a wonky graphic that equates via an arrow Big Data and metadata, volume, smart content, petabytes, data analysis, vast, structured, and framework. Big Data is a cloud with five little arrows pointing down. Does this mean Big Data is pouring from the sky like yesterday’s chilling rain?

The history of the Semantic Web begins in 1998. Let’s see, that is 17 years ago. The milestone, in the context of the article, is the report “Semantic Web Road Map.” I learned that Google was less than a month old. I thought that Google was Backrub and that the work on what was named Google began a couple, maybe three, years earlier. Who cares?

The Big Idea is that the Web is an information space. That sounds good.

Well, in 2012, something Big happened. According to the write up, Google figured out that 20 percent of its searches were “new.” Aren’t those pesky humans annoying? The article reports:

long tail keywords made up approximately 70 percent of all searches. What this told Google was that users were becoming interested in using their search engine as a tool for answering questions and solving problems, not just looking up facts and finding individual websites. Instead of typing “Los Angeles weather,” people started searching “Los Angeles hourly weather for March 1.” While that’s an extremely simplified explanation, the fact is that Google, Bing, Facebook, and other internet leaders have been working on what Colin Jeavons calls “the silent semantic revolution” for years now. Bing launched Satori, a knowledge storehouse that’s capable of understanding complex relationships between people, things, and entities. Facebook built Knowledge Graph, which reveals additional information about things you search, based on Google’s complex semantic algorithm called Hummingbird.

Yep, a new age dawned. The message in the article is that marketers have a great new opportunity to push their message in front of users. In my book, this is one reason why running a query on any of the ad supported Web search engines returns so much irrelevant information. In my just submitted Information Today column, I report how a query for the phrase “concept searching” returned results littered with a vendor’s marketing hoo-hah.

I did not want information about a vendor. I wanted information about a concept. But, alas, Google knows what I want. I don’t know what I want in the brave new world of search. The article ignores the lack of relevance in results, the dust binning of precision and recall, and the bogus information many search queries generate. Try to find current information about Dark Web onion sites and let me know how helpful the search systems are. In fact, name the top TOR search engines. See how far you get with Bing, Google, and Yandex. (DuckDuckGo and Ixquick seem to be aware of TOR content, by the way.)

So semantic in the context of this article boils down to four points:

  1. Think like an end user. I suppose one should not try to locate an explanation of “concept searching.” I guess Google knows I care about a company with a quite narrow set of technology focused on SharePoint.
  2. Invest in semantic markup. Okay, that will make sense to the content marketers. What if the system used to generate the content does not support the nifty features of the Semantic Web. OWL, who? RDF what?
  3. Do social. Okay, that’s useful. Facebook and Twitter are the go to systems for marketing products I assume. Who on Facebook cares about cyber OSINT or GE’s cratering petrochemical business?
  4. And the keeper, “Don’t forget about standard techniques.” This means search engine optimization. That SEO stuff is designed to make relevance irrelevant. Great idea.

Net net: The write up underscores some of the issues associated with generating buzz for a small business like the ones INC Magazine tries to serve. With write ups like this one about Semantic Search, INC may be confusing their core constituency. Can confused executives close deals and make sense of INC articles? I assume so. I know I cannot.

Stephen E Arnold, March 27, 2015

Partnership Between Twitter and IBM Showing Results

March 27, 2015

The article on TechWorld titled IBM Boosts BlueMix and Watson Analytics with Twitter Integration investigates the fruits of the partnership between IBM and Twitter, which began in 2014. IBM Bluemix now has Twitter as one of the services available in the cloud based developer environment. Watson Analytics will also be integrated with Twitter for the creation of visualizations. Developers will be able to grab data from Twitter for better insights into patterns and relationships.

“The Twitter data is available as part of that service so if I wanted to, for example, understand the relationship between a hashtag on pizza, burgers or tofu, I can go into the service, enter the hashtag and specify a date range,” said Rennie. “We [IBM] go out, gather information and essentially calculate what is the sentiment against those tags, what is the split by location, by gender, by retweets, and put it into a format whereby you can immediately do visualisation.”

From the beginning of the partnership, Twitter gave IBM access to its data and the go-ahead to use Twitter with the cloud based developer tools. Watson looks like a catch all for data, and the CMO of Brandwatch Will McInnes suggests that Twitter is only the beginning. The potential of data from social media is a vast and constantly rearranging field.

Chelsea Kerwin, March 27, 2015

Stephen E Arnold, Publisher of CyberOSINT at

IBM Methods Are Alive and Well: Google Goes with Lock In for Search in Africa

March 25, 2015

I assume the information in “Facebook and Google Are Locking In African Customers with Freebie Deals” is accurate. Let me be upfront. I don’t worry too much about Facebook. Folks using that service make a decision to post information, build friend lists, and do other social functions.

Search is different. A person enters a query and assumes, maybe believes, that the results are objective, accurate, and related to the query itself. I am not sure this utopia exists or that most users, even with graduate degrees, can figure out the difference between information, disinformation, misinformation, or reformation of information. I know it takes considerable work. To see the depth of the problem, run a query for the seemingly innocuous phrase “concept searching.” Check out the results. Nifty, eh?

The article states:

Both companies [Facebook and Google] are rolling out programs in some African countries that give people free internet access—but the complimentary access is contingent on people using their services. Their large-scale world-connectivity projects are tailored to ensure that Facebook and Google become the go-to on-ramps for accessing Internet.

Is the objective market control for the purpose of generating revenue?

The story explains:

It’s hard for companies to compete with Facebook and Google in the US; in Africa, where these tech giants will have a huge leg up on local competitors, it will be even harder. By establishing themselves as home bases for the internet, Facebook and Google are elbowing control over the online experiences of a continent away from would-be domestic entrepreneurs and local startups.

Perhaps Facebook and Google will merge, sort of a Kraft and Heinz type deal. That will provide even more freebies to the markets in Africa, right? Is this article getting close to explaining how a Belgium-type deal was such a plus for the Congo? Absolutely not. The parallels are specious. Neither Facebook nor Google is a monarchy. Neither Facebook nor Google is interested in natural resources? Neither Facebook nor Google wishes to prevent others from serving the markets in Africa.

This is just great marketing for those with a dog in the fight, especially advertisers.

Stephen E Arnold, March 25, 2015

Elasticsearch Becomes Elastic, Acquires Found

March 25, 2015

The article titled Elasticsearch Changes Its Name, Enjoys An Amazing Open Source Ride and Hopes to Avoid Mistakes explains the latest acquisition and the reasons behind the name change to simply Elastic. The choice apparently stems from Elastic’s wish to avoid confusion between the open source product Elasticsearch and the company itself. It also signals the company’s movement beyond solely providing search technology. The article also discusses the acquisition of Found, a Norwegian company:

“Found provides hosted and fully-managed Elasticsearch clusters with technology that automates processes such as installation, configuration, maintenance, backup, and high-availability. Doing all of this heavy-lifting enables developers to integrate a search engine into their database, website or app quickly. In addition, Found has created a turnkey process to scale Elasticsearch clusters up or down at any time and without any downtime. Found’s Elasticsearch as a Service offering is being used by companies like Docker, Gild… and the New York Public Library.”

Elasticsearch has raised almost $105 million since its start after being created by Shay Banon in 2010. The article posits that they have been doing the right things so far, such as the acquisition of Kibana, the visualization vendor. Although some startups relying on Elasticsearch may throw shade at the Found acquisition, there are no foreseeable threats to Elastic’s future.

Chelsea Kerwin, March 25, 2015

The Ins and Outs of the Black Market Economy

March 24, 2015

The article titled The Cybercrime Economy: Welcome To The Black Market of The Internet on ZeroFox discusses the current state of the black market and the consequences of its success. The author delves into the economy of the black market, suggesting that it, too, is at the mercy of supply and demand. Some of the players in the structure of the black market include malware brokers, botnet “herders,” and monetization specialists. The article says,

“So what’s the big deal — how does this underground economy influence the economy we see day to day? The financial markets themselves are highly sensitive to the impact of cyber crime… Additionally, fluctuating bitcoin markets (which affects forex trades) and verticals that can be affected through social engineering (the Fin4 example) are both targets for exploitation on a mass scale….There is a good reason cyber security spending surpassed 70 billion in 2014: breaches are costly. Very costly.”

As for how to upset the economy of the black market, the article posits that “cutting off the head” will not work. Supply and demand keep the black market running, not some figurehead. Instead, the article suggests that the real blame lies on the monopolies that drive up prices and force consumers to look for illegal options.

Chelsea Kerwin, March 24, 2015

Data and Marketing Come Together for a Story

March 23, 2015

An article on the Marketing Experiments Blog titled Digital Analytics: How To Use Data To Tell Your Marketing Story explains the primacy of the story in the world of data. The conveyance of the story, the article claims, should be a collaboration between the marketer and the analyst, with both players working together to create an engaging and data-supported story. The article suggests breaking this story into several parts, similar to the plot points you might study in a creative writing class: exposition, rising action, climax, denouement, and resolution. The article states,

“Nate [Silver] maintained throughout his speech that marketers need to be able to tell a story with data or it is useless. In order to use your data properly, you must know what the narrative should be…I see data reporting and interpretation as an art, very similar to storytelling. However, data analysts are too often siloed. We have to understand that no one writes in a bubble, and marketing teams should understand the value and perspective data can bring to a story.”

Silver, Founder and Editor in Chief of is also quoted in the article from his talk at the Adobe Summit Digital Marketing Conference. He said, “Just because you can’t measure it, doesn’t mean it’s not important.” This is the back to the basics approach that companies need to consider.

Chelsea Kerwin, March 23, 2015

IDC and Forrester: New Partnership, New Confusion among Mid-Tier Consultants?

March 17, 2015

I received this interesting email this morning (March 16, 2015).


Notice the logo. The email is from IDG Connect based in Staines, Middlesex, UK. Now look at this headline:

Acquia Identified as a “Strong Performer” in The Forrester Wave™: Web Content Management Systems, Q1 2015

When I clicked on the hyperlink in the title of the email, look what I found:


This is a list of the promotions IDC seems to have done for its competitors. I thought that “real” consultants did not cross over into the pastures of other consulting firms. Obviously I am incorrect in this assumption.

That leaves me with the hypothesis that IDC is promoting Forrester’s “wave”—the me-too to Gartner’s Magic Boston Consulting Group Variant Quadrant without the Analytics—for content management.

Whoa, Nellie. I thought that IDC was one outfit, happily placed in Boston, America’s first city. Emerson, witch burning, Route 128, and the Big Dig. The marketing arm of IDG for this email comes from merrie old England. Is this content marketing and information shaping at a fairly interlocking level? What else do these mid tier consulting firms share? Client lists? Client problems? Content used without the permission of people like me who write stuff and then have it repurposed under a so-called expert’s name? (Yep, Dave Schubmehl again.)

Here’s what the email to me said:

We are pleased to announce that Acquia is a “Strong Performer” in The Forrester Wave™: Web Content Management Systems, Q1 2015. “Acquia’s standout features include the cloud strategy, solid content management and delivery functionality, and a strong developer community and component ecosystem.” Like Forrester, we believe that digital experience delivery is the strategic technology investment for every brand. Ready to see the results? Download your complimentary copy today.

As you may know, IDC’s Dave Schubmehl (remember him, the search wizard) sold my content on Amazon. Now it appears that I can get a free report via the IDC email for the ordinarily “real money” research from Forrester. Confused yet?

When I clicked on the Click Here button, I received a copy of Ted Schadler’s “The Forrester Wave: Web Content Management Systems, Q1 2015” report. Here’s the link. Give it a whirl, but no promises:

Who’s Forrester’s Ted Schadler? I have his photo.


A quick check online reveals that Ted Schadler has a Forrester blog called “Ted Schadler’s Blog.” He covers quite a few topics; for example, broadband, Web content management, Internet regulation, free Web publishing systems, and experience gaps. He is a bit of a Leonardo it seems, and he has some marketing in his DNA too.

Intrigued, I ran a query for his name and IDG/IDC. Mr. Schadler has been a speaker at IDG’s tony CIO conference. Mr. Schadler’s topic was described this way:

Collaboration across the C-suite and the challenge of transitioning to a digital enterprise will have a special focus at this year’s event, which also features opening keynotes by Tom Davenport, author of “big data @ work,” and Forrester Research’s Ted Schadler, author of “The Mobile Mind Shift.” The symposium concludes with an awards ceremony recognizing the 2014 CIO 100 winners.

I will try to keep track of which mid tier (what I call an azure chip) consulting firm is promoting others of the same ilk.

If I were paying for one consulting firm, would I check to make sure my firm’s private information does not leak? I sure would.

Stephen E Arnold, March 17, 2015

HP Autonomy Blog: Everything but Autonomy Technology

March 10, 2015

I have been checking out the search and content processing vendors who have gone quiet. In my lingo, “quiet” means the company outputs little or no news in the form of blog posts, news releases, slide decks on Slideshare, etc.

One of the most aggressive and effective marketing outfits in search and content processing was Autonomy. Since the HP deal, the majority of the Autonomy related news concerns the litigation between HP and Autonomy about HP’s purchase of Autonomy.

I checked links to the Autonomy blog and clicked on the link at the top of this page:


The link is dead if this message is correct:


I then navigated to the GOOG and ran the query “Autonomy blog.” The first link pointed me to this page:


The only hurdle I encountered in my fly over was that the information is not “about” Autonomy, IDOL, or the Digital Reasoning Engine.

Perhaps I am overlooking HP’s brilliant marketing, but it seems to me that HP is not making much of an effort to take a page from Autonomy’s marketing plan book. That might be a mistake in some niches.

When a company goes quiet, I interpret the behavior as a signal about management resolve, financial resources, or having something substantive to communicate. Call me old fashioned, but I like a stream of information about sales, enhancements, bug fixes, and other artifacts of a growing company.

Stephen E Arnold, March 10, 2015

Smartlogic and SmartLogic: Brand Clash

February 26, 2015

I am a simple person, gliding slowly into the assisted living facility. I know I cannot keep up with the management wizards in the search and content processing sectors. (I do bristle when “experts” address parental instructions toward me in their LinkedIn posts.)

I ran a query for SmartLogic. The Smartlogic I know a bit about is an outfit that performs automated indexing. The company’s hook is “the content intelligence company.” The idea is that if a document is indexed, then the content becomes smarter. This is a claim I have heard repeated from prescient thinkers like Dr. Ron Sacks Davis, the person making possible the TeraText system. Dr. Sacks Davis floated this idea in 1975. Down the line, CALS and then SGML advocates pitched the advantages of tagging structural elements, stuffing the components and the tags into a database, and discovering the joys of scripted content slicing and dicing. In the modern era, many companies, including Smartlogic, have dusted off the intelligent content moniker as a way to generate interest in automated indexing, the joys of taxonomies, and slipping a data management system into a company under the cover of metadata. LinkedIn experts are thrashing about this Trojan Horse maneuver as I write this blog item.

Run a query for Smartlogic, however, and one sees that there are two Smartlogics. One uses a lower case “l” in its spelling; the other, an upper case “L.” When I run the query for “smartlogic” on Google, this is what the GOOG displays:


Yep, two Smartlogics. One has a dot com domain and the other a dot io domain. The big “L” outfit is doing a much better job of getting its brand into the various electronic media. When I run the query “smartlogic Baltimore”, the heavens open and rain links to helping other companies, writing software, and making the Baltimore business scene vibrant.

Here’s the newer (upper case “L”) site. Pretty snappy design, I would suggest.


About one year ago, the content intelligence flavor of Smartlogic was the Big Dog in the Google index. Today, not so much. Here’s what the indexing Smartlogic’s Web site (lower case “l”) looks like on February 25, 2015:


Understated in comparison to the upper case “L” outfit I perceive.

Questions I formulated are:

  • How has Smartlogic marketing squandered its grip on the name “smartlogic”?
  • How is the company dominating social media?
  • What happens to Web site traffic and over the transom questions from potential customers who want indexing and end up looking at a services firm in Baltimore?

When search vendors lose control of their brand, I often hear, as I did from Brainware before it was acquired by Lexmark, “You cannot provide links to videos via the keyword ‘brainware.’ The videos are inappropriate.” Mismanaging a company name is my fault?

Get real, Brainware.

I see this erosion when I search for Connotate, Thunderstone, and now Smartlogic and others. I track this via my public Overflight pages.

Fascinating insight into what content processing executives perceive as important.

Stephen E Arnold, February 26, 2015

Antidot Semi-Pivot to eCommerce Search

February 21, 2015

I wanted to capture Antidot’s semi pivot from enterprise search to eCommerce search. The French company provides a useful description of its afs@store product. If you bang this product name into the GOOG, you find that the American Foundry Society, Associated Food Stores, and the American Fisheries Society push Antidot’s product down the results list. In general, names of search and content processing systems often disappear into search results. Perhaps Antidot has a way to make the use of the “@” sign somewhat less problematic.

The system, according to Antidot, delivers features that sidestep the unsticky nature of most eCommerce customer visits. Antidot asserts:

  • Rich, tolerant and customizable auto complete featuring products, brands, categories…
  • Fully typo-tolerant search
  • Semantic search that understands your customer’s words
  • Dynamic filtering facets to rapidly select desired products
  • Web interface to simply monitor and manage your searchandising

The company offers a plug in for Magento, the open source eCommerce system that enjoyed love from eBay. It is difficult to know if that love is growing stronger with time, however.

I did notice that the “See and read more” panel had zero information and no links. Hopefully this void will be addressed.

Stephen E Arnold, February 21, 2015
