
SLI Systems Still Struggling

May 26, 2015

Early this year, we reported on the sudden personnel shift over at e-commerce search firm SLI Systems. Now, New Zealand’s National Business Review reports, “SLI Systems Says Second-Half Revenue Will Miss Expectations on Weaker American Sales.” It seems the staff shake-up led to disappointing sales, but the company is confident they will make up ground later this year, after the dust settles. They also cite a weak economy in Brazil as a limiting factor. Reporter Tina Morrison writes:

“Operating revenue will rise to $28 million in the year ending June 30, from $22 million a year earlier, the Christchurch-based company said in a statement. The forecast is lower than the $30.5 million expected by analysts in a Reuters poll. …

“The company is forgoing profits and dividends to fund growth in the expanding e-commerce market, particularly in the US, and says its software as a service is the second biggest after Oracle to provide online retailers with suggestive search engines. Analysts polled by Reuters before today’s announcement had expected the company’s annual loss to widen to $7 million this year, from $5.7 million last year. It expects to report its annual earnings in late August.”

Founded in 2001, SLI Systems now powers e-commerce on over 800 websites. The company is based in Christchurch, New Zealand, and maintains offices in San Jose, California; London; Melbourne; and Tokyo. Anyone who thinks they can help the company bounce back should note that (as of this writing) SLI is looking for new Sales Directors in Melbourne and San Jose.

Cynthia Murrell, May 26, 2015

Sponsored by, publisher of the CyberOSINT monograph

IBM Watson: 75 Industries Have Watson Apps. What about Revenue from Watson?

May 25, 2015

Was it just four years ago? How PR time flies. I read “Boyhood.” Now here’s the subtitle, which is definitely Google-licious:

Watson was just 4 years old when it beat the best human contestants on Jeopardy! As it grows up and goes out into the world, the question becomes: How afraid of it should we be?

I am not too afraid. If I were the president of IBM, I would be fearful. Watson was supposed to be well on its way north of $1 billion in revenue. If I were one of the top wizards responsible for Watson, I would be trepidatious. If I were a stakeholder in IBM, I would be terrified.

But Watson does not frighten me. Watson, in case you do not know, is built from:

  1. Open source search
  2. Acquired companies’ technology
  3. Home brew scripts
  4. IBM big iron

The mix is held together with massive hyperbole-infused marketing.

The problem is that the revenue is just not moving the needle for the Big Blue bean counters. Please, recall that IBM has reported dismal financial results for three years. IBM is buying back its stock. IBM is selling its assets. IBM is looking at the exhaust pipes of outfits like Amazon. IBM is in a pickle.

The write up ignores what I think are important factoids about IBM. The article asserts:

The machine began as the product of a long-shot corporate stunt, in which IBM engineers set out to build an artificial intelligence that could beat the greatest human champions at Jeopardy!, one that could master language’s subtleties: rhymes, allusions, puns….It has folded so seamlessly into the world that, according to IBM, the Watson program has been applied in 75 industries in 17 countries, and tens of thousands of people are using its applications in their own work. [Emphasis added]

How could I be skeptical? Molecular biology. A cook book. Jeopardy.

Now for some history:

Language is the “holy grail,” he said, “the reflection of how we think about the world.” He tapped his head. “It’s the path into here.”

And then the epiphany:

Watson was becoming something strange, and new — an expert that was only beginning to understand. One day, a young Watson engineer named Mike Barborak and his colleagues wrote something close to the simplest rule that he could imagine, which, translated from code to English, roughly meant: Things are related to things. They intended the rule as an instigation, an instruction to begin making a chain of inferences, each idea leaping to the next. Barborak presented a medical scenario, a few sentences from a patient note that described an older woman entering the doctor’s office with a tremor. He ran the program — things are related to things — and let Watson roam. In many ways, Watson’s truest expression is a graph, a concept map of clusters and connective lines that showed the leaps it was making. Barborak began to study its clusters — hundreds, maybe thousands of ideas that Watson had explored, many of them strange or obscure. “Just no way that a person would ever manually do those searches,” Barborak said. The inferences led it to a dense node that, when Barborak examined it, concerned a part of the brain…that becomes degraded by Parkinson’s disease. “Pretty amazing,” Barborak said. Watson didn’t really understand the woman’s suffering. But even so, it had done exactly what a doctor would do — pinpointed the relevant parts of the clinical report, discerned the disease, identified the biological cause. To make these leaps, all you needed was to read like a machine: voraciously and perfectly.

I have to take a break. My heart is racing. How could this marvel of technology be used to save lives, improve the output of Burger King, and become the all time big winner on The Price Is Right?

Now let’s give IBM a pat on the back for getting this 6,000 word write up in a magazine consumed by those who love the Big Apple without the New Yorker’s copy editors poking their human noses into reportage.

From my point of view, Watson needs to deliver:

  1. Sustainable revenue
  2. Proof that the system can be affordable
  3. Operation that does not require human intermediaries to babysit the system
  4. Content processing so that real time outputs are usable by those needing “now” insights
  5. Outputs free of egregious errors which cause a human using Watson to spend time shaping or figuring out if the outputs are going to deliver what the user requires; for example, a cancer treatment regimen which helps the patient or a burrito a human can enjoy.

Hewlett Packard and IBM have managed to get themselves into the “search and content processing” bottle. It sure seems as if better information outputs will lead to billions in revenue. Unfortunately the reality is that getting big bucks from search and content processing is very difficult to do. For verification, just run a query on Google News with these terms: Hewlett Packard Autonomy.

The search and content processing sector is a utility function. There are applications which can generate substantial revenue. And it is true that these vendors include search as a utility function.

But pitching smart software spitballs works when one is not being watched by stakeholders. Under scrutiny, the approach does not have much of a chance. Don’t believe me? Take your entire life savings and buy IBM stock. Let me know how that works out.

Stephen E Arnold, May 25, 2015

Keeping Track of Rockets

May 23, 2015

I have an Overflight for AeroText, which is located at 77 Fourth Avenue, in Waltham, Massachusetts. The company offers a search system. I noted that Rocket Software is located at 77 4th Avenue in Waltham, Massachusetts. Ergo: AeroText is now Rocket Software.

What makes this interesting to me is that Overflight snagged a number of references to a software component causing some consternation. I ran a query for “rocket search” on the GOOG and noted these results:


What jumps out is that there are no links to the Waltham-based outfit and there are quite a few links to information about removing what one outfit (ScarebearSoftware) called a virus. The software in question is “Rocket Search.”

My point is that vendors of search and content processing software have to name their products so that individuals interested in legitimate content processing systems can actually find the company.

In the past, I have commented about Brainware being usurped by an outfit keen to pump YouTube videos out with corresponding erosion of the Brainware “brand.” Brainware is now part of Lexmark, and I don’t think too many folks remember Brainware, trigrams, and the convoluted history of the company. Thunderstone in Cleveland has suffered a similar fate. Thunderstone is for all intents and purposes now associated with games, not search. And there are other examples.

The most recent instance of a vendor losing control of a brand was, until now, Smartlogic. An outfit in Baltimore has encroached on the conceptual real estate and Smartlogic’s Semaphore product name is now lovingly gazed upon by a German outfit with a variant of Smartlogic’s product moniker Semafora at

Now Rocket Software, a company eager to become a mover and shaker in search, faces the malware and virus association.

How does one remediate this problem? First, vendors have to pay attention to the name itself. Second, search vendors have to protect their “semantic real estate.” Third, search vendors have to communicate meaningful, high value information.

Ignoring these suggestions leads to brand erosion. Who can license a product if it cannot be found in Bing, Google, or Yandex? Augmentext can help remediate this type of problem, but it is easier and cheaper to head off invisibility and confusion before they gallop through the indexes churning up semantic mud.

I assume it is difficult to see a path forward when there is spatter on one’s eye glasses.

Stephen E Arnold, May 23, 2015

Decrease from TrueView

May 21, 2015

The article on Business Insider titled Google Has a New and Unexpected Explanation for Its Falling Ad Rates places the blame on YouTube’s “TrueView” video ads. For some time there has been concern over Google’s falling cost-per-click (CPC) revenue, the cash earned each time a user clicks on an ad. In the first quarter of this year, CPC was down 7%. The article quotes outgoing Google CFO Patrick Pichette on the real reason for these numbers. He states,

“TrueView ads currently monetize at a lower rate than ad clicks on  As you know, video ads generally reach people earlier in the purchase funnel, and so across the industry, they tend to have a different pricing profile than that of search ads,” Pichette explained. “Excluding the impact of YouTube TrueView ads, growth in Sites clicks would be lower, but still positive and CPCs would be healthy and growing Y/Y,” Pichette continued.

It is often thought that the increasing dependence on mobile internet access through smartphones is the reason for falling CPC. Google can’t charge as much for mobile ads as for PC ads, making it a logical leap that this is the area of concern. Pichette offers a different view, and one with an entirely positive spin.

Chelsea Kerwin, May 21, 2015

Stephen E Arnold, Publisher of CyberOSINT at


Glass: A Family

May 18, 2015

Forget YouTube search, which is allegedly going to get better real soon.

Navigate to “Google Glass Tipped to Become Product Family.” I learned:

We also get some clues from the new Glass team description, which reads: “The Google Glass division is a world-class team focused on the cutting edge of hardware, software, and industrial design.” It continues: “It is charged with pioneering, developing, building, and launching smart eyewear and other related products in line with Google’s ambitious and visionary objectives.”

If I have an Apple Watch and ride around in an autonomous auto or snag an Uber ride, tell me again why I need to wear another gadget over my trifocals. How will that work? I suppose I can wear the new design instead of my trifocals. Wait! Then I would not be able to see my Apple Watch or the Uber car. I am excited.

Stephen Arnold, May 18, 2015

Connotate Reveals There Are One Billion Web Sites

May 16, 2015

I did not know there were one billion Web sites. Here’s the Web page on Connotate’s Web site which puts me in the know:



The figure has been bandied about by Internet Live Stats, Business Insider, and the Daily Mail. This number was hit in late 2014 and confirmed by “the inventor of the Internet.” I noted that no one asked Google, an outfit which has a reasonable log file of its crawling activities. Doesn’t Google “know” a number? If the GOOG does, it is not talking or maybe the company is not returning phone calls from people asking, “How many Web sites make up the Internet?”

I navigated to Internet Live Stats on May 16, 2015, and noted this item of information:


I don’t want to rain on the parade, but the number is 900 million and apparently growing. Apparently the “number” can vary. Internet Live Stats says:

We do expect, however, to exceed 1 billion websites again sometime in 2015 and to stabilize the count above this historic milestone in 2016.

So what? Frankly the one billion number is irrelevant to me. What is relevant is that a company is using what I suppose is a sketchy number as a way to capture business. That is a good example of the marketing used by search and content processing vendors.

I know that generating organic, sustainable revenue from search and content processing, information access, and indexing software is very difficult.

The number of Web sites does not mean much, if anything. In an interview with BrightPlanet, I learned that savvy customers narrow the focus of their content acquisition and analysis. Less, it seems to me, may be more. Also, Darpa’s MEMEX project is designed to figure out the width, depth, and breadth of the Dark Net. Is it larger or smaller than the Clear Net?

I prefer value propositions and “marketing hooks” that do not equate size with importance or trigger the fear of not knowing what’s out there. But if it works, it is definitely okay in today’s pressurized sales environment.

There are a billion crazy search and content marketing assertions. Wait. Make that two billion.

Stephen E Arnold, May 16, 2015

HP Idol and Hadoop: Search, Analytics, and Big Data for You

May 16, 2015

I was clicking through links related to Autonomy IDOL. One of the links which I noted was to a YouTube video labeled “HP IDOL for Hadoop: Create a Smarter Data Lake.” Hadoop has become a synonym for making sense of Big Data. I am not sure what Big Data are, but I assume I will know when my eight gigabyte USB key cannot accept another file. Big Data? Doesn’t it depend on one’s point of view?

What is fascinating about the HP Idol video is that it carries a posting date of October 2014, which is in the period when HP was ramping up its anti-Autonomy legal activities. The video, I assumed before watching, would break from the Autonomy marketing assertions and move in a bold, new direction.

The video contained some remarkable assertions. Please, watch the video yourself because I may have missed some howlers as I was chuckling and writing on my old school notepad with a decidedly old fashioned pencil. Hey, these tools work, which is more than I can say for some of the software we examined last week.

Here’s what I noted with the accompanying screenshot so you can locate the frame in the YouTube video to double check my observation with the reality of the video.

First, there is the statement that in an organization 88 percent of its information is “unanalyzed.” The source is a 2012 study from Forrsights Strategy Spotlight: Business Intelligence and Big Data. Forrester, another mid tier consulting firm, produces these reports for its customers. Okay, the research is a couple of years old. Maybe it is valid? Maybe not? My thought was that HP may be a company which did not examine the data to which it had access about Autonomy before it wrote a check for billions of dollars. I assume HP has rectified any glitch along this line. HP’s litigation with Autonomy and the billions in write down for the deal underscore the problem with unanalyzed data. Alas, no reference was made to this case example in the HP video.

Second, Hadoop, a variant of Google’s MapReduce technology, is presented as a way to reap the benefits of cost efficiency and scalability. These are generally desirable attributes of Hadoop and other data management systems. The hitch, in my opinion, is that it is a collection of projects. These have been developed via the open source / commercial model. Hadoop works well for certain types of problems. Extract, transform, and load works reasonably well once the Hadoop installation is set up, properly resourced, and the Java code debugged so it works. Hadoop requires some degree of technical sophistication; otherwise, the system can be slow, stuffed with duplicates, and a bit like a Rube Goldberg machine. But the Hadoop references in the video are not a demonstration. I noted this “explanation.”


Third, HP jumps from the Hadoop segment to “what if” questions. I liked the “democratize Big Data” because “Big Data Changes everything.” Okay, but the solution is Idol for Hadoop. The HP approach is to create a “smarter data lake.” Hmmm. Hadoop to Idol to data lake for the purpose of advanced analytics, machine learning functions, and enterprise level security. That sounds quite a bit like Autonomy’s value proposition before it was purchased from Dr. Lynch and company. In fact, Autonomy’s connectors permitted the system to ingest disparate types of data as I recall.

Fourth, the next logical discontinuity is the shift from Hadoop to something called “contextual search.” A Gartner report is presented which states with Douglas MacArthur-like confidence:

HP Idol. A leader in the 2014 Gartner Magic Quadrant for Contextual Search.

What the heck is contextual search in a Hadoop system accessed by Autonomy Idol? The answer is SEARCH. Yep, a concept that has been difficult to implement for 20, maybe 30 years. Search is so difficult to sell that Dr. Lynch generated revenues by acquiring companies and applying his neuro-linguistic methods to these firms’ software. I learned:

The sophistication and extensibility of HP Autonomy’s Intelligent Data Operating Layer (Idol) offering enable it to tackle the most demanding use cases, such as fraud detection and search within large video libraries and feeds.

Yo, video. I thought Autonomy acquired video centric companies and the video content resided within specialized storage systems using quite specific indexing and information access features. Has HP cracked the problem of storing video in Hadoop so that a licensee can perform fraud detection and search within video libraries? My experience with large video libraries is that certain video like surveillance footage is pretty tough to process with accuracy. Humans, even academic trainees, can be placed in front of a video monitor and told, “Watch this stream. Note anomalies.” Not exciting but necessary because processing large volumes of video remains what I would describe as “a bit of a challenge, grasshopper.” Why is Google adding wild and crazy banners, overlays, and required metadata inputs? Maybe because automated processing and magical deep linking are out of reach? HP appears to have improved or overhauled Autonomy’s video analysis functions, and the Gartner analyst is reporting a major technical leap forward. Identifying a muzzle flash is different from recognizing a face in a flow of subway patrons captured on a surveillance camera, is it not?


I have heard some pre HP Autonomy sales pitches, but I can’t recall hearing that Idol can crunch flows of video content unless one uses the quite specialized system Autonomy acquired. Well, I have been wrong before, and I am certainly not qualified to be an analyst like the ones Gartner relies upon. I learned that HP Idol has a comprehensive list of data connectors. I think I would use the word “library,” but why niggle?

Fifth, the video jumps to a presentation of a “content hub.” The idea is that HP Idol provides visual programming tools. I assume an HP Idol customer will point and click to create queries. The queries will deliver outputs from the Hadoop data management system and the content which embodies the data lake. The user can also run a query and see a list of documents, but scanning a results list strikes me as exactly what many users no longer want to do to locate information. One can search effectively when one knows what one is looking for and that the needed information is actually in the index. The use case appears to be health care and the video concludes with a reminder that one can perform advanced analytics. There is a different point of view available in this ParAccel white paper.

I understand the strengths and weaknesses of videos. I have been doing some home brew videos since I retired. But HP is presenting assertions about Autonomy’s technology which seem to be out of step with my understanding of what Idol, the digital reasoning engine, and Autonomy’s acquired video technology can do.

The point is that HP seems to be out marketing Autonomy’s marketing. The assertions and logical leaps in the HP Idol Hadoop video stretch the boundaries of my credulity. I find this interesting because HP is alleging that Autonomy used similar verbal polishing to convince HP to write a billion dollar check for a search vendor which had grown via acquisitions over a period of 15 years.

Stephen E Arnold, May 16, 2015

MBAs Gone Wild: Is There a Video?

May 13, 2015

I find Harvard fascinating. As a poor student in a small, ignored, also-ran university, I bumped into the fancy university folks at debate tournaments. Our record against many fine institutions, such as Dartmouth and the other Big Dogs, was okay.

We won. Lots.

Then there were the Big Dog MBAs at Booz, Allen & Hamilton. Sigh. Wall Street forced more of these “special ones” onto my radar screen. Delightful. There was an MBAism for every issue. Maybe not a correction or appropriate solution, but there was jargon, generalizations, and entitlement thinking. In meeting after meeting, I reviewed nifty, but often meaningless or incoherent, graphs or diagrams. I did not count this work as “quality time.” Today, as the beloved and now departed Yogi Berra said, “It’s déjà vu all over again.”

I read “Where the Digital Economy Is Moving the Fastest” and circled a remarkable diagram. True, the write up is not about search, but the spirit of search and content processing system vendors tinted my perception, using only pleasing and compatible calming colors.

The authors developed criteria for countries which are moving fast. Not product sales or market share, countries. The world but for nation states like Yemen and its ilk. Then, like good MBAs, the authors crafted a matrix and plotted the fast movers, the losers, the ones to watch, and the maybes. Here’s the graphic:


There are “Watch Out” countries. I admit I interpreted this label in a manner different from the article’s sense of the phrase. I checked out the countries between Stall Out (dogs) and Stand Out (invest for sure maybe?). Poor Sweden, Britain, and Germany. Look at the countries on the move. Check out the tweeners: Brazil, Turkey, and the Russian Federation.

Now this map, the graphic, and the meaning of the “data” strike me as less than useful.

The diagram reminded me of other consulting firms’ matrices. I snagged this at random from Google.


What about this version from

Which makes more sense? The diagram from the Harvard Business Review, the diagram with lots of dots, or the dead simple diagram from Boston Consulting Group?

From my point of view, the BCG approach makes the most sense. BCG analyzed market share data, actual numbers. The diagram presents visual cues which relate directly to the numerical data. The viewer of the diagram does not have to wonder why a specific company is in one box or why a specific country is a “break out.”

Whether analyzing countries or companies, diagrams built on methods which do not tie directly to numbers raise more questions than they answer. When the BCG matrix became available in the 1970s, the reaction of the envious consultants at McKinsey and Booz, Allen & Hamilton was, “Wow, what a great diagram.”

What I believe has happened is that the value of the envy-generating BCG matrix was based on the data to which the quadrants were tied, because the focus was a specific product and its market share. If the share was increasing and lights were flashing green, then the product was a star.
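For readers who have not seen the original, here is a minimal sketch of how a data-anchored quadrant assignment might work. It assumes the textbook growth-share layout (relative market share on one axis, market growth on the other). The cutoff values, the `Product` class, and the `bcg_quadrant` function are hypothetical illustrations, not BCG’s published method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    """Hypothetical record for one product line."""
    name: str
    relative_market_share: float  # own share divided by the largest rival's share
    market_growth_pct: float      # annual growth of the market, in percent


def bcg_quadrant(product, share_cutoff=1.0, growth_cutoff=10.0):
    """Map a product to a growth-share quadrant.

    The cutoffs are assumptions chosen for illustration; an analyst
    would set them from the actual market data.
    """
    high_share = product.relative_market_share >= share_cutoff
    high_growth = product.market_growth_pct >= growth_cutoff
    if high_share and high_growth:
        return "star"
    if high_share:
        return "cash cow"
    if high_growth:
        return "question mark"
    return "dog"


if __name__ == "__main__":
    # Made-up numbers, purely to exercise the function.
    for item in (
        Product("Widget A", 1.8, 14.0),
        Product("Widget B", 2.1, 3.0),
        Product("Widget C", 0.4, 22.0),
        Product("Widget D", 0.3, 1.5),
    ):
        print(item.name, "->", bcg_quadrant(item))
```

The point of the sketch is that each label traces back to two numbers another person can check, which is the property the subjective matrices lack.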

The more recent versions of the matrix keep the diagram, add complexity, and lack the quite specific analysis of numerical data which another person could examine and, presumably, use to reach similar data-anchored observations. Math and data are helpful when properly combined.

In short, data and the BCG analysis make the BCG matrix an effective communication tool. Matrices without similar data rigor are artifacts of “experts” who want something that makes a sale possible.

BCG wanted to help its clients, not confuse them. MBAs, please, do not write me and tell me I don’t understand the sophistication of the methods underpinning these presentation ready diagrams. For me there is a gap between data-anchored graphics and subjective or opinion-based graphics.

Poetry and fiction are noble pursuits. Data may not be exciting. But for me, an old school BCG matrix is more satisfying as long as there are verifiable data underpinning the items mapped to the matrix.

Stephen E Arnold, May 13, 2015

IBM and the Watson Burrito: A Semi Classic from Armonk

May 13, 2015

A podcaster used the phrase “emotional burrito.” I liked the connotation. IBM has crafted a burrito recipe. I by chance saw a link to a story about the IBM Watson burrito. “IBM’s Watson Designed The Worst Burrito I’ve Ever Had” explains a recipe generated by IBM’s smart system. Keep in mind that Watson is now helping physicians treat cancer.

According to the article:

you zest a whole orange into a skillet of ground beef with a pinch of cinnamon. You puree some plain edamame as a sort of salsa. And you reduce apricot puree, vanilla bean, and dark chocolate to create a sauce. Then you top off this mixture with some cotija cheese. Well that’s what the recipe said. Then I noticed an asterisk. Apparently Watson had originally suggested cheese curds, while the human chef in charge of this recipe, ICE Creative Director Michael Laiskonis, had opted for a more traditional cotija cheese. I figured, when one of the smartest AI engines on the planet tells you to eat more cheese curds, you eat more cheese curds. So I opted for cheese curds.

The author, a cooking wizard, followed IBM’s recipe. The author says:

I prepared to roll $32 worth of groceries into a warm tortilla shell, I doubled my pour of sake, and soldiered on following General Watson to burrito town.

So how did this discovery by Watson work out?

My meal wasn’t good. In fact, it was pretty bad. If I’d made this for my family, I’d apologize and suggest we order a pizza.

I get my burritos at the lone semi-authentic restaurant in Harrod’s Creek. I think the chef microwaves burritos from Kroger. They are at least burritos. IBM’s PR department is not reassuring me about Watson’s open source code, home brew scripts, and software from acquired companies.

My concern with the Watson burrito is a question, “What happens if Watson’s cancer treatment is like this chocolate, edamame apricot salsa, and cheese burrito?”

Do the docs and nurses order a pizza? Hold the apricots.

Stephen E Arnold, May 13, 2015

The Philosophy of Semantic Search

May 13, 2015

The article Taking Advantage of Semantic Search NOW: Understanding Semiotics, Signs, & Schema on Lunametrics delves into semantics on a philosophical and linguistic level as well as in regard to business. The author traces the emergence of semantic search, beginning with Ray Kurzweil’s interest in machines learning meaning as opposed to simpler keyword search. To help readers fully grasp this concept, the author provides a brief refresher on Saussure’s semiotics.

“a Sign is comprised of a signifier, or the name of a thing, and the signified, what that thing represents… Say you sell iPad accessories. “iPad case” is your signifier, or keyword in search marketing speak. We’ve abused the signifier to the utmost over the years, stuffing it onto pages, calculating its density with text tools, jamming it into title tags, in part because we were speaking to robot who read at a 3-year-old level.”

In order to create meaning, we must go beyond even just the addition of price tag and picture to create a sign. The article suggests the need for schema, in the addition of some indication of whom and what the thing is for. The author, Michael Bartholow, has a background in linguistics and marketing and search engine optimization. His article ends with the question of when linguists, philosophers and humanists will be invited into the conversation with businesses, perhaps making him a true visionary in a field populated by data engineers with tunnel-vision.

Chelsea Kerwin, May 13, 2015
