
Governance for Big Data. A Sure Fire Winner for Consultants

July 28, 2016

I read “What’s Next for Big Data Analytics?” I didn’t know the answer to this question, and I still don’t. The angle of attack is common sense. Companies with experience in dealing with digital information often have viewpoints different from the marketing collateral produced by their colleagues. This write up seems to fall in the category of Mr. Bush’s request, “Please, clap.”

The idea is that an organization has to have information policies. That sounds like consultant speak. Most organizations struggle to figure out what their company party policies are. Digital data policies are one of those tasks that senior managers allow others to wrestle to the ground and get a tap out.

The write up includes a number of diagrams. I highlighted this one:


The red area is the governance and management thing. Good luck with that. Companies need revenue. Big Data is supposed to deliver. If not, those policies and governance meeting minutes along with the consultants who billed big bucks for them are going to the shredder in my opinion.

Stephen E Arnold, July 28, 2016

Weakly Watson: IBM Watson and a Department Store

July 28, 2016

I have fond memories of selling men’s shirts at a department store in Illinois when I was a wee, thin lad. At the end of the year, the place was jammed with people. On a slow day, there were three or more folks riffling through the men’s shirts. Department stores have fallen on hard times. There is the Amazon thing. Social media sites want to become storefronts.

Macy’s is a well known vendor. We have one in Louisville, Kentucky, but I think my last visit was in 2011. Too much hassle with the parking, the traffic, and the clutter in the men’s department.

Macy’s labeled its 2015 Annual Report and its 2016 Fact Book with the title “The Agility to Adapt.” I love marketing mantras and the lingo of MBAs. Yep, adapt. Macy’s seems to be struggling to generate sustainable top line revenue and healthy profits. My take on the company’s financial performance is that flat lines suggest mucho efficiency think. At some point, Macy’s has to find a way to jump start growth.

I read “Macy’s Taps IBM Watson to Improve In-Store Shopping App.” The idea is:

The retailer will use Watson’s machine-learning and cognitive-computing technology to assist shoppers as they wander through Macy’s department stores

Millennials, let’s assume, love apps. (I think apps are a bit of a disappointment for some folks.) Millennials love to shop. (I think online appeals to some of these fine lads and lasses.) Millennials love their smartphones. (I know this is a fact because two of them bumped into me as I walked into an eatery yesterday.)

What could be better? A retailer and Big Blue?

The write up informs me:

The app will apply Watson’s natural language processing (via its Natural Language Classifier API) in order to let shoppers ask questions like “Where can I find the swimsuits?”, and then it’ll find answers based each store’s unique products, services, and layout. Navigation is being provided by Satisfi’s location-based software, which accesses Watson’s technology from the cloud to make the whole experience come together. As time goes on, the app will get smarter as it learns more about each store’s customers and the frequently asked questions for each location.

When I read this, I thought about Pokémon Go. Perhaps the way to generate traffic in a department store is to entice the potential buyers with digital egg hunts. Instead of creatures, one could hunt for bargains with cute digital personas and clever graphics.

IBM Watson does not seem to have the zeitgeist of Pokémon Go. Apps strike me as a little 2007, but I am out of touch. Why not ask Watson?

Stephen E Arnold, July 28, 2016

Facebook Acknowledges Major Dependence on Artificial Intelligence

July 28, 2016

The article on Mashable titled Facebook’s AI Chief: ‘Facebook Today Could Not Exist Without AI’ relates the current conversations involving Facebook and AI. Joaquin Candela, the director of applied machine learning at Facebook, states that “Facebook could not exist without AI.” He uses the examples of the News Feed, ads, and offensive content, all of which involve AI stimulating a vastly more engaging and personalized experience. He explains,

“If you were just a random number and we changed that random number every five seconds and that’s all we know about you then none of the experiences that you have online today — and I’m not only talking about Facebook — would be really useful to you. You’d hate it. I would hate it. So there is value of course in being able to personalize experiences and make the access of information more efficient to you.”

And we thought all Facebook required was humans and ad revenue. Candela makes it very clear that Facebook is driven by machine learning and personalization. He paints a very bleak picture of what Facebook would look like without AI: completely random ads, unranked News Feeds, and offensive content splashing around like a beached whale. Only in the last few years has computer vision changed Facebook’s process of removing such content. What used to take reports and human raters is now automated.

Chelsea Kerwin, July 28, 2016

Sponsored by, publisher of the CyberOSINT monograph

Baidu Hopes Transparency Cleans up Results

July 28, 2016

One of the worries about using commercial search engines is that search results are polluted with paid links. In the United States, paid results are differentiated from organic results with a little banner or font change. It is not so within China, and Seeking Alpha shares an interesting story about a Chinese search engine, “Baidu Cleans Up Search Site, Eyes Value.” Baidu recently did a major overhaul of its search engine, which was long overdue. Baidu was more interested in generating profits than providing its users a decent service. Baidu neglected to inform its users that paid links appeared alongside organic results, but now they have been separated out like paid links in the US.

Results are cleaner, but it did not come in time to help one user:

“For anyone who has missed this headline-grabbing story, the crisis erupted after 21-year-old cancer patient Wei Zexi used Baidu to find a hospital to treat his disease. He trusted the hospital he chose partly because it appeared high in Baidu’s results. But he was unaware the hospital got that ranking because it paid the most in an online auctioning system that has helped to make Baidu hugely profitable. Wei later died after receiving an ineffective experimental treatment, though not before complaining loudly about how he was misled.”

The resulting PR nightmare forced Baidu to clean up its digital act. This example outlines one of the many differences between US and Chinese business ethics. On average, the US probably has more educated consumers than China, and they will call out companies when they notice ethical violations. While it is true US companies are willing to compromise ethics for a buck, at least once they are caught they cannot avoid the fallout. China, on the other hand, does what it wants when it wants.


Whitney Grace, July 28, 2016


In-Q-Tel and Algorithmia: The Widget Approach to Tasks

July 27, 2016

I noted “Algorithmia Lands In-Q-Tel Deal, Adds Deep Learning Capabilities.” Useful write up, but one important facet of the deal was omitted from the news item. The trend in some government projects is to have an “open” set of software. The idea of running a query for a needed widget on an ecommerce site hath its charms. The goal is cost reduction and reducing the time required to modify an existing software system. This is the idea behind the DCGS (Distributed Common Ground System) widget approach. The issue I have with brokering algorithms is that most of the algorithms which look like rocket science are actually textbook examples or variations on well known themes. Search and content processing chugs along on 10 methods. When a novel solution becomes available, like this insight into Carmichael numbers, years may pass before the method can be verified and inserted into the usable methods folder. Now those doing the searching have to know what an algorithm actually does and what settings are required for the numerical recipe to generate an edible croissant. That human knowledge thing is an issue.

Stephen E Arnold, July 27, 2016

X1 Hits a Familiar Marketing Chord

July 27, 2016

I had the job of reviewing some of the 100 eDiscovery systems on offer today. I noted an interesting semantic thrust in “Recent Court Decisions, Key Industry Report Reveal Broken eDiscovery Collection Processes.”

X1 states:

What is needed is an effective, scalable and systemized ESI collection process that makes enterprise eDiscovery collection much more feasible. More advanced enterprise class technology, such as X1 Distributed Discovery, can accomplish system-wide searches that are narrowly tailored to collect only potentially relevant information in a legally defensible manner. This process is better, faster and dramatically less expensive than other methods currently employed.

I highlighted the point that targeted search and collection can reduce the costs of certain legal processes. Recommind made, prior to its acquisition by OpenText, similar statements about cost reduction.

It strikes me that the figure of a 90 percent cost reduction begs for some hard data. Also, with so many eDiscovery firms competing for law firms’ business, is this once lucrative niche in a race toward Amazon- and Walmart-type pricing?

Perhaps there are several issues involved with the problem lawyers have when using eDiscovery systems? These may be:

  • Unrealistic expectations for an eDiscovery system. Google does one thing; eDiscovery does another.
  • Sticker shock. Despite the low cost of a license fee for a cloud service, there are other costs. These range from customer and engineering support to the need to pay for resources that cost money without doing much to alert the customer that a threshold has been crossed. Taxi meter and old school AT&T network pricing may dampen some eDiscovery licensees’ enthusiasm.
  • Federating content from multiple sources sounds wonderful. In my experience, what does one do about exception documents? What do eDiscovery systems “miss”?

As the financial pressure mounts on many law firms, cost controls are likely to rework how these companies do business. “Search” vendors now in the eDiscovery sector may find themselves facing the type of financial realities which caused Recommind to reinvent its future as a company.

Stephen E Arnold, July 27, 2016

OpenText Buys CEM Platform from HP Inc.

July 27, 2016

When Hewlett Packard split up its business in 2015, consumer-printer firm HP Inc. was created; that entity got custody of HP’s CEM platform. Now we learn, from an article at TechCrunch, that “OpenText Acquires HP Customer Experience Content Management for $170 Million.” OpenText expects the deal to generate between $85 million and $95 million in its first year alone. Writer Ron Miller describes:

“The package of products sold to OpenText today come from the HP Engage line and includes HP TeamSite, a web content management tool left over from the purchase of Interwoven (which was actually bought by Autonomy before Autonomy was sold to HP), HP MediaBin, a digital asset management solution, HP Qfiniti, a workforce optimization solution for enterprise contact center management, as well as HP Explore, HP Aurasma, and HP Optimost.”

Some suspect HP was eager to unload this division from the time of the company’s split. Even if that is true, OpenText seems poised to make a lot from their investment; Miller cites the blog post of content-management consultant Tony Byrne:

“The most important thing to understand, though, is that as a vendor OpenText is a financial construct in search of a technology rationale. The company follows a ‘roll-up’ strategy: purchasing older tools for their maintenance revenue streams, streams which — while not always large — are almost always very profitable.”

It is true. In contrast to, say, Google’s method of trying nearly every idea conceived within the company and seeing what sticks, OpenText tends to be deliberate and calculated in its decisions. We are curious to see where this investment goes.

Based in Waterloo, Ontario, OpenText offers tools for enterprise information management, business process management, and customer experience management. Launched in 1991, the company now serves over 100,000 customers around the world. They are also hiring in several locations as of this writing.


Cynthia Murrell, July 27, 2016



Salesforce Blackout

July 27, 2016

Salesforce is a cloud computing company with the majority of its profits coming from customer relationship management and acquiring commercial social networking apps. According to PC World, Salesforce recently had a blackout, and the details were reported in “Salesforce Outage Continues In Some Parts Of The US.” In early May, Salesforce was down for over twelve hours due to a file integrity issue in the NA14 database.

The outage occurred in the morning, with limited services restored later in the evening. Salesforce divides its customers into instances. The NA14 instance is located in North America, and many of the customers who complained via Twitter are located in the US.

The exact details were:

“The database failure happened after “a successful site switch” of the NA14 instance “to resolve a service disruption that occurred between 00:47 to 02:39 UTC on May 10, 2016 due to a failure in the power distribution in the primary data center,” the company said.  Later on Tuesday, Salesforce continued to report that users were still unable to access the service. It said it did not believe “at this point” that it would be able to repair the file integrity issue. Instead, it had shifted its focus to recovering from a prior backup, which had not been affected by the file integrity issues.”

Power outages like this are to be expected, and they will recur in the future. Technology is only as reliable as its circuit breakers and the electricity that flows through them. This is why it is recommended to back up your files in more than one place.


Whitney Grace, July 27, 2016

Alphabet Google May Dominate AI for Developers

July 26, 2016

Advertising dominance is not enough. Tackling problems like solving death is not enough. Failing at augmented reality is not enough. Google wants to dominate artificial intelligence.

I know. I read “Google Sprints Ahead in AI Building Blocks, Leaving Rivals Wary.” Now quite a few folks have released to open source various types of smart software. But the Alphabet Google thing knows that smart software can cut electricity bills and generate less and less useful search results for me.

The write up takes a different approach. I learned:

But for some competitors, there’s a big downside to adopting Google’s standard. Using TensorFlow will help Google recruit more AI experts by training them on the same tool it uses internally, spotting their code, and hiring the best contributors. It could also let the search-engine provider exert outsize influence over the burgeoning AI ecosystem. If the internet giant dominates in this field, it could gain an advantage in the fast-growing cloud-computing business, turning the popularity of its software into real revenue. “It’s the next big area, and people are worried Google’s going to own the show,” said Ed Lazowska, a computer science professor at the University of Washington who has served on the technical advisory board of Microsoft Corp.’s research lab. “There is a network effect, and it’s a really excellent system.”

What’s the solution? Well, there is Facebook AI, IBM Watson, and a number of other outfits eager to make your software smart.

What’s not to like? Perhaps a silver bullet is speeding on its way to you?

Stephen E Arnold, July 26, 2016

Google and Its Algorithmic Methods Catch Attention in India

July 26, 2016

I read “Google in Legal Trouble in India for Including PM Modi in Top Criminals Search Results.” I have no way of knowing if the write up is accurate or the product of the great Internet news machine. I found the factoids amusing.

I highlighted this passage as fodder for the spirit of Jack Benny or Fred Allen:

The complainant, an advocate named Sushl Kumar Mishra, has argued that a Google search for the “top ten criminals of the world” showed Modi’s photograph in the search results. Mishra has said that while he wrote to Google to remove Modi’s name from the list, he got no response.

Customer service? In response to an email? Okay, this person is an optimist. As an aside, politicians are good actors. Right?

Google’s response was interesting as well:

The company had explained that the results were due to the British daily using an image of Modi with erroneous metadata and that the news articles did not actually link Modi to criminal activity.

Love that smart software.

Stephen E Arnold, July 26, 2016
