Palantir Technologies Snags Another $20 Million

November 28, 2016

I read “Secretive Big Data Firm Palantir Raises $20M in Recent Funding Run: Report.” I learned:

Palantir has quietly raised $20 million in a recent funding run.

The article pointed out that Palantir had previously raised $800 million. The write up added:

Based on the Nov. 23 Form D filing, the date of the first sale for the recent round of funding was Nov 8. Coincidentally, the new United States president Donald Trump was elected on the same day.

I found this statement interesting:

Including the latest round of funding, Palantir has raised more than $2 billion to date.

Interesting. By the way, the figures seem to be hefty.

Stephen E Arnold, November 28, 2016

Smart Software Figures Out What Makes Stories Tick

November 28, 2016

I recall sitting in high school when I was 14 years old and listening to our English teacher explain the basic plots used by fiction writers. The teacher was Miss Dalton and she seemed quite happy to point out that fiction depended upon: Man versus man, man versus the environment, man versus himself, man versus belief, and maybe one or two others. I don’t recall the details of a chalkboard session in 1959.

Not to fear.

I read “Fiction Books Narratives Down to Six Emotional Story Lines.” Smart software and some PhDs have cracked the code. Ivory Tower types processed digital versions of 1,327 books of fiction. I learned:

They [the Ivory Tower types] then applied three different natural language processing filters used for sentiment analysis to extract the emotional content of 10,000-word stories. The first filter—dubbed singular value decomposition—reveals the underlying basis of the emotional storyline, the second—referred to as hierarchical clustering—helps differentiate between different groups of emotional storylines, and the third—which is a type of neural network—uses a self-learning approach to sort the actual storylines from the background noise. Used together, these three approaches provide robust findings, as documented on the hedonometer.org website.

Okay, and what does the smart software say today that Miss Dalton did not tell me more than 50 years ago?

[The Ivory Tower types] determined that there were six main emotional storylines. These include ‘rags to riches’ (sentiment rises), ‘riches to rags’ (fall), ‘man in a hole’ (fall-rise), ‘Icarus’ (rise-fall), ‘Cinderella’ (rise-fall-rise), ‘Oedipus’ (fall-rise-fall). This approach could, in turn, be used to create compelling stories by gaining a better understanding of what has previously made for great storylines. It could also teach common sense to artificial intelligence systems.
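The six shapes in the quoted passage are defined by the order of rises and falls in sentiment, which a few lines of Python can illustrate. This is only a toy sketch and not the researchers’ method: it skips the singular value decomposition, clustering, and neural network steps and simply labels a list of per-segment sentiment scores by its direction changes. The names `ARC_NAMES`, `arc_shape`, and `name_arc`, the `tolerance` parameter, and the sample scores are all assumptions made for illustration.

```python
# Toy labeller for the six emotional arcs named in the quote.
# Assumption: a story is already reduced to one sentiment score per segment.

ARC_NAMES = {
    ("rise",): "rags to riches",
    ("fall",): "riches to rags",
    ("fall", "rise"): "man in a hole",
    ("rise", "fall"): "Icarus",
    ("rise", "fall", "rise"): "Cinderella",
    ("fall", "rise", "fall"): "Oedipus",
}


def arc_shape(scores, tolerance=0.0):
    """Collapse a score sequence into its run of 'rise' / 'fall' directions."""
    directions = []
    for previous, current in zip(scores, scores[1:]):
        delta = current - previous
        if abs(delta) <= tolerance:
            continue  # treat tiny moves as flat (hypothetical noise filter)
        step = "rise" if delta > 0 else "fall"
        if not directions or directions[-1] != step:
            directions.append(step)
    return tuple(directions)


def name_arc(scores, tolerance=0.0):
    """Return the arc name for a score sequence, or 'unclassified'."""
    return ARC_NAMES.get(arc_shape(scores, tolerance), "unclassified")


# Made-up sentiment scores for three imaginary stories.
print(name_arc([1, 2, 3]))     # steadily rising sentiment
print(name_arc([3, 1, 4]))     # a fall followed by a rise
print(name_arc([1, 3, 2, 5]))  # rise, fall, rise
```

A story with more than three direction changes, or with no change at all, comes back as "unclassified", which is one reason real sentiment curves need smoothing before any labelling like this is meaningful.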

Ah, progress.

Stephen E Arnold, November 28, 2016

Iran-Russia Ink Pact for Search Engine Services

November 28, 2016

Owing to geopolitical differences, countries like Iran are turning to like-minded nations like Russia for technological developments. A Russian diplomat posted in Iran recently announced that home-grown search engine service provider Yandex will offer its services to the people of Iran.

Financial Tribune, in a news report titled “Yandex to Arrive Soon,” said:

Last October, Russian and Iranian communications ministers Nikolay Nikiforov and Mahmoud Vaezi respectively signed a deal to expand bilateral technological collaborations. During the meeting, Russian Ambassador Vaezi said, “We are familiar with the powerful Russian search engine Yandex. We agreed that Yandex would open an office in Iran. The system will be adapted for the Iranian people and will be in Persian.”

Iran has traditionally been an extremist nation at the center of numerous international controversies, which indirectly bars American corporations from conducting business in this hostile territory. Russia, on the other hand, which is seen as a foe of the US, stands to gain from these sour relations.

As of now, .com and .com.tr domains owned by Yandex are banned in Iran, but with the MoU signed, that will change soon. There is another interesting point to be observed in this news piece:

Looking at Yandex.ir, an official reportedly working for IRIB purchased the website, according to a domain registration search.  DomainTools, a portal that lists the owners of websites, says Mohammad Taqi Mozouni registered the domain address back in July.

Technically, and by internationally accepted convention, no individual or organization can own, without the necessary permissions, a domain name under any extension that belongs to a company which has already carved out a niche for itself online. It is thus worth pondering what prompted a Russian search engine giant to let a foreign governmental agency acquire its domain name.

Vishal Ingole, November 28, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Wisdom from the First O’Reilly AI Conference

November 28, 2016

Forbes contributor Gil Press nicely collates and summarizes the insights he found at September’s inaugural O’Reilly AI Conference, held in New York City, in his article, “12 Observations About Artificial Intelligence from the O’Reilly AI Conference.” He begins:

At the inaugural O’Reilly AI conference, 66 artificial intelligence practitioners and researchers from 39 organizations presented the current state-of-AI: From chatbots and deep learning to self-driving cars and emotion recognition to automating jobs and obstacles to AI progress to saving lives and new business opportunities. … Here’s a summary of what I heard there, embellished with a few references to recent AI news and commentary.

Here are Press’ 12 observations; check out the article for details on any that spark your interest: “AI is a black box—just like humans”; “AI is difficult”; “The AI driving driverless cars is going to make driving a hobby. Or maybe not”; “AI must consider culture and context”; “AI is not going to take all our jobs”; “AI is not going to kill us”; “AI isn’t magic and deep learning is a useful but limited tool”; “AI is Augmented Intelligence”; “AI changes how we interact with computers—and it needs a dose of empathy”; “AI should graduate from the Turing Test to smarter tests”; “AI according to Winston Churchill”; and “AI continues to be possibly hampered by a futile search for human-level intelligence while locked into a materialist paradigm.”

It is worth contemplating the point Press saved for last—are we even approaching this whole AI thing from the most productive angle? He ponders:

Is it possible that this paradigm—and the driving ambition at its core to play God and develop human-like machines—has led to the infamous ‘AI Winter’? And that continuing to adhere to it and refusing to consider ‘genuinely new ideas,’ out-of-the-dominant-paradigm ideas, will lead to yet another AI Winter? Maybe, just maybe, our minds are not computers and computers do not resemble our brains?  And maybe, just maybe, if we finally abandon the futile pursuit of replicating ‘human-level AI’ in computers, we will find many additional–albeit ‘narrow’–applications of computers to enrich and improve our lives?

I think Press is on to something. Perhaps we should admit that anything approaching Rosie the Robot is still decades away (according to conference presenter Oren Etzioni). At this early date, we may do well to accept and applaud specialized AIs that do one thing very well but are completely ignorant of everything else. After all, our Roombas are unlikely to attempt conquering the world.

Cynthia Murrell, November 28, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

IBM Watson and Its QAMs

November 27, 2016

“What’s a QAM?” some may ask. The answer is revealed in “What Can Modern Watson Do?” The answer is “a question answering machine.” The idea is that one talks to a computing device and the device provides high value, on point output. One can also type the question, but mobile phones are not designed for query formulation. Phones are designed to do Facebook, Twitter, and app-based functions.

The write up is interesting because it reveals that Watson is not a game show winner or the reason to spend a week in Las Vegas at the World of Watson conference. Nope. I learned:

IBM’s Watson as it exists today is as close as we’ve come to a single integrated platform for AI.  It contains all the capabilities for image and video, natural language speech and text input and output, and the most comprehensive knowledge recovery module yet combined together.

Consider the gap between IBM Watson and its many competitors. Watson must be making life very difficult for the companies offering smart software systems. One feels sorry for Amazon, Facebook, Google, and other outfits who are not in Watson’s league.

The write up explains that Watson does image and video processing, text and speech processing, and knowledge retrieval.

What caught my attention was the notion of QAMs. I learned that knowledge retrieval (which to me means search) is complex. IBM has not been able to get the media as excited about search as about Watson’s other capabilities. Is this a failing of IBM marketing, the system, or the media? Perhaps IBM’s CEO should tweet late at night to amp up the interest in search and retrieval?

The write up points out that Watson combines natural language processing, hypothesis generation and evaluation, and evidence based learning with [a] image processing, [b] text and speech processing, and [c] knowledge retrieval. When these capabilities are placed in one single system, the future is here or maybe just around the corner.

The write up invokes Dave Schubmehl, a person who tried to sell reports containing my information on Amazon for $3,500 a whack for eight pages without my permission. I wonder if Watson assisted him in making this decision? Here’s a passage mentioning this maven which I highlighted in yellow:

David Schubmehl, an analyst at IDC, compares IBM’s new playbook in AI with Microsoft’s Windows in personal computing and Google’s Android OS in mobile. “IBM is trying to do the same thing with Watson,” he said, “open up a platform, make it available for others, and democratize the technology.”

There you go. IBM Watson is the equivalent of Windows and Google Android. Yep, that works except the analogy is undermined by reality. Watson is not either of these “products.” Watson is a collection of open source code, acquired technology like Vivisimo’s, and home brew code. Keep in mind that IBM Almaden invented some of the guts of Google and did zero with the technology. Clever, right?

The write up identifies these Watson “products for end users.” Yep, “end users” just like me.

  • Watson Virtual Agent. Yes, automated customer service.
  • Watson Explorer. Learn about your customers and their reaction to automated Watson customer service.
  • Watson Analytics. A free version is available.
  • Watson Knowledge Studio. Do end users code?
  • Watson customized for specific industries. Yep, end users build custom apps when they are not binge watching.
  • Watson Health. Got cancer? Oh, don’t forget to have a doc with access to Watson.

If you are a developer, you can code even more applications. I think the end user examples say quite a bit about Watson. Watson is a collection of stuff. IBM is trying to create a business from odds and ends. I am confident that with a wizard like Dave Schubmehl, Watson will be a success because Watson is just like Windows and Android. Great mid tier consulting thinking. Just like Windows except for the revenue. Just like Android except for the market share. Hey, close enough for horseshoes.

Stephen E Arnold, November 27, 2016

EasyAsk Guarantees Revenue Boost with Its eCommerce Search System

November 26, 2016

I read “How EasyAsk Will Help You Drive 23 to 121% Higher eCommerce Revenues: Guaranteed.” The headline is quite different from most search vendors’ announcements. Search vendors, in my experience, do not guarantee anything: Uptime, fees, performance. EasyAsk, a natural language search technology vendor, is guaranteeing more eCommerce revenues. As with most information available online, I assume that the facts are correct.

I highlighted this statement:

Within 90 days of the EasyAsk implementation, 95% of internal searches were returning the right results – nearly eliminating the dreaded no-results pages. The results have been outstanding;

  • Search conversion has increased by 54%
  • Revenue from search has seen a boost of over 71%
  • Transactions are up 81%

Unlike SOLR, EasyAsk offers powerful merchandising tools that are intuitive, easy-to-use and maintained by business users instead of programmers.

Now the “guarantee” part:

We [EasyAsk] will contractually guarantee that EasyAsk will drive at least 20% more revenue from search.

Here’s how:

  • We will take a baseline benchmark measuring revenue, conversion rate and average transactions on your existing search engine.
  • We will work with you to deploy and implement EasyAsk’s eCommerce suite to provide you with advanced Natural Language semantic search and merchandising.
  • Within 90 days of implementation, we will perform a new benchmark that measures revenue, conversion rate and average transactions and compare them with the original baseline. EasyAsk will contractually guarantee to drive at least 20% more revenue.
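The before-and-after comparison in these three steps is simple arithmetic, and a short Python sketch shows what checking the 20% promise might look like. The function names `percent_lift` and `guarantee_met`, the metric keys, and every figure below are made up for illustration; nothing here comes from EasyAsk’s actual contract or tooling.

```python
# Minimal sketch of comparing a baseline benchmark with a 90-day benchmark.
# All names and numbers are hypothetical.

def percent_lift(baseline, after):
    """Percentage change from the baseline value to the follow-up value."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (after - baseline) * 100 / baseline


def guarantee_met(baseline_revenue, new_revenue, promised_lift_pct=20):
    """True when search revenue rose by at least the promised percentage."""
    return percent_lift(baseline_revenue, new_revenue) >= promised_lift_pct


# Step 1: baseline benchmark on the existing search engine (invented figures).
baseline = {"revenue": 100_000, "conversion_rate": 2.0, "transactions": 1_000}

# Step 3: new benchmark within 90 days of implementation (invented figures).
after_90_days = {"revenue": 125_000, "conversion_rate": 2.6, "transactions": 1_150}

lifts = {
    metric: percent_lift(baseline[metric], after_90_days[metric])
    for metric in baseline
}

for metric, lift in lifts.items():
    print(f"{metric}: {lift:.1f}% lift")

print("Guarantee met:", guarantee_met(baseline["revenue"], after_90_days["revenue"]))
```

Note that a comparison like this says nothing about seasonality or other changes during the 90 days, so the lift cannot be attributed to the search engine on these numbers alone.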

The write up explains that there is no risk to the eCommerce vendor who embraces EasyAsk.

There you go. A New Year’s gift which is six weeks early.

Stephen E Arnold, November 26, 2016

More Palantir Pressure on DCGS Vendors?

November 25, 2016

I read a personnel announcement. For most people, the report that a Silicon Valley type joined Donald J. Trump’s transition team is a ho hum, so what moment. You decide for yourself. Navigate to “Trump’s Transition Team Adds VC from Thiel’s Founders Fund.” I highlighted this bit of real news from real journalists as spot on (I assume, of course).

Trae Stephens, a principal at Thiel’s Founders Fund, is being appointed to Trump’s defense transition team, said people familiar with the matter. He will help shape policy and vet Defense Department staff but isn’t expected to take a role in the administration, said the people, who asked not to be identified because they weren’t authorized to speak publicly.

When I read this, several ideas flapped across my mind.

First, the DCGS incumbents now have to deal with two Palanterians providing input on how to use information to achieve operational goals. One Hobbit was not good for outfits accustomed to having direct inputs with regard to certain procurements and technology decisions. Two Hobbits. Yikes.

Second, I doubt that Donald J. Trump understands that DCGS is based on a very big vision of federating information from a wide range of sources, deploying systems which can lose connectivity in certain situations, and requiring that system users keep on their toes with regard to the freshness of the data being manipulated. My hunch is that an explanation of a system which has been in the works for more than a decade and has consumed billions of dollars is not going to fit into a sound bite or a tweet. Explanations may be a bigger problem than the venerable traditional Beltway approach to government software. Palantir’s Hobbits show pictures and clever stuff like wheel menus.

Third, the Hobbits are not likely to bring up the past. The future is sort of now in the Donald J. Trump moment. When the Hobbits fire up a laptop and generate a bubble gum card about an alleged bad actor, my thought is that Donald J. Trump will say, “That’s huge.” The fact that Gotham is a product and ready to install and use may elicit a “That’s great.” Who will say that about the DCGS console? I know. The vendors holding the prime DCGS contracts.

In short, some of those vendor meetings underway in Beltway office buildings are likely to be interesting. And stressful. Yep, stressful.

Stephen E Arnold, November 25, 2016

Need Data Integration? Think of Cisco. Well, Okay

November 25, 2016

Data integration is more difficult than some of the text analytics wizards state. Software sucks in disparate data and “real time” analytics systems present actionable results to marketers, sales professionals, and chief strategy officers. Well, that’s not exactly accurate.

Industrial strength data integration demands a company which has bought a company which acquired a technology which performs data integration. Cisco offers a system that appears to combine the functions of Kapow with the capabilities of Palantir Technologies’ Gotham and tosses in the self service business information which Microsoft touts.

Cisco acquired Composite Software in 2013. Cisco now offers the Composite system as the Cisco Information Server. Here’s what the block diagram of the federating behemoth looks like. You can get a PDF version at this link.

[Image: block diagram of the Cisco Information Server]

The system is easy to use. “The graphical development and management environments are easy to learn and intuitive to use,” says the Cisco Teradata information sheet. For some tips about the easy to use system, check out the Data Virtualization Cisco Information Server blog. A tutorial, although dated, is at this link. Note that the block diagram has not changed significantly between 2011 and the version presented above. I assume there is not much work required to ingest and make sense of the Twitter stream or other social media content.

The blog has one post and was last updated in 2011. But there is a YouTube video at this link.

The system includes a remarkable range of features; for example:

  • Modeling, which means import and transform (what Cisco calls “introspect”), create a model, figure out how to make it run at an acceptable level of performance, and expose the data to other services. (Does this sound like iPhrase’s and Teratext’s method? It does to me.)
  • Search
  • Transformation
  • Version control and governance
  • Data quality control and assurance
  • Outputs
  • Security
  • Administrative controls.

The time required to create this system, according to Cisco Teradata, is “over 300 man years.”

The licensee can plug the system into an IBM DB2 running on a z/OS8 “handheld”. You will need a large hand by the way. No small hands need apply.

Stephen E Arnold, November 25, 2016

Pitching All Source Analysis: Just Do Dark Data. Really?

November 25, 2016

I read “Shedding Light on Dark Data: How to Get Started.” Okay, Dark Data. Like Big Data, the phrase is the fruit of the nomads at Gartner Group. The outfit embracing this sort of old concept is OdinText. Spoiler: I thought the write up was going to identify outfits like BAE Systems, Centrifuge Systems, IBM Analyst’s Notebook, Palantir Technologies, and Recorded Future (an In-Q-Tel and Google backed outfit). Was I wrong? Yes.

The write up explains that a company has to tackle a range of information in order to be aware, informed, or insightful. Pick one. Here’s the list of Dark Data types, which the aforementioned companies have been working to capture, analyze, and make sense of for almost 20 years in the case of NetReveal (Detica) and Analyst’s Notebook. The other companies are comparative spring chickens with an average of seven years’ experience in this effort.

  • Customer relationship management data
  • Data warehouse information
  • Enterprise resource planning information
  • Log files
  • Machine data
  • Mainframe data
  • Semi structured information
  • Social media content
  • Unstructured data
  • Web content.

I think the company or non profit which tries to suck in these data types and process them may run into some cost and legal issues. Analyzing tweets and Facebook posts can be useful, but there are costs and license fees required. Frankly not even law enforcement and intelligence entities are able to do a Cracker Jack job with these content streams due to their volume, cryptic nature, and pesky quirks related to metadata tagging. But let’s move on. To this statement:

Phone transcripts, chat logs and email are often dark data that text analytics can help illuminate. Would it be helpful to understand how personnel deal with incoming customer questions? Which of your products are discussed with which of your other products or competitors’ products more often? What problems or opportunities are mentioned in conjunction with them? Are there any patterns over time?

Yep, that will work really well in many legal environments. Phone transcripts are particularly exciting.

How does one think about Dark Data? Easy. Here’s a visualization from the OdinText folks:

[Image: OdinText visualization of Dark Data types]

Notice that there are data types in this diagram NOT included in the listing above. I can’t figure out if this is just carelessness or an insight which escapes me.

How does one deal with Dark Data? OdinText, of course. Yep, of course. Easy.

Stephen E Arnold, November 25, 2016

Machine Learning Does Not Have All the Answers

November 25, 2016

Despite our broader knowledge, we still believe that if we press a few buttons and press Enter, computers can do all the work for us. The advent of machine learning and artificial intelligence does not repress this belief; instead, big data vendors rely on this image to sell their wares. Big data, though, has its weaknesses, and before you deploy a solution you should read Network World’s “6 Machine Learning Misunderstandings.”

In the article, Juniper Networks security intelligence software engineer Roman Sinayev explains some of the pitfalls to avoid before implementing big data technology. It is important to take into consideration all the variables, including unexpected ones; otherwise that one forgotten factor could wreak havoc on your system. Also, do not forget to actually understand the data you are analyzing and its origin. Pushing forward on a project without understanding the data background is a guaranteed fail.

Other practical advice is to build a test model and to add more data when the model does not deliver, but some advice that is new even to us is:

One type of algorithm that has recently been successful in practical applications is ensemble learning – a process by which multiple models combine to solve a computational intelligence problem. One example of ensemble learning is stacking simple classifiers like logistic regressions. These ensemble learning methods can improve predictive performance more than any of these classifiers individually.
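To make the idea of combining models concrete, here is a small Python sketch. It does not implement the stacked logistic regressions the quote mentions; it uses the simpler majority-vote form of ensembling, with three hand-written threshold rules standing in for trained classifiers. The rules, feature names, thresholds, and the six labelled samples are all invented for illustration.

```python
from collections import Counter

# Three weak, hypothetical "classifiers": each flags a session as suspicious (1)
# or normal (0) from a single feature. Real models would be trained, not hand-set.

def rule_logins(sample):
    return 1 if sample["failed_logins"] > 5 else 0

def rule_bytes(sample):
    return 1 if sample["bytes_out"] > 1000 else 0

def rule_hour(sample):
    return 1 if sample["hour"] < 6 else 0

RULES = [rule_logins, rule_bytes, rule_hour]


def majority_vote(sample, classifiers=RULES):
    """Ensemble prediction: the label most of the classifiers agree on."""
    votes = [classify(sample) for classify in classifiers]
    return Counter(votes).most_common(1)[0][0]


def accuracy(classify, labelled):
    """Fraction of (sample, label) pairs the classifier gets right."""
    hits = sum(1 for sample, label in labelled if classify(sample) == label)
    return hits / len(labelled)


# Invented labelled data: (features, true label).
DATA = [
    ({"failed_logins": 8, "bytes_out": 2000, "hour": 14}, 1),
    ({"failed_logins": 2, "bytes_out": 1500, "hour": 3}, 1),
    ({"failed_logins": 7, "bytes_out": 300, "hour": 2}, 1),
    ({"failed_logins": 1, "bytes_out": 200, "hour": 13}, 0),
    ({"failed_logins": 9, "bytes_out": 100, "hour": 15}, 0),
    ({"failed_logins": 0, "bytes_out": 5000, "hour": 12}, 0),
]

for rule in RULES:
    print(rule.__name__, round(accuracy(rule, DATA), 2))
print("majority_vote", round(accuracy(majority_vote, DATA), 2))
```

On this contrived data each rule misclassifies at least one sample while the vote gets all six right, because the rules err on different samples. With an odd number of binary voters there are no ties; stacking would replace the fixed vote with a second model trained on the base models’ outputs.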

Employing more than one algorithm? It makes sense and is practical advice. Why did that not cross our minds? The rest of the advice offered is general stuff that can be applied to any project in any field; just change the lingo and the expert providing it.

Whitney Grace, November 25, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
