
ThomsonReuters: Palantir Not Enough Math?

April 6, 2016

I read “TRRI Users Will Gain Access to FiscalNote’s Legislative Modeling Techniques.” The licensee of Palantir Metropolis and owner of the Westlaw smart software for legal eagles is pushing into new territory. That’s probably good news for stakeholders who have watched ThomsonReuters bump into a bit of a revenue ceiling in the last few years.

According to the write up:

The main benefit of the agreement [with FiscalNote] will grant Thomson Reuters’ Regulatory Intelligence (TRRI) newly extended capabilities across its predictive legislative analytics. TRRI is a global solution that helps clients focus and leverage their regulatory risk. Per the agreement, FiscalNote will help provide TRRI users with likelihood factors and other insights relegated to specifics pieces of legislative passage.

Interesting. I assumed that Palantir’s platform would have the extensibility to handle this type of content processing and analysis. Wrong again.

I learned:

FiscalNote utilizes machine learning and natural language processing in its modeling techniques that help it engineer models to conduct a host of analyses on open government data. In essence, these models allow FiscalNote to automatically analyze how legislation is going to yield any material impact via a combination of factors such as legislators, committee assignments, actions taken, bill versions, and amendments.

Wait, wait, don’t tell me. Westlaw’s smart software, which can do many wonderful advanced text processing tricks, is not able to perform in the manner of FiscalNote.

My hunch is that the deal has less to do with technologies, extensible or not, and more to do with getting some customers and an opportunity to find a way to pump up those revenues. Another idea: Is ThomsonReuters emulating IBM’s tactic of buying duplicative technology as a revenue rocket booster?

Perhaps Palantir and Westlaw should team up so ThomsonReuters’ customers have additional choices? Think of the XML slicing and dicing strategy with the intelligence and legal technology working in harmony.

Stephen E Arnold, April 6, 2016

Nasdaq Joins the Party for Investing in Intelligence

April 6, 2016

The financial sector is hungry for intelligence to help curb abuses in capital markets, judging by recent actions of Goldman Sachs and Credit Suisse. “Nasdaq invests in ‘cognitive’ technology,” from BA wire, announces Nasdaq’s investment in Digital Reasoning. Nasdaq plans to connect Digital Reasoning algorithms with Nasdaq’s technology which surveils trade data. The article explains the benefits of joining these two products:

“The two companies want to pair Digital Reasoning software of unstructured data such as voicemail, email, chats and social media, with Nasdaq’s Smarts business, which is one of the foremost software for monitoring trading on global markets. It is used by more than 40 markets and 12 regulators. Combining the two products is designed to assess the context, content and relationships behind trading and spot signals that could indicate insider trading, market manipulation or even expenses rules violations.”

We have followed Digital Reasoning, and other intel vendors like them, for quite some time as they target sectors ranging from healthcare to law to military. This is just a case of another software intelligence vendor making the shift to the financial sector. Following the money appears to be the name of the game.


Megan Feil, April 6, 2016

Sponsored by, publisher of the CyberOSINT monograph

Forget World Population, Domain Population Is Overcrowded

April 5, 2016

Back in the 1990s, if you had a Web site without a bunch of gobbledygook after the .com, you were considered tech savvy and very cool. There were plenty of domain names available in those days, and as the Internet became more of a tool than a novelty, demand for names rose. It is no longer easy to get the desired Web address, according to the article “Overcrowded Internet Domain Space Is Stifling Demand, Suggesting A Future ‘Not-Com’ Boom.”

Domain names are being snapped up so quickly, in fact, that Web development is being stunted. As many as 25% of domains are being withheld, equaling 73 million as of summer 2015, because of the inability to register domain names that would drive Internet traffic.

“However, as the Internet Corporation for Assigned Names and Numbers (ICANN) has begun to roll out the option to issue brand new top-level domains for almost any word, whether it’s dot-hotel, dot-books or dot-sex – dubbed the ‘not-coms’ – the research suggests there is substantial untapped demand that could fuel additional growth in the domain registrations.”

One of the factors that determine prime Internet real estate is a simple, catchy Web address. With new domains opening up beyond the traditional .org, .com, .net, .gov endings, an entirely new market is also open for entrepreneurs to profit from. People are already buying not-coms for cheap with the intention to resell them for a pretty penny. It bears mentioning, however, that once all of the hot not-coms are gone, we will be in the same predicament as we are now. How long will that take?


Whitney Grace, April 5, 2016

Watson Weakly: Analysis of Harry Potter

April 4, 2016

I noted this write up: “IBM’s Watson Analyzed All the ‘Harry Potter’ Books and Movies — and the Results are Fascinating.” An outfit called Tech Insider appears to have “asked” Watson “what it thought of the Harry Potter original book series and movies.”

IBM, that revenue engine which delights its stakeholders, offered up Vinith Misra, “a research staff member for IBM Watson.”

It appears that Watson did what any second year English major does in between pizza bites and hanging out. Watson “read” the Potter books and “watched” the films. I think Watson was fed movie scripts, but that’s a niggling point. Of course, Watson can handle rich media. Watson is a very capable system for generating some text analytics.

What did Watson discover? I won’t review the findings in one big list of stunners. Let me highlight one finding, which will lure you into the silly listicle. Here you go:

Professor McGonagall ranks the highest of all the characters for intellect.

Useful? Insightful?

IBM’s marketing continues to amaze me. By the way, if I were teaching those college sophomores, I would expect more from an analysis written in a dorm in 15 minutes after a long weekend of partying.

Stephen E Arnold, April 4, 2016

Video: The Top Info Dog

April 4, 2016

I love the capitalist tool. The founder rode a motorcycle. When I was in Manhattan, I had the pleasure of listening to the Malcolm-cycle burble and grunt when talking with a couple of pals. Wonderful that noise and odor.

I read “The Content Pyramid: And Why Video Must Be at the Top.” I am not sure the founder of the capitalist tool was into video. Well, the capitalist tool, in an article with a parental “must,” makes this point:

Video is the Matryoshka doll of content.

I did not know that. I know that some folks who shoot videos write scripts, sell them and then other people (who know better than the author) rewrite them.


The write up points out that a video has a script. But the video has pictures and audio.

I need to take a couple of deep breaths. My heart is racing with the impact of these comments.

I learned:

As more and more content consumption goes mobile, it’s usually a necessity to create multiple lengths and optimized formats of video content, so you should always have tiered, multi-channel thinking built in to your editorial process.

So how much video does the capitalist tool have on YouTube? 4,900 videos. But that’s not too many. I ran the query “Forbes” on Google Video and learned that there are 16,900 videos available. I checked Vimeo and learned there were 521 videos. I checked Blinkx and found quite a few false drops.

The problem is that I have never seen a reference to a Forbes video. I do receive mail addressed to my deceased father enjoining him to re-subscribe to the print edition of Forbes Magazine. But the video thing with the podcast, the clips, and the use of video in marketing. Not on my radar.

Remember the “must.” How about adding the concept of “effective”?

Video by itself is a bit of an ego play in my opinion. When no one watches the video or knows a video exists, what’s the point? Right, right. I forget. Some ad agencies love to do video shoots in Half Moon Bay. It is fun. How bright the video shines depends on more than the height of the pyramid in my opinion.

Stephen E Arnold, April 4, 2016

Predictive Analytics on a Budget

March 30, 2016

Here is a helpful list from Street Fight that could help small and mid-sized businesses find a data analysis platform that is right for them—“5 Self-Service Predictive Analytics Platforms.”  Writer Stephanie Miles notes that, with nearly a quarter of small and mid-sized organizations reporting plans to adopt predictive analytics, vendors are rolling out platforms for companies with smaller pockets than those of multinational corporations. She writes:

“A 2015 survey by Dresner Advisory Services found that predictive analytics is still in the early stages of deployment, with just 27% of organizations currently using these techniques. In a separate survey by IDG Enterprise, 24% of small and mid-size organizations said they planned to invest in predictive analytics to gain more value from their data in the next 12 months. In an effort to encourage this growth and expand their base of users, vendors with business intelligence software are introducing more self-service platforms. Many of these platforms include predictive analytics capabilities that business owners can utilize to make smarter marketing and operations decisions. Here are five of the options available right now.”

Here are the five platforms listed in the write-up: Versium’s Datafinder; IBM’s Watson Analytics; Predixion, which can run within Excel; Canopy Labs; and Spotfire from TIBCO. See the article for Miles’ description of each of these options.


Cynthia Murrell, March 30, 2016




DocPoint and Concept Searching: The ONLY Choice. Huh?

March 24, 2016

DocPoint is a consulting and services firm focusing on the US government’s needs. The company won’t ignore commercial firms’ inquiries, but the line up of services seems to be shaped for the world of GSAAdvantage users.

I noted that DocPoint has signed on to resell the Concept Searching indexing system. In theory, the SharePoint search service performs a range of indexing functions. In actual practice, like my grandmother’s cookies, many of the products are not cooked long enough. I tossed those horrible cookies in the trash. The licensees of SharePoint don’t have the choice I did when eight years old.

DocPoint is a specialist firm which provides what Microsoft cannot or no longer chooses to offer its licensees. Microsoft is busy trying to dominate the mobile phone market and doing bug fixes on the Surface product line.

The scoop about the DocPoint and Concept Searching deal appears in “DocPoint Solutions Adds Concept Searching To GSA Schedule 70.” The Schedule 70 reference means, according to

a long-term contract issued by the U.S. General Services Administration (GSA) to a commercial technology vendor. Award of a Schedule contract signifies that the GSA has determined that the vendor’s pricing is fair and reasonable and the vendor is in compliance with all applicable laws and regulations. Purchasing from pre-approved vendors allows agencies to cut through red tape and receive goods and services faster. A vendor doesn’t need to win a GSA Schedule contract in order to do business with U.S. government agencies, but having a Schedule contract can cut down on administrative costs, both for the vendor and for the agency. Federal agencies typically submit requests to three vendors on a Schedule and choose the vendor that offers the best value.

To me, the deal is a way for Concept Searching to generate revenue via a third party services firm.

In the write up about the tie up, I highlighted this single paragraph, which contains an amazing assertion:

A DocPoint partner since 2012, Concept Searching is the only [emphasis added] company whose solutions deliver automatic semantic metadata generation, auto-classification, and powerful taxonomy tools running natively in all versions of SharePoint and SharePoint Online. By blending these technologies with DocPoint’s end-to-end enterprise content management (ECM) offerings, government organizations can maximize their SharePoint investment and obtain a fully integrated solution for sharing, securing and searching for mission-critical information.

Note the statement “only company whose solutions deliver…” “Only” means, according to the Google define function:

No one or nothing more besides; solely or exclusively.

Unfortunately the DocPoint assertion about Concept Searching as the only firm appears to be wide of the mark. Concept Searching is one of many companies offering the functions set forth in the content marketing “news” story. In my files, I have the names of dozens of commercial firms offering semantic metadata generation, auto-classification, and taxonomy tools. I wonder if Layer2 or Smartlogic have an opinion about “only”?

Stephen E Arnold, March 24, 2016

Confused about Hadoop, Spark, and MapReduce? Not Necessary Now

March 24, 2016

I read “MapReduce vs. Apache Spark vs. SQL: Your questions answered here and at #StrataHadoop.” The article strikes at the heart of the Big Data boomlet. The options one has are rich, varied, and infused with consequences.

According to the write up:

Forester is predicting total market saturation for Hadoop in two years, and a growing number of users are leveraging Spark for its superior performance when compared to MapReduce.

Yikes! A mid tier consulting firm is predicting the future again. I almost stopped reading, but I was intrigued. Exactly what are the differences among these three systems, which appear to be really different? MapReduce is a bit of a golden oldie, and there is the pesky thought in my mind that Hadoop is a close relative of MapReduce. The Spark thing is an open source effort to create a system which runs quickly enough to make performance mesh with the idea that engineers have weekends.

The write up states:

As I mentioned in my previous post, we’re using this blog series to introduce some of the key technologies SAS will be highlighting at Strata Hadoop World. Each Q&A features the thought leaders you’ll be able to meet when you stop by the SAS booth #1022. Next up is Brian Kinnebrew who explains how new enhancements to SAS Data Loader for Hadoop can support Spark.

Yikes, yikes. The write up is a plea for booth traffic. In the booth a visitor can learn about the Hadoop, Spark, and MapReduce options.

The most interesting thing about the article is that it presents a series of questions and some SAS-skewed answers. The point is that SAS, the statistics company every graduate student in psychology learns to love, has a Data Loader Version 2.4 which is going to make life wonderful for the Big Data crowd.

I wondered, “Is this ‘extract, transform, and load’ all over again?”

The answer is not to get tangled up in the substantive differences among Hadoop, Spark, and MapReduce, as the title of the article implied. The point is that one can use NoSQL and regular SQL.

So what did I learn about the differences among Hadoop, Spark, and MapReduce?

Nothing. Just content marketing without much content in my view.

SAS, let me know if you want me to explain the differences to someone in your organization.

Stephen E Arnold, March 24, 2016

VPN Disables Right to Be Forgotten for Users in European Union

March 24, 2016

Individuals in the European Union have been granted legal protection to request that unwanted information about themselves be removed from search engines. An article from Wired, “In Europe, You’ll Need a VPN to See Real Google Search Results,” explains the latest on the European Union’s “right to be forgotten” laws. Formerly, privacy requests would only scrub results on sites with European country extensions like .fr, but now results will be filtered for anyone with a European IP address. However, European users can rely on a VPN to make their location appear to be elsewhere. The article offers context and insight:

“China has long had its “Great Firewall,” and countries like Russia and Brazil have tried to build their own barriers to the outside ‘net in recent years. These walls have always been quite porous thanks to VPNs. The only way to stop it would be for Google to simply stop allowing people to access its search engine via a VPN. That seems unlikely. But with Netflix leading the way in blocking access via VPNs, the Internet may yet fracture and localize.”

The demand for browsing the web using surreptitious methods, VPN or otherwise, only seems to be increasing. Whether motivations are to uncover personal information about certain individuals, watch Netflix content available in other countries or use forums on the Dark Web, the landscape of search appears to be changing in a major way.


Megan Feil, March 24, 2016


Yellowfin: Emulating i2 and Palantir?

March 22, 2016

I read “New BI Platform Focuses on Collaboration, Analytics.” What struck me about this explanation of a new version of Yellowfin is that the company is adding the type of features long considered standard in law enforcement and intelligence. The idea is that visualizations and collaboration are components of a commercial business intelligence solution.

I noted this paragraph:

Other BI vendors have tried to push data preparation and analysis responsibilities onto business users “because it’s easier to adapt what they have to fulfill that goal.” But Yellowfin “isn’t a BI tool attempting to make the business user a techie. It is about presenting data to users in an attractive visual representation, backed-up with some of the most sophisticated collaboration tools embedded into a BI platform on the market.”

Analyst involvement in the loading of data is a way to eliminate the issues of content ownership, indexing, and knowledge of what is in the system’s repository. I am not confident that any system which allows the user to whack away at whatever data have been processed by the system is ready for prime time. Sure, Google can win at Go, but the self driving auto ran into a bus.

The write up, which strikes me as New Age public relations, seems to want me to remember what’s new with Yellowfin with this mnemonic example: Curated. Baffled? Here’s what curated means:

  • Consistent: Governed, centralized and managed
  • Usable: by any business to consume analytics
  • Relevant: connected to all the data users need to do their jobs well
  • Accurate: data quality is paramount
  • Timely: Provide real time data and agile content development
  • Engaging: Offer a social or collaborative component
  • Deployed: widely across the organization.

Business intelligence is the new “enterprise search.” I am not sure the use of notions like curated and adding useful functions delivers the impact that some marketers promise. Remember that self driving car. Pesky humans.

Stephen E Arnold, March 23, 2016
