
An Open Source Search Engine to Experiment With

May 1, 2016

Apache Lucene receives the most headlines when it comes to discussion of open source search software.  My RSS feed pulled up another open source search engine that shows promise as a decent piece of software.  Open Semantic Search is free software that can be used for text mining, analytics, search, data exploration, and other research tasks.  It is based on Elasticsearch/Apache Solr’s open source enterprise search.  It was designed with open standards and robust semantic search.

As with any open source search, it can be programmed with numerous features based on the user’s preference.  These include tagging, annotation, support for varying file formats and multiple data sources, data visualization, newsfeeds, automatic text recognition, faceted search, interactive filters, and more.  It can also be programmed for mobile platforms, metadata management, and file system monitoring.

Open Semantic Search is described as

“Research tools for easier searching, analytics, data enrichment & text mining of heterogeneous and large document sets with free software on your own computer or server.”

While its base code is derived from Apache Lucene, it takes the original product and builds something better.  Proprietary software is an expense dubbed a necessary evil if you work in a large company.  If, however, you are a programmer and have the time to develop your own search engine and analytics software, do it.  It could even turn out better than the proprietary stuff.


Whitney Grace, May 1, 2016
Sponsored by, publisher of the CyberOSINT monograph

Nasdaq Joins the Party for Investing in Intelligence

April 6, 2016

The financial sector is hungry for intelligence to help curb abuses in capital markets, judging by recent actions of Goldman Sachs and Credit Suisse. The article “Nasdaq invests in ‘cognitive’ technology,” from BA wire, announces Nasdaq’s investment in Digital Reasoning. Nasdaq plans to connect Digital Reasoning algorithms with its own technology, which surveils trade data. The article explains the benefits of joining these two products:

“The two companies want to pair Digital Reasoning software of unstructured data such as voicemail, email, chats and social media, with Nasdaq’s Smarts business, which is one of the foremost software for monitoring trading on global markets. It is used by more than 40 markets and 12 regulators. Combining the two products is designed to assess the context, content and relationships behind trading and spot signals that could indicate insider trading, market manipulation or even expenses rules violations.”

We have followed Digital Reasoning, and other intel vendors like them, for quite some time as they target sectors ranging from healthcare to law to military. This is just a case of another software intelligence vendor making the shift to the financial sector. Following the money appears to be the name of the game.


Megan Feil, April 6, 2016


Yellowfin: Emulating i2 and Palantir?

March 22, 2016

I read “New BI Platform Focuses on Collaboration, Analytics.” What struck me about this explanation of a new version of Yellowfin is that the company is adding the type of features long considered standard in law enforcement and intelligence. The idea is that visualizations and collaboration are components of a commercial business intelligence solution.

I noted this paragraph:

Other BI vendors have tried to push data preparation and analysis responsibilities onto business users “because it’s easier to adapt what they have to fulfill that goal.” But Yellowfin “isn’t a BI tool attempting to make the business user a techie. It is about presenting data to users in an attractive visual representation, backed-up with some of the most sophisticated collaboration tools embedded into a BI platform on the market.”

The reason for analyst involvement in the loading of data is a way to eliminate the issue of content ownership, indexing, and knowledge of what is in the system’s repository. I am not confident that any system which allows the user to whack away at whatever data have been processed by the system is ready for prime time. Sure, Google can win at Go, but the self driving auto ran into a bus.

The write up, which strikes me as New Age public relations, seems to want me to remember what’s new with Yellowfin with this mnemonic example: Curated. Baffled? Here’s what curated means:

  • Consistent: Governed, centralized and managed
  • Usable: by any business to consume analytics
  • Relevant: connected to all the data users need to do their jobs well
  • Accurate: data quality is paramount
  • Timely: Provide real time data and agile content development
  • Engaging: Offer a social or collaborative component
  • Deployed: widely across the organization.

Business intelligence is the new “enterprise search.” I am not sure the use of notions like curated and adding useful functions delivers the impact that some marketers promise. Remember that self driving car. Pesky humans.

Stephen E Arnold, March 23, 2016

Tech Unicorns May Soon Disappear as Fast as They Appeared

March 15, 2016

Silicon Valley “unicorns,” private companies valued at one billion dollars or more, may not see the magic last. The article “Palantir co-founder Lonsdale calls LinkedIn plunge a bad sign for unicorns” from Airline Industry Today questions the future of companies like LinkedIn, whose true value has yet to result in ever-increasing profits. After LinkedIn disappointed Wall Street with lower earnings and revenue, investors devalued the company by about $10 billion. Joe Lonsdale, the Formation 8 venture investor who co-founded Palantir Technologies, is quoted as stating:

“A lot of LinkedIn’s value, according to how many of us think about it, is tied to what it will achieve in the next five to 10 years,” Lonsdale said in an appearance on CNBC’s “Squawk Alley” on Friday. “It is very similar to a unicorn in that way. Yes, it is making a few billion in revenue and it’s a public company but it has these really big long-term plans as well and is very similar to how you see these other companies.” He added a lot of people who have been willing to suspend disbelief aren’t doing that anymore. “At this point, people are asking, ‘Are you actually going to be able to keep growing?’ And they’re punishing the unicorns and punishing the public companies the same way.”

Lonsdale understands why many private companies postpone an IPO for as long as possible, given these circumstances. Regardless of the pros and cons of when a company should go public, the LinkedIn devaluation seems likely to send a message. Whether that message frightens similar companies into staying private for longer or changes profitability norms for younger tech companies remains to be seen.


Megan Feil, March 15, 2016



Delve Is No Jarvis

March 3, 2016

A podcast at SearchContentManagement, “Is Microsoft Delve Iron Man’s Edwin Jarvis? No Way,” examines the ways Delve has yet to live up to its hype. Microsoft extolled the product when it was released as part of the Office 365 suite last year. As any developer can tell you, though, it is far easier to market polished software than to deliver it. Editor Lauren Horwitz explains:

“While it was designed to be a business intelligence (BI), enterprise search and collaboration tool wrapped into one, it has yet to make good on that vision. Delve was intended to be able to search users’ documents, email messages, meetings and more, then serve up relevant content and messages to them based on their content and activities. At one level, Delve has failed because it hasn’t been as comprehensive a search tool as it was billed. At another level, users have significant concerns about their privacy, given the scope of documents and activities Delve is designed to scour. As BI and SharePoint expert Scott Robinson notes in this podcast, Delve was intended to be much like Edwin Jarvis, butler and human search tool for Iron Man’s Tony Stark. But Delve ain’t no Jarvis, Robinson said.”

So, Delve was intended to learn enough about a user to offer them just what they need when they need it, but the tool did not tap deeply enough into the user’s files to effectively anticipate their needs. On top of that, its process is so opaque that most users don’t appreciate what it is doing, Robinson indicated. For more on Delve’s underwhelming debut, check out the ten-minute podcast.


Cynthia Murrell, March 3, 2016



Palantir: A Dying Unicorn or a Mad, Mad Sign?

February 26, 2016

I learned that some wag posted a Mad Magazine-type cartoon with an MBA-ish message. I am not sure if this is a message from the heart of a disgruntled Hobbit or someone angling for a writer’s job on a late night talk show.

Here’s the image I saw:

[Image: dead unicorn final]

The point seems to be that the value of Palantir is in doubt. With the roiling of the financial valuations for outfits with billion dollar plus valuations, employees who work for stakes in a zoom zoom outfit may be in for a cold, cold night.

I have inserted this alleged real-deal poster in my forthcoming overview of Palantir. If you want to reserve a copy, write benkent2020 [at] yahoo dot com. The 50 page report from ArnoldIT is US$99. The report will be available for sale in April 2016.

The report covers Palantir’s differences from Autonomy’s and i2’s augmented intelligence systems, examples of the “helper” interfaces, and a gathering of open source information about the firm. We have examined Palantir’s publicly accessible technical materials and identified 18 interesting technical innovations. A subset of this larger Palantir analysis will be included in the forthcoming Dark Web Notebook. I will offer some general comments in my forthcoming interview for the Singularity One on One video podcast as well.

Exciting if you follow how search-centric systems are shaped to perform value-added services for government and commercial clients. Open source with lipstick is a business model I find quite interesting to think about.

Stephen E Arnold, February 26, 2016

Startup Semantic Machines Scores Funding

February 26, 2016

A semantic startup looks poised for success with experienced executives and a hefty investment, we learn from “Artificial Intelligence Startup Semantic Machines Raises $12.3 Million” at VentureBeat. Backed by investors from Bain Capital Ventures and General Catalyst Partners, the enterprise focuses on deep learning and improved speech recognition. The write-up reveals:

“Last year, Semantic Machines named Larry Gillick as its chief technology officer. Gillick was previously chief speech scientist for Siri at Apple. Now Semantic Machines is looking to go further than Siri and other personal digital assistants currently on the market. ‘Semantic Machines is developing technology that goes beyond understanding commands, to understanding conversations,’ the startup says on its website. ‘Our Conversational AI represents a powerful new paradigm, enabling computers to communicate, collaborate, understand our goals, and accomplish tasks.’ The startup is building tools that third-party developers will be able to use.”

Launched in 2014, Semantic Machines is based in Newton, Massachusetts, with offices in Berkeley and Boston. The startup is also seeking to hire a few researchers and engineers, in case anyone is interested.


Cynthia Murrell, February 26, 2016


A Guide to Google-Ize Your Business

February 16, 2016

To Google is a verb, meaning to search specifically for information on the Google search engine.  If a user is unable to find information on Google, they either change their key words or look for a different option.  In other words, if you do not show up on Google, then you might as well not exist.  Perhaps it is a little drastic to make the claim, but without a Web presence, users, who double as consumers, are less likely to visit your business.  Consumers take an active approach to shopping these days by doing research before they visit or purchase any goods or services.  A good Web presence alerts them to a company’s capabilities and how it can meet the consumers’ needs.

If you are unsure of how to establish a Web presence, much less a Google Web presence, then there is a free eBook to help you get started.  The Reach Local blog posted information about “Master Google My Business With Our New Ebook.” Google My Business is a free tool from Google for publishing your business information in Google+, Google Maps, and local search results.

“Without accurate and up to date information on Google, you could be missing out on leads and potential customers either by having the wrong phone number and address listed or by not appearing at all in local search results for products and services relevant to your business.  We want to help you take control of your information on the web, so we put together a helpful eBook that explains what Google My Business is, how to set up and verify your business, and tips for managing your information and tracking your progress.”

The free eBook “Your Guide To Google My Business” written by the Reach Local folks is an instruction manual on how to take advantage of the Google tool without going through the headache of trying to understand how it works.  Now if only Windows 10 would follow a similar business pattern to help users understand how it works.



Whitney Grace, February 16, 2016


Gartner and Business Intelligence Magic Thing

February 14, 2016

I love consultants, especially mid tier consultants. The idea that folks who are reasonably pleasant can become experts in various market sectors is a signal that optimism is alive and thriving in a sketchy economic swamp.

The mid tier consultants are a fave. These outfits provide more tradition than the webmaster or Visual Basic programmer who is out of a job. The ease with which one can become a consultant lends a certain squishiness to Lone Rangers offering expertise for hire.

The blue chip outfits are just too expensive for many folks who know they need help. Think of the difference between someone who jets to Lyon for lunch and the person who grabs a slice in Midtown.

Thus, there are blue chip outfits (the top drawer firms), the azure chip firms (companies either on their way up or down in the expertise Great Chain of Being), and the gray chip folks. The gray chip folks are the disaffected middle school teachers who decide to become self appointed experts in sponsored content for search engine optimization.

The write up “Critiquing the Gartner BI and Analytics MQ” will not elicit much of a response from the mid tier outfit responsible for the “analysis.” Legal eagles slap when the actual quadrant thing is reproduced.

But the write up hits some nerves in the sagging neck of the azure chip services firm; for example:

  1. Companies excluded for no apparent reason. (Maybe these outfits rejected the azure chip firm’s blandishments to buy services and be better understood?)
  2. A “kitchen sink” approach. (Maybe this means dumping stuff into a container and binge watching Happy Days on Hulu? Stuff breaks when hasty hands place dirty dishes in a sink.)
  3. Products are mixed up. The example is Design Studio. (Aren’t these software components pretty much the same? Sure they are, gentle mid tier consultant getting smart by searching Google for info. Sure they are.)
  4. Inconsistency. (The write up displays actual, high value, super secret, for some eyes only magic thingies. I looked at each graph and was confused in terms of what was presented and how the classifications changed in the span of one fiscal year. Aren’t I the dunce?)

The write up is not about hell fire and brimstone. Here’s the peace offering after the carpet bombing:

To be fair on Gartner, they have made a solid effort at explaining their rationale and, given there are some 500 vendors globally, vying for attention, narrowing down to this selection is a valiant effort. The care with which Gartner has made its understanding known is also commendable, even if some of those explanations are questionable. Another problem with the report is that it is static. It is a snapshot at a point in time that is biased in favor of one constituency and which does not, in my view, adequately recognize the necessary and sometimes difficult tensions that exist between IT and lines of business when it comes to rationalizing or consolidating BI tools in an enterprise setting. I think Gartner has done the industry a major favor by decoupling the reporting element and focusing upon the modern approach to BI. But that’s not enough.

Maybe another azure chip outfit will leap into this opportunity. A mere 500 vendors. The number seems low to me. I eagerly await the next intellectual semi-truck load of insights from the azure chip sector. Yes, eager am I.

Stephen E Arnold, February 14, 2016

Cybercrime as a Service Impacts Hotel Industry and Loyalty Points

February 4, 2016

The marketplaces of the Dark Web provide an interesting case study in innovation. For example, the article “Three types of Dark Web fraud aimed at the hotel industry” was recently published on Cybel Blog. Delving into the types of cybercrime related to the hospitality industry, the article, like many others recently, discusses cybercriminals’ preference for dealing in account login information rather than credit cards, because detection is less likely. Travel agencies on the Dark Web are one way cybercrime as a service exists:

“Dark Web “travel agencies” constitute a third type of fraud affecting hotel chains. These “agencies” offer room reservations at unbeatable prices. The low prices are explained by the fact that the seller is using fraud and hacking. The purchaser contacts the seller, specifying the hotel in which he wants to book a room. The seller deals with making the reservation and charges the service to the purchaser, generally at a price ranging from a quarter to a half of the true price per night of the room. Many sellers boast of making bookings without using stolen payment cards (reputed to be easy for hotels to detect), preferring to use loyalty points from hacked client accounts.”

What will they come up with next? The business to consumer (B2C) sector includes more than hotels and presents a multitude of opportunities for cybertheft. Innovation must occur on the industry side as well in order to thwart such hacks.


Megan Feil, February 4, 2016

