BA Insight: More Auto Classification for SharePoint

April 30, 2015

I thought automatic indexing and classifying of content was a slam dunk. One could download Elastic and Carrot2 or just use Microsoft’s tools to whip up a way to put accounting tags on accounting documents and planning tags on strategic management documents.

There are a number of SharePoint centric “automated solutions” available, and now there is one more.

I noticed on the BA Insight Web site this page:


There was some rah rah in US and Australian publications. But the big point is that either SharePoint administrators have a problem that existing solutions cannot solve or the competitors’ solutions don’t work particularly well.

My hunch is that automatic indexing and classifying in a wonky SharePoint set up is a challenge. The indexing can be done by humans and be terrible. Alternatively, the tagging can be done by an automated system and be terrible.

The issues range from entity resolution (remember the different spellings of Al Qaeda) to “drift.” In my lingo, “drift” means that the starting point for automated indexing just wanders as more content flows through the system and the administrator does not provide the time consuming and often expensive tweaking to get the indexing back on track.
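For readers who want to see the two problems in miniature, here is a rough sketch, not anyone's shipping product. It assumes a tiny hand-made list of canonical names (`CANONICAL`), a made-up similarity threshold of 0.75, and two hypothetical helpers, `resolve` and `tag_drift`. The first maps spelling variants to one entity; the second measures how far a recent batch of tags has wandered from a baseline batch.

```python
import re
import unicodedata
from collections import Counter
from difflib import SequenceMatcher

# Hypothetical authority list; a real system would load a curated file.
CANONICAL = ["Al Qaeda", "BA Insight", "SharePoint"]


def normalize(name):
    """Lowercase, strip accents, and drop everything except letters and digits."""
    text = unicodedata.normalize("NFKD", name)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    return re.sub(r"[^a-z0-9]+", "", text.lower())


def resolve(name, threshold=0.75):
    """Return the closest canonical entity, or None if nothing is close enough."""
    key = normalize(name)
    best, best_score = None, 0.0
    for candidate in CANONICAL:
        score = SequenceMatcher(None, key, normalize(candidate)).ratio()
        if score > best_score:
            best, best_score = candidate, score
    return best if best_score >= threshold else None


def tag_drift(baseline_tags, recent_tags):
    """Total variation distance between two tag samples (0 = same mix, 1 = disjoint)."""
    base, recent = Counter(baseline_tags), Counter(recent_tags)
    n_base, n_recent = sum(base.values()), sum(recent.values())
    if n_base == 0 or n_recent == 0:
        raise ValueError("both tag samples must be non-empty")
    tags = set(base) | set(recent)
    return 0.5 * sum(abs(base[t] / n_base - recent[t] / n_recent) for t in tags)
```

With this toy list, `resolve("al-Qaida")` and `resolve("Al-Qa'ida")` both land on "Al Qaeda". A drift score that creeps upward month over month is the signal that someone needs to do the time consuming tweaking.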

There are smarter systems than some of those marketed to the struggling SharePoint licensees. I profile a number of NGIA systems in my new monograph CyberOSINT: Next Generation Information Access.

The SharePoint folks are not featured in my study because the demands of multi lingual, real time content processing do not work with solutions from more traditional vendors.

On any given day, I am asked to sit through Webinars about concepts, semantics, and classification. If these solutions worked, the market for SharePoint add ins would begin to coalesce.

So far, dealing with the exciting world of SharePoint content processing remains a work very much in progress.

Stephen E Arnold, April 30, 2015

Amazon: The Digital Cost Cutting Quest

April 30, 2015

Nope, Amazon is not playing games. With its cloud services getting some love, the company wants to wallow in affection. I read “Amazon Pays $20M-$50M for ClusterK, the Startup That Can Run Apps on AWS at 10% of the Regular Price.” Who knows if the 10 percent figure is fudge.

Who cares?

I believe that Amazon’s cloud competitors will perk up. Hewlett Packard, IBM, and Microsoft have big cloud plans. Microsoft advertises a cloud that endures tough weather. Okay.

But the issue is cost. Microsoft gives stuff away. HP does the mitosis thing. IBM pumps out PR for Watson authored cookbooks AND quantum computing chips.

Amazon, on the other hand, continues to push for the undisputed crown as the digital Wal-Mart. My hunch is that price competition may be more important than the cloud prices its competitors are offering. Perception is important. Do you want lower costs or free Word, two HPs, and one collection of IBM financial reports? I go with lower costs.

Stephen E Arnold, April 30, 2015

Defense Contractor Makes Leap Investment Into Cybersecurity  

April 30, 2015

The expression goes “you should look before you leap,” meaning you should make plans and wise choices before you barrel headfirst into what might be a brick wall.  Some might say Raytheon could be heading that way with their recent investment, but The Wall Street Journal says they could be making a wise choice in the article, “Raytheon To Plow $1.7 Billion Into New Cyber Venture.”

Raytheon recently purchased Websense Inc., a cybersecurity company with over 21,000 clients.  Websense will form the basis of a new cyber joint venture and it is projected to make $500 million in sales for 2015.  Over the next few years, Raytheon predicts the revenue will surge:

“Raytheon, which is based in Waltham, Mass., predicted the joint venture would deliver high-single-digit revenue growth next year and mid-double-digit growth in 2017, and would be profitable from day one. Raytheon will have an 80% stake in the new cyber venture, with Vista Partners LLC holding 20%.”

While Raytheon is a respected name in the defense contracting field, its biggest clients have been the US military and intelligence agencies.  The article mentions how it might be difficult for Raytheon’s sales team and employees to switch to working with non-governmental clients.  Raytheon, however, is positioned to use Websense’s experience with commercial clients and its own dealings within the security industry to be successful.

Raytheon definitely looked before it leapt into this joint venture.  Where Raytheon has shortcomings, Websense will be able to compensate and vice versa.

Whitney Grace, April 30, 2015
Sponsored by, publisher of the CyberOSINT monograph

Altiar Decides to Embed dtSearch Engine

April 30, 2015

PR Newswire has a big announcement for fans of dtSearch Engine: “Announcing The Altiar Cloud-Based (Optimized For Microsoft Azure) ECM Platform Embedding The dtSearch Engine.”  Altiar is a leading enterprise collaborative content management platform based in the cloud, developed for prime optimization in Microsoft Azure.  To improve the enterprise content system, dtSearch’s search engine (its headlining product) will be integrated into the Altiar platform.

Altiar wants to improve how users find content on the platform.  Users can upload and create brand new content on Altiar, but with files from so many different programs it can be confusing to manage and locate them.  Altiar hopes to remedy any search problems with the integration:

“‘Utilizing the power of dtSearch Engine at the core, users can search across the entire database of files uploaded by other users as well as manage their own uploads simply and quickly,’ explains Altiar.  ‘Search results deliver relevant results from the content within every file as well as any additional data provided at upload.’”

Altiar restates what we already know about search: it is one of the most important functions of technology, and without it people would not be able to track down their content.  Comprehensive search across multiple programs is a standard feature in all computers these days.  Is searching the cloud more complex than a regular system?  What improvements need to be made to make search handle the extra work?

Whitney Grace, April 30, 2015

Microsoft Goes Mobile with Delve

April 30, 2015

Microsoft has made enhancements to the core functionality of Delve, as well as rolling out native mobile app versions for iOS and Android. ZDNet breaks the news in their article, “Microsoft Delivers iOS, Android Versions of Delve.”

The article begins:

“Microsoft has made native mobile versions of its Delve search and presentation app available for Android phones, Android wear devices and iPhones. Delve presents in card-like form information from Exchange, OneDrive for Business, SharePoint Online and Yammer enterprise-social networking components. Over the coming months Delve will be adding more content sources, including email attachments, OneNote and Skype for Business.”

This seems like a Microsoft component that has great potential for mobile use, since its focus is “at a glance” information retrieval. Keep an eye out to see what Stephen E. Arnold has to say about it in coming months. Arnold has made a career out of following all things search and enterprise. His dedicated SharePoint feed collects a lot of interesting reporting regarding SharePoint and the rest of Microsoft’s productivity offerings.

Emily Rae Aldridge, April 30, 2015



Tweet Storm: Ah, Social Media Mavens at Work

April 29, 2015

I think we tweet stories posted to this blog. Don’t know. Don’t care. A while back someone sent me an email pointing out that I was promoting a naked Miley Cyrus. Odd. I write about online information and content processing. Not much about naked. Not much about Miley Cyrus, a Disney confection, right? When I think of Disney, I recall a conversation with one of that outfit’s senior managers. The message conveyed to me was that Infoseek was the greatest thing since sliced bread. My analysis was different. Fortunately I think the invoice cleared. Maybe not. Disney is not an IT outfit at its core. But Twitter somehow had connected Beyond Search with the aforementioned Miley person. I think we had to call some folks we knew. Even then, Twitter required several weeks to figure out how Miley and I became digitally connected. Shudder.

I read with considerable amusement “How One Tweet Wiped $8bn Off Twitter’s Value.” Compared to other high tech issues, the single tweet thing is indicative of the importance of a single action. According to the write up, Twitter did something. Nasdaq did something. A filtering outfit did something. Bingo. Stock goes down. The write up stated:

It has all left Twitter, which did not have great news to share with investors anyway, somewhat red-faced.

Yep, Twitter seemed concerned that whatever happened was not so good. Twitter did not demonstrate the same concern and alacrity when Beyond Search and Miley were exchanging bits. Why am I not surprised? A single tweet is really important when it costs Twitter money. Other misconnects in the Twitter system are not quite as important in my experience.

Stephen E Arnold, April 29, 2015

Visual Browsing: A New, Next, Big Thing. Maybe.

April 29, 2015

The visual browsing bandwagon is rolling along. The sponsored content Guardian in the UK published “Visual Browsing: There’s a Critical Gap between How We See and How We Search.” The write up, which seems to be supported by SAP, states:

What we need is a visual browser for the world around us – a way of pointing at things which inspire thoughts and questions, giving us a rich, engaging means to find out what we don’t know, and those things we didn’t know how to search for using mere words.

Right, words. Blippar, the outfit connected with this visual search essay, points out:

Visual browsing sits at the heart of discovery in the internet of everything. It has the potential to bring the world to life around us, adding a story to every thing we see and the ability to sate our curiosity in every moment. Visual browsing is the most ‘native’ search engine there is, being based on context alone, driven by visual cues, location, time of day and the interests of the user, and not biased or limited by the understanding or vocabulary of the user. This will give us the ability to satisfy our curiosity more of the time – to visually search for the answers to the questions that intrigue us every day; to truly take search into the realm of ‘discovery’. We’re the most curious of species on the planet – it’s what’s got us to where we are today. The next generation of search must reflect this.

Blippar allows a person to take a picture using a mobile phone and then have the picture generate results.


If you want to see examples of visual browsing, point your browser to Qwant. This is the French Web search system owned in part by Axel Springer. For an example of a browser that itself incorporates visual browsing, download a copy of Vivaldi.

A picture, according to my somewhat addled great grandmother who wrote poetry with curse words as a metaphorical trope, is worth a thousand words. Here’s Qwant’s results for the query semantic search:


Visual browsing is one component of a next generation information access system, just not a main component. Clutter is not useful when certain types of information are required under difficult conditions such as a flash crash or someone lobbing ordnance in your direction.

I am trying to figure out the SAP and Blippar connection. Will my mobile phone snap of the SAP logo help? I think not.

Stephen E Arnold, April 29, 2015

Recorded Future: The Threat Detection Leader

April 29, 2015

The Exclusive Interview with Jason Hines, Global Vice President at Recorded Future

In my analyses of Google technology, despite the search giant’s significant technical achievements, Google has a weakness. That “issue” is the company’s comparatively weak time capabilities. Identifying the specific time at which an event took place or is taking place is a very difficult computing problem. Time is essential to understanding the context of an event.

This point becomes clear in the answers to my questions in the Xenky Cyber Wizards Speak interview, conducted on April 25, 2015, with Jason Hines, one of the leaders in Recorded Future’s threat detection efforts. You can read the full interview with Hines on the Cyber Wizards Speak site at the Recorded Future Threat Intelligence Blog.

Recorded Future is a rapidly growing, highly influential start up spawned by a team of computer scientists responsible for the Spotfire content analytics system. The team set out in 2010 to use time as one of the linchpins in a predictive analytics service. The idea was simple: Identify the time of actions, apply numerical analyses to events related by semantics or entities, and flag important developments likely to result from signals in the content stream. The idea was to use time as the foundation of a next generation analysis system, complete with visual representations of otherwise unfathomable data from the Web, including forums, content hosting sites like Pastebin, social media, and so on.
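To make the “time first” idea concrete, here is a toy sketch. It is my illustration, not Recorded Future’s method. It assumes the hard part (pulling an entity and a date out of each item) is already done, and it uses a hypothetical function, `flag_spikes`, with an arbitrary rule: flag a day when an entity’s mention count exceeds the mean of its earlier days by more than two standard deviations.

```python
from collections import Counter
from datetime import date
from statistics import mean, pstdev


def flag_spikes(mentions, min_sigma=2.0, min_history=3):
    """Flag days on which an entity is mentioned far more often than before.

    mentions: iterable of (entity, date) pairs, one pair per mention.
    Returns a dict mapping each entity to a list of flagged dates.
    Simplification: days with zero mentions are ignored, not counted as zeros.
    """
    per_entity = {}
    for entity, day in mentions:
        per_entity.setdefault(entity, Counter())[day] += 1

    flagged = {}
    for entity, counts in per_entity.items():
        days = sorted(counts)
        for i, day in enumerate(days):
            history = [counts[d] for d in days[:i]]
            if len(history) < min_history:
                continue  # too little history to judge this day
            threshold = mean(history) + min_sigma * pstdev(history)
            if counts[day] > threshold:
                flagged.setdefault(entity, []).append(day)
    return flagged


if __name__ == "__main__":
    sample = (
        [("Sony", date(2015, 4, 1))] * 2
        + [("Sony", date(2015, 4, 2))] * 3
        + [("Sony", date(2015, 4, 3))] * 2
        + [("Sony", date(2015, 4, 4))] * 10
    )
    print(flag_spikes(sample))
```

The point of the sketch is the data shape: once every item carries a reliable time stamp, spotting a hot spot is arithmetic. Getting the time stamp right is the difficult computing problem.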

Recorded Future Interface

A Recorded Future data dashboard makes it easy for law enforcement or intelligence professionals to identify important events and, with a mouse click, zoom to the specific data of importance to an investigation. (Used with the permission of Recorded Future, 2015.)

Five years ago, the tools for threat detection did not exist. Components like distributed content acquisition and visualization provided significant benefits to enterprise and consumer applications. Google, for example, built a multi-billion business using distributed processes for Web searching. integrated visualization into its cloud services to allow its customers to “get insight faster.”

According to Jason Hines, one of the founders of Recorded Future and a former Google engineer, “When our team set out about five years ago, we took on the big challenge of indexing the Web in real time for analysis, and in doing so developed unique technology that allows users to unlock new analytic value from the Web.”

Recorded Future attracted attention almost immediately. In what was an industry first, Google and In-Q-Tel (the investment arm of the US government) invested in the Boston-based company. Threat intelligence is a field defined by Recorded Future. The ability to process massive real time content flows and then identify hot spots and items of interest to a matter allows an authorized user to identify threats and take appropriate action quickly. Fueled by commercial events like the security breach at Sony and cyber attacks on the White House, threat detection is now a core business concern.

The impact of Recorded Future’s innovations on threat detection was immediate. Traditional methods relied on human analysts. These methods worked but were and are slow and expensive. The use of Google-scale content processing combined with “smart mathematics” opened the door to a radically new approach to threat detection. Security, law enforcement, and intelligence professionals understood that sophisticated mathematical procedures combined with a real-time content processing capability would deliver a new and sophisticated approach to reducing risk, which is the central focus of threat detection.

In the exclusive interview with Xenky, the law enforcement and intelligence information service, Hines told me:

Recorded Future provides information security analysts with real-time threat intelligence to proactively defend their organization from cyber attacks. Our patented Web Intelligence Engine indexes and analyzes the open and Deep Web to provide you actionable insights and real-time alerts into emerging and direct threats. Four of the top five companies in the world rely on Recorded Future.

Despite the blue ribbon technology and support of organizations widely recognized as the most sophisticated in the technology sector, Recorded Future’s technology is a response to customer needs in the financial, defense, and security sectors. Hines said:

When it comes to security professionals we really enable them to become more proactive and intelligence-driven, improve threat response effectiveness, and help them inform the leadership and board on the organization’s threat environment. Recorded Future has beautiful interactive visualizations, and it’s something that we hear security administrators love to put in front of top management.

As the first mover in the threat intelligence sector, Recorded Future makes it possible for an authorized user to identify high risk situations. The company’s ability to help forecast and spotlight threats likely to signal a potential problem has obvious benefits. For security applications, Recorded Future identifies threats and provides data which allow adaptive perimeter systems like intelligent firewalls to proactively respond to threats from hackers and cyber criminals. For law enforcement, Recorded Future can flag trends so that investigators can better allocate their resources when dealing with a specific surveillance task.

Hines told me that financial and other consumer centric firms can tap Recorded Future’s threat intelligence solutions. He said:

We are increasingly looking outside our enterprise and attempt to better anticipate emerging threats. With tools like Recorded Future we can assess huge swaths of behavior at a high level across the network and surface things that are very pertinent to your interests or business activities across the globe. Cyber security is about proactively knowing potential threats, and much of that is previewed on IRC channels, social media postings, and so on.

In my new monograph CyberOSINT: Next Generation Information Access, Recorded Future emerged as the leader in threat intelligence among the 22 companies offering NGIA services. To learn more about Recorded Future, navigate to the firm’s Web site.

Stephen E Arnold, April 29, 2015

Retail Feels Internet Woes

April 29, 2015

Mobile Web sites, mobile apps, mobile search, mobile content, and the list goes on and on for Web-related material to be mobile-friendly.  Online retailers are being pressured to make their digital storefronts applicable to the mobile users, because more people are using their smartphones and tablets over standard desktop and laptop computers.  It might seem easy to design an app and then people can download it for all of their shopping needs, but according to Easy Ask things are not that simple: “Internet Retailer Reveals Mobile Commerce Conversion Troubles.”

The article reveals that research conducted by Spreadshirt CTO Guido Laures shows that while there is a high demand for mobile friendly commerce applications and Web sites, very few people are actually purchasing products through these conduits.  Why?  The problem relates to the lack of spontaneous browsing and one of the iPhone 6’s main selling features: a big screen.

“While mobile-friendly responsive designs and easier mobile checkouts are cited as inhibitors to mobile commerce conversion, an overlooked and more dangerous problem is earlier in the shopping process.  Before they can buy, customers first need to find the product they want.  Small screen sizes, clumsy typing and awkward scrolling gestures render traditional search and navigation useless on a smartphone.”

Easy Ask says that these problems can be resolved by using a natural language search application over the standard keyword search tool.  It says that:

“A keyword search engine leaves you prone to misunderstanding different words and returning a wide swath of products that will frustrate your shoppers and continue you down the path of poor mobile customer conversion.”

Usually natural language voice search tools misunderstand words and return funny phrases.  The article is a marketing tool to highlight the key features of Easy Ask technology, but they do make some key observations about mobile shopping habits.

Whitney Grace, April 29, 2015

Cerebrant Discovery Platform from Content Analyst

April 29, 2015

A new content analysis platform boasts the ability to find “non-obvious” relationships within unstructured data, we learn from a write-up hosted at PRWeb, “Content Analyst Announces Cerebrant, a Revolutionary SaaS Discovery Platform to Provide Rapid Insight into Big Content.” The press release explains what makes Cerebrant special:

“Users can identify and select disparate collections of public and premium unstructured content such as scientific research papers, industry reports, syndicated research, news, Wikipedia and other internal and external repositories.

“Unlike alternative solutions, Cerebrant is not dependent upon Boolean search strings, exhaustive taxonomies, or word libraries since it leverages the power of the company’s proprietary Latent Semantic Indexing (LSI)-based learning engine. Users simply take a selection of text ranging from a short phrase, sentence, paragraph, or entire document and Cerebrant identifies and ranks the most conceptually related documents, articles and terms across the selected content sets ranging from tens of thousands to millions of text items.”

We’re told that Cerebrant is based on the company’s prominent CAAT machine learning engine. The write-up also notes that the platform is cloud-based, making it easy to implement and use. Content Analyst launched in 2004, and is based in Reston, Virginia, near Washington, DC. They also happen to be hiring, in case anyone here is interested.

Cynthia Murrell, April 29, 2015


