
Explaining Markov Chains

March 6, 2015

Do you know what a Markov chain is? If not, read about “Markov Chains” on the Circuits of Imagination blog:

“A Markov chain is a set of transitions from one state to the next, such that the transition from the current state to the next depends only on the current state; the previous and future states do not affect the probability of the transition. A transition’s independence from future and past states is called the Markov property.”

In short, Markov chains are a way to explain patterns that happen over time, and they were once used to document human behavior. The chains are not the best way to model human behavior, because they only exist in the present. They do not take into account past or future experiences, a quality otherwise called “memoryless.” The chains can rely only on the most recent action.

Markov chains are useful to identify abnormal behavior in systems that don’t exhibit the Markov Property. How? If the system keeps making the wrong decisions based on its program, then it can be diagnosed and repaired. The post explains how the Markov chains are used in coding and provides an example to illustrate how developers can recognize them.
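For readers who want to see the idea in code, here is a minimal Python sketch. It is not the example from the Circuits of Imagination post. The states, the probabilities, and the function names (`next_state`, `simulate`, `unexpected_transitions`) are all assumptions made up for illustration. The sketch picks each next state by looking only at the current state, then flags observed transitions that the model rates as unlikely, which is one simple way to spot abnormal behavior.

```python
import random

# Hypothetical states and transition probabilities, chosen only for illustration.
TRANSITIONS = {
    "idle":    {"idle": 0.6, "working": 0.4},
    "working": {"idle": 0.3, "working": 0.6, "error": 0.1},
    "error":   {"idle": 1.0},
}


def next_state(current, rng):
    """Pick the next state using only the current state (the Markov property)."""
    options = TRANSITIONS[current]
    return rng.choices(list(options), weights=list(options.values()), k=1)[0]


def simulate(start, steps, seed=0):
    """Walk the chain for a number of steps and return the visited states."""
    rng = random.Random(seed)
    path = [start]
    for _ in range(steps):
        path.append(next_state(path[-1], rng))
    return path


def unexpected_transitions(path, threshold=0.05):
    """Return the transitions in an observed path that the model rates as unlikely."""
    flagged = []
    for current, following in zip(path, path[1:]):
        # A transition missing from the table is treated as probability 0.0.
        probability = TRANSITIONS.get(current, {}).get(following, 0.0)
        if probability < threshold:
            flagged.append((current, following, probability))
    return flagged


if __name__ == "__main__":
    print(simulate("idle", 10))
    print(unexpected_transitions(["idle", "working", "error", "working"]))
```

In the second call, the jump from "error" straight back to "working" is not in the table, so it is the only transition flagged. A real system would estimate the probabilities from logs instead of hard-coding them.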

Whitney Grace, March 06, 2015
Sponsored by, developer of Augmentext

A Message from the COO of Connotate

March 6, 2015

Tom Williams is the chief operating officer of Connotate, the industry leading platform for harvesting and monetizing content. Like most organizations, Connotate is looking to attract attention and connect with people on a personal level by using social media. Videos have rapidly become the favored medium, because they are easily digestible.

On Connotate’s YouTube channel, there are two videos on why content is important and how Connotate can help people use it to their advantage. The first video, “Tom Williams, COO of Connotate: Why They Must Evolve,” explains that old fashioned information methods no longer work in today’s world:

“Information service providers find themselves ill-equipped to deal with the challenges of incorporating web data. As you look at the scale from tens to hundreds to thousands, the old stare and compare, handwritten scripts, and off the shelf tools no longer apply. These techniques have a low barrier entry, but they don’t scale.”

The next video, “Tom Williams, COO of Connotate: Why Consider Connotate,” explains how Connotate can help organizations harness their content and make money from it. These are powerful, brief messages that get straight to the point of how Connotate can help.

Whitney Grace, March 06, 2015

Enterprise Search: Roasting Chestnuts in the Cloud

March 6, 2015

I read “Seeking Relevancy for Enterprise Search.” I enjoy articles about “relevance.” The word is ambiguous and must be disambiguated. Yep, that’s one of those functions that search vendors love to talk about and rarely deliver.

The point of the write up is that enterprise content should reside in the cloud. The search system can then process the information, build an index, and deliver a service that allows a single search to output a mix of hits.

Sounds good.

My concern is that I am not sure that’s what users want. The reason for my skepticism is that the shift to the cloud does not fix the broken parts of information retrieval. The user, probably an employee or consultant authorized to access the search system, has to guess which keywords unlock the information in the index.

Search vendors continue to roast the chestnuts of results lists, keyword search, and work arounds for performance bottlenecks. The time is right to move from selling chestnuts to those eager to recapture a childhood flavor and move to a more efficient information delivery system.

That’s sort of a problem for many searchers today. In many organizations, users express frustration with search because multiple queries are needed to find information that seems relevant. Then the mind numbing, time consuming drudgery begins. The employee opens a hit, scans the document, copies the relevant bit if it is noted in the first place, and pastes the item into a Word file or a OneNote type app and repeats the process. Most users look at the first page of results, pick the most likely suspect, and use that information.

No, you say.

I suggest you conduct the type of research my team and I have been doing for many years. Expert searchers are a rare species. Today’s employees perceive themselves as really busy, able to make decisions with “on hand” information, and believe themselves to be super smart. Armed with this orientation, whatever these folks do is, by definition, pretty darned good.

It is not. Just don’t try telling a 28 year old that she is not a good searcher and is making decisions without checking facts and assessing the data indexed by a system.

What’s the alternative?

My most recent research points to a relatively new branch or tendril of information access. I use the term “cyberosint” to embrace systems that automatically collect, analyze, and output information to users. Originally these systems focused on public content like Facebook, Twitter posts, and Web content. Now the systems are moving inside the firewall.

The result is that the employee interacts with reports generated with information presented in the form of answers, a map with dynamic data showing where certain events are now taking place, and in streams of data that go into other systems such as a programmatic trading application on Wall Street.

Yes, keyword search is available to these systems which can be deployed on premises, in the cloud, or in a hybrid deployment. The main point is that the chokehold of keyword search is broken using smart software, automatic personalization, and reports.

Keyword search is not an enterprise application. Vendors describe the utility function as the ringmaster of the content circus. Traditional enterprise search is like a flimsy card table upon which has been stacked a rich banquet of features and functions.

The card table cannot support the load. The next generation information access systems, while not perfect, represent a significant shift in information access. To learn more, check out my new study, CyberOSINT.

Roasting chestnuts in the cloud delivers the same traditional chestnut. That’s the problem. Users want more. Maybe a free range, organic gourmet burger?

Stephen E Arnold, March 6, 2015

AlchemyAPI: Beefing Up Watson

March 5, 2015

Why does Watson need beefing up? I have been inundated with information about Watson the game show winner, Watson the recipe maker, Watson the cancer fighter, and Watson the developer’s Eagle Scout.

IBM acquired a company involved in smart software and predictive analytics. That’s great for the owners of AlchemyAPI. I am just curious why the analytics tools IBM has developed itself, the SPSS toolset, and the analytic components like WebFountain sitting on the shelf in an IBM office somewhere are not enough.


At any rate, the news is presented in “IBM Buys AlchemyAPI to Boost Watson Computing Unit.” The write up reports in the best spirit of recycling IBM PR:

The purchase is designed to boost IBM’s push into more human-like computing services, based around its Watson technology, which can sift huge amounts of data, learn from the results and respond to spoken questions.

I quite like the phrase “IBM is trying to build a big business around Watson.”

No kidding. What does desperation smell like? The odor of cold muffins and warm laptops in a Manhattan office?

When it comes to delivering an integrated suite of services based on predictive analytics and other next generation goodies, I am not sure just buying stuff is the optimal approach. In my opinion, IBM seems to be struggling with the whole Watson offering. Executives unable to land deals with the velocity of Recorded Future, RedOwl, and some other hot outfits may believe that an acquisition is just what Dr. Watson needs.

We’ll see if a startup can power the new smart software economy which Google also covets and Hewlett Packard is chasing with the Autonomy black box of goodies.

Stephen E Arnold, March 5, 2015

MovieGraph from Senzari Offers a Better Way to Find Movies

March 5, 2015

The article on Dataversity titled Creating Detailed Semantic Graphs Around Video Content with MovieGraph suggests a possible breakthrough in video sense making. MovieGraph is the platform of entertainment data company Senzari. Chief Operating Officer Demian M. Bellumio spoke to the methods utilized by MovieGraph, which include machine learning and an API for recommendations. The article continues to refer to Bellumio’s statements,

“Senzari focused on metadata while building MovieGraph. He also said that Senzari trained machine learning algorithms to break down the narratives of movies, extracting the data with precision across each element. The company designed their own matrix for cataloging movies; MovieGraph uses machine learning techniques to semantically tag and organize every movie and TV show across hundreds of dimensions. Senzari also added proprietary narrative features to MovieGraph such as setting, conflict, symbols or tones present in a film.”

The possibilities for recommendations seem much more targeted than the Netflix model, which often makes suggestions based on categories that are too wide and abstracted to be accurate. The article mentions that since Netflix only recently closed its public API, MovieGraph may be in a position to fill that gap. MovieGraph is also built to work with MusicGraph, another Senzari platform. Content creators in particular might find the crossover to be useful in terms of finding appropriate content for their projects.

Chelsea Kerwin, March 05, 2015


Looking Towards 2015’s Data Trends

March 5, 2015

Here we go again! Another brand new year, and it is time to predict where data will take us. For the past few years it has been all about big data, and while it has a solid base, other parts of data science are coming into the limelight. While LinkedIn is a social network for professionals, one can also read articles there on career advice, hot topics, and new trends in various fields. Kurt Cagle is a data science expert and has written on the topic for over ten years. His recent article, “Ten Trends In Data Science In 2015,” was posted on LinkedIn in December.

He calls the four data science areas the Data Cycle: analysis, awareness, governance, and acquisition. From Cagle’s perspective, in 2014 big data matured, data visualization software was in high demand, and semantics grew. He predicts 2015 will hold much of the same:

“…with the focus shifting more to the analytics and semantic side, and Hadoop (and Map/Reduce without Hadoop) becoming more mainstream. These trends benefit companies looking for a more comprehensive view of their information environment (both within and outside the company), and represent opportunities in the consulting space for talented analysts, programmers and architects.”

Data visualization is going to get even bigger in the coming year. Hybrid data stores with more capabilities will become more common, semantics will grow even larger and specializing companies will be bought up, and there will be more competition for Hadoop. Cagle also predicts that work will be done on a universal query language and that data analytics will move beyond standard SQL.

His ending observations explain that data silos will be phased into open data platforms, making it easier not just for people to use technology but also for systems to work with one another.

Whitney Grace, March 05, 2015

SharePoint On Premises Is Alive

March 5, 2015

The recent news of the upcoming release of SharePoint 2016 has a lot of folks breathing a deep sigh of relief. On-premises support is important for a lot of organizations. Redmond Magazine covers the latest in their article, “SharePoint MVPs: ‘On-Prem is Very Much Alive and Well.’”

The article begins:

“A number of prominent SharePoint MVP experts say they are confident that the on-premises server edition of SharePoint has a long future despite Microsoft’s plans to extend the capabilities of its online counterpart — Office 365 — as well as options to host it in a public cloud service such as Azure. At the same time, many realize that customers are increasingly moving (or considering doing so) some or all of their deployments to an online alternative, either by hosting it in the cloud or moving to Office 365 and SharePoint Online.”

Recent news suggests that a preview of SharePoint 2016 will be available in May. Stephen E. Arnold runs a helpful news service devoted to all things search. His SharePoint coverage focuses on the latest tips, news, and tricks, and can be found on his dedicated SharePoint feed. It will be interesting to see the final details of SharePoint 2016 and how well it is received by the user community.

Emily Rae Aldridge, March 05, 2015

Good News for Textbook Publishers

March 4, 2015

I read “Students Reject Digital textbooks.” Textbook publishers have embraced slicing and dicing with alacrity. The idea is that a new textbook or collection of readings can be assembled with little input from a human editor. The future of automatically output texts seemed to be zeros and ones.

According to the write up, some students are not too thrilled with digital textbooks. I know that I find the iPad and Kindle a lousy way to read textbooks with illustrations, charts, and graphs. The iPad, for example, does not allow me to scale up an illustration in the reference book for Sony Vegas Professional. As a result, the illustrations are useless. A printed book delivers an image I can view. Score one for print in my experience.

The article reports:

As Good E Reader reports, a new survey by Student Monitor found that 87% of the students they spoke to preferred to buy or borrow textbooks as physical books. And a study from the University of Washington recently showed that one in four students who were given free digital textbooks still went out and bought a hard copy version, because they think it’s easier to take in information when they read it on paper as opposed to on a screen. And they’re probably right: last summer, a study found that readers absorb more information from paper books than from Kindles, and of course, if you’re up late studying, it’ll be easier to get to sleep afterward if you haven’t been staring at a backlit screen. I just hope that all these tech-eschewing students have got backpacks with strong shoulder straps.

Will textbooks become available in paper? Publishers want to make money, so paper may be the Bugatti Veyron of education. Digital, despite its warts, may prove to be the easier path to textbook revenue. How does one search a textbook in paper? Not very easily, but the same statement applies to many digital volumes I am licensed to use. And learning? Publishing is usually about money, I assert.

Stephen E Arnold, March 4, 2015

Silobreaker Forms Cyber Partnership with Norwich University

March 4, 2015

I learned that cyber OSINT capable Silobreaker has partnered with Norwich University. Norwich, the oldest private military college in the US, has a sterling reputation for cyber security courses and degree programs. The Silobreaker online threat intelligence product will be used in the institution’s cyber forensics classes.

Silobreaker’s cyber security product automatically collects open source information from news, blogs, feeds and social media. The system provides easy to use tools and visualizations to make sense of the content.

Kristofer Månsson, CEO and Co-Founder of Silobreaker, told Beyond Search:

By offering Silobreaker as part of their studies, Norwich University is addressing the need for a more holistic approach to threat intelligence in cyber security. This partnership showcases the power of Silobreaker to provide relevant context beyond the technical parameters of a threat, hack or a new malware. Understanding the threat landscape and anticipating potential risks will unquestionably also require the analysis of geopolitics, business and world events, which often influence and prompt attacks. We are excited to continue working with Norwich University and to open up the young minds of tomorrow to the ever-evolving cyber landscape.

Silobreaker is used by more than 80 Norwich students. The university offers the product across its cyber security classes including Cyber Criminalistics, Cyber Investigation and Network Forensics. Students learn how to apply Silobreaker’s next generation system to intelligence gathering in the context of their investigations. Students are required to use the technology throughout their independent research projects.

Aron Temkin, dean of the College of Professional Schools, said:

In order to maintain our excellence in cyber security research and training, we need to stay on top of the latest emerging technologies. Silobreaker is a powerful tool that is both user-friendly and flexible enough to fit within our cyber education programs.

Dr. Peter Stephenson, director of the university’s Center for Advanced Computing and Digital Forensics, added:

Students can get useful output quickly, and we do not have to turn a semester forensics class into a ‘How To Use Silobreaker’ session. Cyber events do not occur in a vacuum. There is context around them that often is hard to see. Silobreaker solves that. It cuts through the mass of information available on the Internet and helps our students get to the meat of an issue quickly and with a variety of ways of accessing and displaying it. This is a new way to look at cyber forensics.


Silobreaker is a data analytics company specializing in cyber security and risk intelligence. The company’s products help intelligence professionals to make sense of the overwhelming amount of data available today on the web. Silobreaker collects large volumes of open source data from news, blogs, feeds and social media and provides the tools and visualizations for analyzing and contextualizing such data. Customers save time by working more efficiently through big data-sets and improve their expertise and knowledge from examining and interpreting the data more easily.

Interviews with Silobreaker’s Mat Bjore are available via the free Search Wizards Speak service.

Opening Watson to the Masses

March 4, 2015

IBM is struggling financially, and one way it hopes to pull itself out of the swamp is to find new applications for its supercomputers and software. One way it is trying to cash in on Watson is to create cognitive computing apps. EWeek alerts open source developers, coders, and friendly hackers that IBM released a bunch of beta services: “13 IBM Services That Simplify The Building Of Cognitive Watson Apps.”

IBM now allows all software geeks the chance to add their own input to cognitive computing. How?

“Since its creation in October 2013, the Watson Developer Cloud (WDC) has evolved into a community of over 5,000 partners who have unlocked the power of cognitive computing to build more than 6,000 apps to date. With a total of 13 beta services now available, the IBM Watson Group is quickly expanding its developer ecosystem with innovative and easy-to-use services to power entirely new classes of cognitive computing apps—apps that can learn from experience, understand natural language, identify hidden patterns and trends, and transform entire industries and professions.”

The thirteen new IBM services involve language, text processing, analytical tools, and data visualization. These services can be applied to a wide range of industries and fields, improving the way people work and interact with their data. While it’s easy to imagine the practical applications, it remains to be seen how they will actually be used.

Whitney Grace, March 04, 2015
