
Research Like the Old School

April 24, 2015

There was a time before the Internet when, if you wanted to research something, you had to go to the library, dig through old archives, and check encyclopedias for quick facts.  While it seems that all information is at your disposal with a few keystrokes, search results are often polluted with paid ads, and unless your information comes from a trusted source, you cannot count it as fact.

LifeHacker, like many of us, knows that if you want to get the truth behind a topic, you have to do some old school sleuthing.  The article “How To Research Like A Journalist When The Internet Doesn’t Deliver” drills into tried-and-true research methods that will continue to withstand the sands of time or the wrecking ball (depending on how long libraries remain brick-and-mortar buildings).

The article pushes using librarians as resources and even goes as far as suggesting petitioning government agencies and filing FOIA requests for information.  Its claim that some information is only available in person or strictly to other librarians is both true and false.  Many libraries are trying to digitize their information but are limited by their budgets.  Also, unless the librarian works in a top secret archive, most of the information is readily available to anyone, with or without an MLS degree.

Old school interviews are always great, especially when you have to cite a source.  You can always cite your own interview and verify it came straight from the horse’s mouth.  One useful way to team the Internet with interviews is using it to track down interviewees.

Lastly, this is the best piece of advice from the article:

“Finally, once you’ve done all of this digging, visited government agencies, libraries, and the offices of the people with the knowledge you need, don’t lose it. Archive everything. Digitize those notes and the recordings of your interviews. Make copies of any material you’ve gotten your hands on, then scan them and archive them safely.”

The Internet is full of false information.  Putting a little more verification behind what you find makes the information safer to use or to claim as the truth.

These tips are useful, if a little obvious, but they still fail to mention the important step that all librarians know: doing the actual footwork and using proper search methods to find things.

Whitney Grace, April 24, 2015

Sponsored by, publisher of the CyberOSINT monograph

IBM Provides Simple How-To Guide for Cloudant

April 24, 2015

The article titled Integrate Data with Cloudant and CouchDB NoSQL Database Using IBM InfoSphere Information Server on IBM offers a breakdown of the steps necessary to load JSON documents and attachments to Cloudant. In order to follow the steps, the article notes that you will need Cloudant, CouchDB, and IBM InfoSphere DataStage. The article concludes,

“This article provided detailed steps for loading JSON documents and attachments to Cloudant. You learned about the job design to retrieve JSON documents and attachments from Cloudant. You can modify the sample jobs to perform the same integration operations on a CouchDB database. We also covered the main features of the new REST step in InfoSphere DataStage V11.3, including reusable connection, parameterized URLs, security configuration, and request and response configurations. The JSON parser step was used in examples to parse JSON documents.”

Detailed examples with helpful images guide you through each part of the process, and it is possible to modify the examples for CouchDB. Although it may seem like a statement of the obvious to the many loyal IBM users out there, perhaps there are people who still need to be told. If you are interested in learning about the federation of information with a logical and simple process, use IBM.
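For readers without DataStage handy, Cloudant speaks the standard CouchDB HTTP API, so the core step the article covers, loading a JSON document, reduces to an HTTP PUT against the database. A minimal sketch using only Python’s standard library; the account URL, database name, and document are placeholders, not values from the article:

```python
import json
import urllib.request

def build_doc_request(base_url, db, doc_id, doc):
    """Build the HTTP PUT request that would store `doc` as JSON
    in the CouchDB/Cloudant database `db` under `doc_id`."""
    url = f"{base_url}/{db}/{doc_id}"
    body = json.dumps(doc).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )

# Hypothetical account and document; actually sending the request
# (urllib.request.urlopen(req)) needs a live Cloudant/CouchDB server,
# which on success answers with {"ok": true, "id": ..., "rev": ...}.
req = build_doc_request(
    "https://account.cloudant.com", "orders", "order-001",
    {"customer": "acme", "total": 42.5},
)
```

Because Cloudant and CouchDB share this API, the same request shape works against either backend, which is why the article’s sample jobs can be repointed at a CouchDB database.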

Chelsea Kerwin, April 24, 2015

Sponsored by, publisher of the CyberOSINT monograph


Social Network Demographics by the Numbers

April 23, 2015

Social networking Web sites and their purposes are as diverse as the human population.  Arguably, if you were to use each of the most popular networks and try to keep on top of every piece of information that filters through the feed, one twenty-four-hour day would not be enough.

With social media becoming more ingrained in daily life, it makes one wonder who is using what network and for what purpose.  Business Insider discusses a recent BI Intelligence report on social media demographics in the article “Revealed: A Breakdown Of The Demographics For Each Of The Social Networks.”  Here are some of the facts: Facebook is still mostly female and remains the top network.  Twitter leans more heavily toward the male demographic, while YouTube reaches more adults in the 18-34 demographic than cable TV.  Instagram is considered the most important social network among teenagers, but Snapchat has the widest appeal amongst the younger crowd.  This point is the most important for professionals:

“LinkedIn is actually more popular than Twitter among U.S. adults. LinkedIn’s core demographic are those aged between 30 and 49, i.e. those in the prime of their career-rising years. Not surprisingly, LinkedIn also has a pronounced skew toward well-educated users.”

Facebook still reigns supreme and pictures are popular with the younger set, while professionals all tend to co-mingle on LinkedIn.  Unsurprising and not especially revealing information, but still interesting for the data junkie.  We wonder how social media will change in the coming year.

Whitney Grace, April 23, 2015

Sponsored by, publisher of the CyberOSINT monograph

Oracle Challenges HP Autonomy Service

April 22, 2015

The article titled Oracle Adds Big Data Integration Tool To Streamline Hadoop Deployments on Silicon Angle discusses news from Oracle that follows its determination that putting the right tools in front of users is the only way to enable success. The Data Integrator for Big Data is meant to create more opportunities to pull data from multiple repositories by treating them all the same. The article states,

“It’s an important step the company insists, because Big Data tools like Hadoop and Spark use languages like Java and Python, making them more suitable for programmers rather than database admins (DBAs). But the company argues that most enterprise data analysis is carried out by DBAs and ETL experts, using tools like SQL. Oracle’s Big Data integrator therefore makes any non-Hadoop developer “instantly productive” on Hadoop, added Pollock in an interview with PC World.”

Pollock also spoke to Oracle’s progress, claiming that it is the only company with the capability to generate Hive, Pig, and Spark transformations from a solitary mapping. For customers, this means not needing to know how to code in multiple programming languages. HP is also making strides in this line of work with its recent unveiling of software that integrates Vertica with HP Autonomy IDOL. Excitement ahead!
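The “one mapping, several engines” idea can be illustrated independently of Oracle’s actual product: a declarative description of a transformation is kept engine-neutral, and code for whichever engine will run it is generated on demand. A toy sketch; the mapping format and both generators are invented for illustration and have nothing to do with Oracle’s internals:

```python
def to_hiveql(mapping):
    """Render a declarative column/filter mapping as a HiveQL statement."""
    cols = ", ".join(mapping["columns"])
    return f"SELECT {cols} FROM {mapping['source']} WHERE {mapping['filter']}"

def to_pyspark(mapping):
    """Render the same mapping as an equivalent PySpark expression string."""
    cols = ", ".join(repr(c) for c in mapping["columns"])
    return (f"spark.table({mapping['source']!r})"
            f".filter({mapping['filter']!r}).select({cols})")

# One logical mapping, written once by a DBA/ETL expert in SQL-ish terms:
mapping = {
    "source": "sales",
    "columns": ["region", "total"],
    "filter": "total > 100",
}
hive_job = to_hiveql(mapping)    # SELECT region, total FROM sales WHERE total > 100
spark_job = to_pyspark(mapping)  # equivalent DataFrame pipeline for Spark
```

The point of the sketch is the division of labor the article describes: the person writing `mapping` never touches Java or Python, while each generator owns the engine-specific syntax.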

Chelsea Kerwin, April 22, 2015

Stephen E Arnold, Publisher of CyberOSINT at

Yahoo News Off the Rails

April 21, 2015

The article titled Purple Reign on The Baffler tells the story of the derailment of Yahoo News. The author, Chris Lehmann, exerts all of his rhetorical powers in recounting his time as a Yahoo News editor, a position he took after being downsized from a more reputable publication, along with any number of other journalists and editors. The main draw was that Yahoo News was one of the few news organizations that was not bankrupt. In spite of being able to produce some high-caliber news, writers and editors at Yahoo were up against a massive bureaucracy that at its best didn’t understand the news and at its worst didn’t trust the news. For example, the author relays the story of one piece he posted on militia tactics of ambushing police by deliberately breaking the law:

“Before the post went live, I fielded an anxious phone call from a senior manager in Santa Monica. He was alarmed… for a simple reason: ‘I haven’t heard of this before.’ I struggled to find a diplomatic way to explain that publishing things that readers hadn’t heard before was something that a news organization should be doing a whole lot more of: it was, in fact, the definition of ‘news.’”

One of the saddest aspects of the corporate-controlled news operation was the attempt to harness the power of the traffic on Yahoo’s site by making all internet users reporters. As is obvious to anyone who has ever read a comment section online, web users range from the rational to the bizarrely enraged to the racist/sexist/horrifying. Not long after this Ask America initiative tanked, Lehmann’s job description was “overhauled” and he resigned.

Chelsea Kerwin, April 21, 2015

Stephen E Arnold, Publisher of CyberOSINT at

What is the Depth of Deep Linking?

April 21, 2015

On the Back Channel blog, an article called “Will Deep Links Ever Truly Be Deep?” discusses the hot topic of apps trying to forge “deep” connections with each other by linking directly to one another, rather than the fragmented jumping between apps users have to suffer through.  The article points out that this is not a new trend; in fact, it has been going on since the 1990s (did they even know what an app was back then?).  In the 1990s, deep links dealt with hopping from one Web site to another.  It makes the astute observation that, as users, we leave behind data mined by service providers for a profit and that our digital floundering could be improved.

“Chris Maddern is cofounder of Button, one of several companies that have set out to make deep links work in the land of apps, and he talks with rapid precision about the sorry state of mobile interoperability today.”

“‘Right now it’s no secret that the Internet’s paid for basically by big companies buying tiny time-slices of your eyeballs against your will,’ Maddern says. Button wants to change that by ‘capturing users’ intent.’ For instance, you’re reading a New York Times travel story about Barcelona. You want to book an Airbnb there pronto. On your phone, you’d have to exit your New York Times app, then start up your Airbnb app and search for Barcelona in it. In a Web browser, you could have clicked straight through from one site to the other—and landed directly on a page of Barcelona listings.”

It goes on to discuss the history of deep links, the value of our information, and how mobile apps are trying to create the seamless experience we have in a regular browser.  The problem, however, appears to be that app developers and major companies do not want to play nicely together, so we have a fragmented experience.  The bigger issue at hand is the competition!  Developers claim they are building the deep links described in the article, but they are not.  App use is more about profit than improving content value.
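Mechanically, an app deep link is just a URI: the app registers a scheme, and the link carries the target screen plus parameters, which is what lets one app hand off to a specific page in another instead of the other app’s front door. A minimal sketch of parsing one; the `airbnb://` scheme and its parameter names are hypothetical, loosely mirroring the article’s Barcelona example:

```python
from urllib.parse import urlparse, parse_qs

def parse_deep_link(uri):
    """Split an app deep link into the owning app, target screen,
    and query parameters (taking the first value of each)."""
    parts = urlparse(uri)
    params = {k: v[0] for k, v in parse_qs(parts.query).items()}
    return {"app": parts.scheme, "screen": parts.netloc, "params": params}

# Hypothetical link a publisher's app might hand to a booking app:
link = parse_deep_link("airbnb://search?location=Barcelona&guests=2")
# -> {'app': 'airbnb', 'screen': 'search',
#     'params': {'location': 'Barcelona', 'guests': '2'}}
```

The fragmentation the article complains about is exactly that these schemes and parameter vocabularies are private to each app, so there is no universal equivalent of the Web’s `https://` link.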

Whitney Grace, April 21, 2015
Stephen E Arnold, Publisher of CyberOSINT at

Expert System Webinar: Sharepoint and Semantics Add Value for Users

April 20, 2015

Expert System offers a system capable of turbo-charging information access in SharePoint installations. The company has developed a fact-based webinar to demonstrate the power of Expert System’s semantic technology.

The company’s Cogito Connected for SharePoint features a document library, complete with metadata enrichment that increases the visibility of files and of their content. The library is retained in SharePoint and available for use by other files, and the accurate time and date of the most recent tagging is captured for each file. Users can also process multiple attachments in the Document List, and the search function is enhanced with fully integrated Web components.

With Cogito, users can locate content via a custom taxonomy, entities, or faceted search options. SharePoint users can locate needed information via point-and-click, eDiscovery, and traditional keyword search enriched with organization-specific metadata. Expert System’s Cogito allows users to browse content organized by topics, people, and concepts, which makes SharePoint more useful to a busy professional.
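The mechanics behind faceted search are simple to picture: every document carries metadata tags, and each facet the user clicks narrows the result set to documents matching that value. A minimal sketch of the idea; the document set and facet names below are invented, not Cogito’s or SharePoint’s actual data model:

```python
def faceted_search(docs, **facets):
    """Return the docs whose metadata matches every selected facet value."""
    return [
        d for d in docs
        if all(d["meta"].get(facet) == value for facet, value in facets.items())
    ]

# Hypothetical SharePoint items with Cogito-style semantic tags:
docs = [
    {"title": "Q1 report", "meta": {"topic": "finance", "entity": "Acme"}},
    {"title": "Q2 report", "meta": {"topic": "finance", "entity": "Globex"}},
    {"title": "Press kit", "meta": {"topic": "marketing", "entity": "Acme"}},
]

# Clicking topic=finance, then entity=Acme, progressively narrows results:
hits = faceted_search(docs, topic="finance", entity="Acme")
# -> one hit: the Q1 report
```

This also shows why consistent, automated tagging matters so much: a document with a missing or misspelled tag simply never surfaces under the facet where users expect it.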

SharePoint is one of the most popular collaborative content platforms for enterprise systems, but like many proprietary software programs it has its limits. The good news is that companies like Expert System discover SharePoint’s weaknesses and create solutions to fix them.

Using its patented technology Cogito, Expert System addresses one of the main user concerns when looking for information housed in SharePoint. Cogito sharply reduces the difficulty of navigating and locating content in SharePoint. This problem stems from creators improperly tagging content or not tagging it at all.

In an exclusive interview, Maurizio Mencarini of Expert System had this to say about Expert System’s Cogito Connected for SharePoint:

“Cogito Connected for SharePoint addresses these two areas by providing the power of Cogito semantics to the application of consistent, automated tagging of SharePoint content. With the addition of fully integrated web parts that expose the granularity of content generated metadata, Cogito enhanced SharePoint optimizes the management of content for the SharePoint administrator. For the user, Cogito Connected for SharePoint significantly improves the SharePoint search experience by enhancing the search capabilities beyond the list to include faceted search including category, entity and topic.”

Expert System’s solution delivers a better SharePoint experience for the user and improves productivity for employees, since they are able to locate information more quickly. Expert System knows what many users don’t realize: the value of being able to locate and recognize content quickly. In this case, Expert System applied this knowledge to SharePoint, but it can be used for other programs in any field. On April 28, 2015, from 12:00 PM to 1:00 PM Eastern Time, Expert System will host a free webinar called “Implementing a Better Search Experience,” where attendees will “learn how to make SharePoint more than a place where you put documents and start transforming your collected knowledge in your collective knowledge.”

Expert System was founded in 1989 and its flagship product is Cogito. Solutions based on the Cogito software include semantic search, natural language search, text analytics, development and management of taxonomies and ontologies, automatic categorization, extraction of data and metadata, and natural language processing. Expert System is working on exciting new developments on everything from enterprise systems to security and intelligence.

Expert System wants to share its knowledge with users so they can have a better user experience, apply the knowledge to other areas, and, of course, make daily tasks simpler.

The new “Implementing a Better Search Experience” webinar will be offered on April 28, 2015, from 12 to 1 pm Eastern Time. You will learn how you can transform your organization’s collected knowledge into actionable collective knowledge.

Sign up for the April webinar at

Stephen E Arnold, April 20, 2015

Google Has a Problem: A Monopoly on Data, Not Traffic

April 20, 2015

Leave it to the complainers in the UK to accuse Google of having a monopoly on data. Navigate to “Google Dominates Search. But the Real Problem Is Its Monopoly on Data.” Note that there are some outfits in the UK which have quite a bit of data too. The difference is that Google appears to be free, and the UK outfit is sort of out of the spotlight.

The write up jumps off from the allegations under consideration by the European Commission about Google’s search results. The write up states:

Were Google a manufacturer, say, a monopoly such as it has over internet search would never be allowed. But three factors conspire to Google’s advantage. Firstly, digital services, however ubiquitous, seem less tangible and therefore do not appear so obvious a threat to commercial pluralism, innovation and to consumer interests.

Okay, no monopolies allowed. No kilt wool combines. No champagne controls in quirky France. No centralization of Mercedes Benz parts. I understand.

To its credit, the Guardian points out that an alternative to Google is just a click away. The reality is different. Ask a shrink about habits. I highlighted this paragraph:

The wider problem is that Google has become the ultimate monopolist of the information age. Information is a source of power, and nothing in the EU’s case does anything significant to touch that power.

Good point. So isn’t the war over? Research that question in Qwant.

Stephen E Arnold, April 20, 2015

Improving the Preservica Preservation Process

April 17, 2015

Preservica is a leading program for use in digital preservation, consulting, and research, and now it is compatible with Microsoft SharePoint.  ECM Connection has the scoop on the “New Version Of Preservica Aligns Records Management And Digital Preservation.”  The upgrade to Preservica will allow SharePoint managers to preserve content from SharePoint as well as Microsoft Outlook, a necessary task as most companies these days rely on the Internet for business and need to archive transactions.

Preservica wants to become a bigger part of enterprise system strategies such as enterprise content management and information governance.  One of its big selling points is that Preservica will archive information and keep it in a usable format, as obsolescence becomes a bigger problem while technology advances.

“Jon Tilbury, CEO Preservica adds: ‘The growing volume and diversity of digital content and records along with rapid technology and IT refresh rates is fuelling the need for Records and Compliance managers to properly safe-guard their long-term and permanent digital records by incorporating Digital Preservation into their overall information governance lifecycle. The developing consensus is that organizations should consider digital preservation from the outset – especially if they hold important digital records for more than 10 years or already have records that are older than 10 years. Our vision is to make this a pluggable technology so it can be quickly and seamlessly integrated into the corporate information landscape.’ ”

Digital preservation in a compliant format is one of the most overlooked problems companies deal with.  They may have stored their records on a storage device, but if they do not retain the technology to access them, then the records are useless.  Keeping files in a readable format not only keeps them useful, but it also makes life easier for the employee who has to recall them.
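The planning step behind “keep it in a usable format” can be sketched concretely: scan the holdings, flag files whose format is on an at-risk list, and name an archival target for each. The format list and targets below are invented for illustration only; they are not Preservica’s actual format registry or migration rules:

```python
# Illustrative at-risk formats and archival targets (hypothetical,
# chosen only to show the planning step, not Preservica's behavior):
MIGRATION_PLAN = {
    ".doc": ".pdf",   # legacy Word -> PDF-style archival copy
    ".wpd": ".pdf",   # WordPerfect
    ".bmp": ".tiff",  # uncompressed bitmap -> archival TIFF
}

def migration_actions(filenames):
    """List (source, target) pairs for files that need format migration."""
    actions = []
    for name in filenames:
        stem, dot, ext = name.rpartition(".")
        target = MIGRATION_PLAN.get(dot + ext)
        if target:
            actions.append((name, stem + target))
    return actions

# Files already in a durable format pass through untouched:
plan = migration_actions(["memo.doc", "logo.bmp", "report.pdf"])
# -> [('memo.doc', 'memo.pdf'), ('logo.bmp', 'logo.tiff')]
```

A real preservation system would identify formats by content signatures rather than file extensions, but the lifecycle idea is the same: detect at-risk records before the software that reads them disappears.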

Whitney Grace, April 17, 2015
Stephen E Arnold, Publisher of CyberOSINT at

NSF Makes Plan for Public Access to Scientific Research

April 16, 2015

The press release from the National Science Foundation titled National Science Foundation Announces Plan for Comprehensive Public Access to Research Results speaks to the NSF’s interest in increasing communication about federally funded research. The NSF is an independent federal agency with a seven-billion-dollar annual budget that is dispersed around the country in the form of grants to fund research and education in science and engineering. The article states,

“Scientific progress depends on the responsible communication of research findings,” said NSF Director France A. Córdova…Today’s announcement follows a request from the White House Office of Science and Technology Policy last year, directing science-funding agencies to develop plans to increase access to the results of federally funded research. NSF submitted its proposal to improve the management of digital data and received approval to implement the plan.”

The plan is called Today’s Data, Tomorrow’s Discoveries and promotes the importance of science without creating an undue burden on scientists. All manuscripts that appear in peer-reviewed scholarly journals and the like will be made available for free download within a year of the initial publication. In a time when scientists are less trusted and science itself is deeply misunderstood, public access may be more important than ever.


Chelsea Kerwin, April 16, 2015

Stephen E Arnold, Publisher of CyberOSINT at
