Photo Farming in the Early Days

November 9, 2015

Have you ever wondered what your town looked like while it was still rural and used as farmland?  You no longer have to visit your local historical society or library (although we do encourage you to do so): the photographs of the United States Farm Security Administration and Office of War Information (FSA-OWI for short) are available through Photogrammer.  Photogrammer is a Web-based image platform for organizing, viewing, and searching farm photos from 1935-1945.

Photogrammer uses an interactive map of the United States, where users can click on a state and then a city or county within it to see the photos from the timeline.  The archive contains over 170,000 photos, but only 90,000 have a geographic classification.  The photos have also been grouped by photographer, although this grouping is limited to fifteen people.  Other than city, photographer, year, and month, the collection can be sorted by collection tags and lot numbers (although these are not discussed in much detail).

While farm photographs from 1935-1945 do not appear to need their own photographic database, the collection’s history is interesting:

“In order to build support for and justify government programs, the Historical Section set out to document America, often at her most vulnerable, and the successful administration of relief service. The Farm Security Administration—Office of War Information (FSA-OWI) produced some of the most iconic images of the Great Depression and World War II and included photographers such as Dorothea Lange, Walker Evans, and Arthur Rothstein who shaped the visual culture of the era both in its moment and in American memory. Unit photographers were sent across the country. The negatives were sent to Washington, DC. The growing collection came to be known as “The File.” With the United State’s entry into WWII, the unit moved into the Office of War Information and the collection became known as the FSA-OWI File.”

The photos do have historical importance. Still, rather than maintaining a separate database with its small flaws as a pet project, it would be more useful if the collection were incorporated into a larger historical archive, like the Library of Congress.

Whitney Grace, November 9, 2015

Sponsored by, publisher of the CyberOSINT monograph

ZL Technologies: From Ziplip to Enterprise Search

November 5, 2015

Ziplip opened for business in 1999. That works out to 16 years ago. I looked at the company’s archiving technology when I did a comparison between Ziplip and Index Engines, an outfit which has some tendrils originating at the post Judge Greene Bell Labs.

I took another look at Ziplip, repositioned as ZL Technologies, in 2009. ZL had a bone to pick with a mid tier consulting firm. Complaining about mid tier consulting firms, their approach to analysis, and their business models is a game some vendors play. The vendor believes it should be highly rated, but the vendor gets low marks. Aggrieved, the vendor complains about the mid tier consulting firm.

I thought about ZL when I read this item, “ZL Technologies to Establish the ROI of Information Governance at Enterprise Search and Discovery Conference 2015.” What I found interesting is that Ziplip has allegedly solved a problem which has given headaches to licensees of search and content processing systems; namely, laying out a method for calculating “true ROI.” I assume that regular MBA ROI is not going to do the job. Hence, we have the “true value” angle.

This paragraph caught my attention as well:

For enterprise-scale organizations, the difficulty in calculating true numerical ROI for data management initiatives has traditionally posed a major roadblock to planning and securing funding for governance architecture. This has been especially true for firms driven by quarterly performance; given specific requirements and constrained budget, this has often resulted in an ad hoc “point solution” approach, spawning multiple data silos and paradoxically increasing the overall long-term cost of information governance. The session hosted by ZL Technologies takes a strategic approach to calculating true ROI, examining oft-neglected factors and broad interdepartmental benefits of holistic governance practices.

I am old fashioned and think that ROI can be a slippery fish. Here’s a basic definition:

A profitability measure that evaluates the performance of a business by dividing net profit by net worth.

The key seems to be how one captures cost, converts the fuzzy notions into more numbers, and then uses a mathematical procedure baked into Excel. Hey, it’s not perfect, but it is close enough for horseshoes.

The key, of course, is the assumptions for the calculation, the process for capturing and verifying the data, and the methodology to pin down the “worth” and “value” generalities. In short, spending money on search requires that a wide range of direct and indirect costs be captured, that diligence ensure downstream costs are collected, and that the assumptions line up with the numerical recipe.
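
To make the point about assumptions concrete, here is a minimal sketch of the basic definition quoted above (net profit divided by net worth). The field names and every figure are made-up assumptions for illustration; nothing here comes from ZL Technologies or its session.

```python
from dataclasses import dataclass


@dataclass
class RoiInputs:
    """Hypothetical inputs; all field names are illustrative assumptions."""
    benefit: float           # money attributed to the search project
    direct_costs: float      # e.g., licenses and hardware
    indirect_costs: float    # e.g., staff time and training
    downstream_costs: float  # e.g., maintenance after rollout
    net_worth: float         # denominator in the quoted definition


def net_profit(x: RoiInputs) -> float:
    # Net profit is the benefit minus every captured cost category.
    return x.benefit - (x.direct_costs + x.indirect_costs + x.downstream_costs)


def roi(x: RoiInputs) -> float:
    # ROI per the quoted definition: net profit divided by net worth.
    if x.net_worth == 0:
        raise ValueError("net worth must be non-zero")
    return net_profit(x) / x.net_worth


# Same project, two sets of assumptions (all figures invented).
all_costs = RoiInputs(500_000, 200_000, 100_000, 50_000, 1_000_000)
direct_only = RoiInputs(500_000, 200_000, 0, 0, 1_000_000)

print(f"all costs captured: {roi(all_costs):.2f}")    # 0.15
print(f"direct costs only:  {roi(direct_only):.2f}")  # 0.30
```

Leaving out the indirect and downstream costs doubles the result in this invented case, which is why the assumptions matter more than the arithmetic.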

What has baffled me about ZLTech’s approach is that the approach is based on “information governance.” I don’t know what that means. Furthermore, I am not sure how an archive converts to enterprise search. What happens to the social media, the videos, and the images?

My hunch is that ZL is mounting a marketing campaign and using as many buzzwords as possible. Will MBA classes embrace the ZL approach to “true worth”?

Nope. After 16 years, a revolutionary value method has had plenty of time to filter into the mainstream of ROI methodology.

Stephen E Arnold, November 5, 2015

Google to the French: Wrong to Be Forgotten

July 31, 2015

I read “Google Says Non to French Demand to Expand Right to Be Forgotten Worldwide.” When third parties want the GOOG to do something, those suggestions face headwinds. It is okay for the Google to terminate unused Gmail accounts. It is okay for the Google to nuke APIs. It is okay for the Google to deliver “relevant” results which are beyond the statistical embrace of precision and recall analyses.

But when a third party wants to be forgotten? According to the write up from the increasingly anti Google folks in the UK, I learned:

Google has rejected the French data protection authority’s demand that it censor search results worldwide in order to comply with the European Court of Justice’s so-called right to be forgotten ruling. The company’s rejection of the ruling could see its French subsidiary facing daily fines, although no explicit sanction has yet been declared.

The write up also reminded me of Google’s official view of third party requests to be forgotten:

In a blog post, Peter Fleischer, Google’s Global Privacy Counsel, said: “We believe this order is disproportionate and unnecessary, given that the overwhelming majority of French internet users – currently around 97% – access a European version of Google’s search engine like, rather than or any other version of Google.” Additionally, Fleischer added, the company is concerned that complying with the French courts could potentially set a precedent that one country’s laws can control access to content globally.

My hunch is that Google wants its policies and procedures applied globally. Google has suggested that some nation states alter their behavior to better mesh with the Googley universe.

Standing by for more Google vs. France dust ups.

Stephen E Arnold, July 31, 2015

SharePoint Gets Serious with Information Governance

March 19, 2015

SharePoint has enjoyed continued success over the last 15 years, but it has not been without some bumps along the way. Information governance is one of the noted areas in which SharePoint has fallen flat. Read more in the CMS Wire article, “Keeping SharePoint In Check with Information Governance.”

The article begins:

“Historically, SharePoint was thought to cause as many information governance problems as it solved. The 2001 to 2003 versions did not show Microsoft putting much effort into helping customers with information governance. But after the massive take up of SharePoint Portal Server 2007 licenses, and the often negative conversations coming out of the sizable SharePoint user community, Microsoft started to take governance issues seriously.”

In addition to keeping an eye on your news feed for the latest SharePoint buzz, staying tuned to experts in the field is a great way to save time and get pointed information pertaining to improving a SharePoint installation. Stephen E. Arnold has one such SharePoint feed on his Web site. Focusing on tips, tricks, and news, Arnold collates much of the content that users and managers alike will find helpful for navigating day-to-day SharePoint operations.

Emily Rae Aldridge, March 19, 2015

Hewlett Packard: The Dodge Em Method

February 27, 2015

I read an interesting article called “Hewlett Packard Tries to Duck Investors with Virtual Meeting.” I thought it was hip to do meetings via Skype and the plug in on my Web site. Guess not.

The write up makes a point that I don’t consider when firing up the tele-meeting software. Here’s the passage I noted:

Hewlett Packard’s recent decision to ditch its annual shareholder meeting in favor of a virtual one is just bad corporate governance. The forum gives ordinary shareholders their one chance each year to directly question and even confront the CEO and board of directors. And when it comes to HP, investors should be asking plenty of questions.

Ah, corporate governance. I thought this was an area reserved for Wharton business school instructors. You know, Wharton, one of the fonts of management perspicuity for eager consultants and CEOs to be. (I wonder what “governance” means: Good decision making, prudent use of financial resources, innovating, generating sustainable revenue?)

The article points out the MBA type reasoning that HP management seems to be using. There’s a reference to the Autonomy flap, cost savings, and, of course, the somewhat lackluster financial performance.

I don’t agree with this statement:

Under Whitman, a former eBay CEO, HP has stabilized.

Like IBM, these large “information technology” companies are a bit like a whale stuck in a small bay. Everyone arrives to help, but in most cases, there is not much to be done. A confused whale is pretty much a challenge for everyone involved. When a whale thrashes before its death, I want to be standing well away from the creature.

I suppose that’s why I am confused about what HP is doing with the Autonomy technology. Some of the zeros and ones date from the mid 1990s. I don’t drive a 25 year old automobile. HP apparently plans to sell some.

Stephen E Arnold, February 27, 2015

Information Governance Standards Group Suggests Caution in Approaching eDiscovery

November 14, 2014

The records management group ARMA International weighs in about search with an article in their Information Management magazine: “Enterprise Search vs E-Discovery Search: Same or Different?” The short answer, not surprisingly, is “different.” Writer Kamal Shah explains:

“To date, most enterprises have used the same search technologies for both tasks. However, a recent trend among large and small enterprises suggests that a significant divergence is occurring between enterprise searches and e-discovery searches. Both start by entering a search term in a search box, but that’s where the similarities end. The business requirements are different and, as a result, each needs different capabilities.”

The article goes on to elaborate on the reasons traditional enterprise search is not sufficient for most eDiscovery needs. For example, while a regular enterprise user may be looking for the top five or 10 documents that relate to a search term, a firm performing an eDiscovery search in response to litigation must turn up all relevant documents (while minimizing irrelevant clutter). Users of eDiscovery must also be prepared to prove in court that they followed best practices in assembling their data. Shah summarizes:

“Conducting e-discovery for litigation or an investigation using enterprise search technology is a risky gamble that can result in negative outcomes in court, penalties, and excessive litigation costs.”

See the article for more details, but the upshot is clear: eDiscovery is an environment where it is becoming increasingly crucial to use the right tool for the data-digging job.
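
The gap between "the top five or 10 documents" and "all relevant documents" is the familiar gap between precision and recall. The sketch below is only an illustration under invented data: the document IDs, the ranking, and the relevance judgments are all assumptions, not anything taken from the ARMA article.

```python
def precision_recall(retrieved: list[str], relevant: set[str]) -> tuple[float, float]:
    """Return (precision, recall) for a list of retrieved document IDs."""
    hits = sum(1 for doc in retrieved if doc in relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical ranked result list and hypothetical relevance judgments.
ranked = ["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8", "d9", "d10"]
relevant = {"d1", "d2", "d4", "d7", "d9", "d10"}

# An enterprise user glances at the first few hits.
top3_precision, top3_recall = precision_recall(ranked[:3], relevant)

# An eDiscovery team has to account for the whole set.
full_precision, full_recall = precision_recall(ranked, relevant)

print(f"top 3: precision={top3_precision:.2f} recall={top3_recall:.2f}")
print(f"all:   precision={full_precision:.2f} recall={full_recall:.2f}")
```

In this made-up case the first three hits look respectable to a casual user but cover only a third of the relevant documents, and that shortfall is the one a litigant would have to explain in court.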

Cynthia Murrell, November 14, 2014

Government Web Site Reliability

August 21, 2014

I read “IT Outages Are an Ongoing Problem for the US Government.” If the information is accurate, I am surprised. The article reports:

When outages occur, 48% of the workers said they do what they can via telephone, while 33% use personal devices and another 24% try to find a workaround, such a Google Apps. When asked to grade their IT department, only 15% of the field workers gave it an “A”; 49% gave it a “B”; and 27% gave it a “C.” When asked what caused the most recent outages, the IT professionals said 45% were due to a network or server outage; 20% cited Internet connectivity loss; 13% blamed natural disaster; 7% said a specific application stopped working, and 6% pointed to human error.

With the new push to improve government Web sites, perhaps the core infrastructure needs attention as well? Is it possible that good enough is comparable to the US broadband capability, the educational system, or airline on time performance? And search results? Nah,’s search results are good enough for some.

Stephen E Arnold, August 21, 2014

SharePoint Information Governance Concerns

April 23, 2014

Most users of SharePoint know about the struggles and concerns of governance. CMS Wire covers the issue in their article, “The SharePoint Information Governance Problem.”

Speaking to those experienced with using SharePoint as a document management platform, the article begins:

“You’re also likely familiar with the negative impacts that typically result from using SharePoint ineffectively: a proliferation of sites, often on a proliferation of SharePoint versions, with no clear standards on what documents should (and shouldn’t) be stored there or how, no clear guidelines for users on how to classify their documents, little to no capabilities for promoting effective information lifecycle management, little to no end user governance or oversight for things like site and document library structures, security and access settings, or document hygiene, and dozens, hundreds or even thousands of orphaned sites that, taken together, represent a digital landfill of staggering proportions.”

The article then goes on to assert that most of these issues are due to SharePoint’s lack of ease of use. This is a topic that Stephen E. Arnold often covers on his information site. Specializing in all forms of search, Arnold has a lifetime of experience. Tune in to his SharePoint feed for tips and tricks on increasing ease of use.

Emily Rae Aldridge, April 23, 2014

Avoiding SharePoint Governance Mistakes

April 9, 2014

SharePoint governance is a big topic for most organizations. A panel of experts from Avanade, HiSoftware, Portal Solutions and Metalogix tackled the issue in a recent webinar. CMS Wire gives all the details in their article, “How to Avoid SharePoint Governance Mistakes.”

The author writes:

“If you’re wondering what your SharePoint governance plan should look like, look around you. It should probably look a lot like your organization.

There’s no such thing as a one-size-fits-all approach, even if you’re in an highly regulated industry like healthcare of financial services that imposes strict regulations on information sharing.”

Stephen E. Arnold knows all too well the difficulty surrounding SharePoint governance. He is a longtime search expert, and often covers SharePoint issues on his Web site. Webinars, training, and services like are important resources for enterprise managers as they seek to balance the needs of their organization.

Emily Rae Aldridge, April 9, 2014

SharePoint Governance Woes and Solutions

February 5, 2014

SharePoint governance is an important aspect that’s often overlooked. And while developing a plan on the front end is often hard and requires a lot of time, having something well thought out in place is invaluable. Read more in the Search Content Management article, “SharePoint Governance Plans Doomed Without Business Buy-in.”

The article begins:

“SharePoint implementers are often stymied when attempting to bring discipline to business processes so that SharePoint can be an effective tool in the first place. It’s about bringing order to chaos, noted Sue Hanley, founder and president of Susan Hanley LLC, in her session on developing SharePoint governance plans at SPTechCon 2013 this week. Failure to design information governance into SharePoint implementation plans can lead to deployments that resemble the ‘Wild, Wild West,’ Hanley said.”

Stephen E. Arnold is a longtime leader in search and the man behind He devotes a lot of attention to SharePoint, and governance is a frequent topic. If a plan is not in place, extensive customization and fancy training will not do an organization any good. It all begins and ends with a smart plan.

Emily Rae Aldridge, February 5, 2014
