CyberOSINT banner

Amazon Updates Sneaker Net

October 8, 2015

I remember creating a document, copying the file to a floppy, and then walking up one flight of steps to give the floppy to my boss. He took the floppy, loaded it into his computer, and made changes. A short time later he would walk down one flight of steps, hand me the floppy with his file on it, and I would review the changes.

I thought this was the cat’s pajamas for two reasons:

  1. I did not have to use ledger paper, sharpen a pencil, and cramp my fingers
  2. Multiple copies existed so I no longer had to panic when I spilled my Fresca across my desk.

Based on the baloney I read every day about the super wonderful high speed, real time cloud technology, I was shocked when I read “Snowball’s Chance in Hell? Amazon Just Launched a Physical Data Transfer Service.” The news struck me as more important than the yap and yammer about Amazon disrupting cloud business and adding partners.

Here’s the main point I highlighted in pragmatic black:

A Snowball device is ordered through the AWS Management Console and is delivered to site within a few days; customers can order multiple devices and devices can be run in parallel. Described as coming in its “own shipping container” (it doesn’t require packing or unpacking) the Snowball is entirely self-contained, complete with 110 Volt power, a 10 GB network connection on the back and an E Ink display/control panel on the front. Once received it’s simply a matter of plugging the device in, connecting it to a network, configuring the IP address, and installing the AWS Snowball client; a job manifest and 25 character unlock code complete the task. When the transfer of data is complete the device is disconnected and a shipping label will automatically appear on the E Ink display; once shipped back to Amazon (currently only the Oregon data center is supporting the service, with others to follow) the data will be decrypted and copied to S3 bucket(s) as specified by the customer.

There you go. Sneaker net updated with FedEx, UPS, or another shipping service. Definitely better than carrying an appliance up and down stairs. I was hoping that individuals participating in the Mechanical Turk system would be available to pick up an appliance and deliver it to the Amazon customer and then return the gizmo to Amazon. If Amazon can do Etsy-type stuff, it can certainly do Uber-type functions, right?

When will the future arrive? No word on how the appliance will interact with Amazon’s outstanding search system. I wish I knew how to NOT out unpublished books or locate mysteries by Japanese authors available in English. Hey, there is a sneaker net. Focus on the important innovations.

Stephen E Arnold, October 8, 2015 The Search for Functional Management via Training

September 21, 2015

I read “How Botched $600 Million worth of Contracts.” My initial reaction was that the $600 million figure understated the fully loaded costs of the Web site. I have zero evidence about my view that $600 million was the incorrect total. I do have a tiny bit of experience in US government project work, including assignments to look into accounting methods in procurements.

The write up explains that a an audit by the Office of the Health and Human Services office of Inspector General identified the root causes of the alleged $600 million Web site. The source document was online when I checked on September 21, 2015, at this link. If you want this document, I suggest you download it. Some US government links become broken when maintenance, interns, new contractors, or site redesigns are implemented.

The news story, which is the hook for this blog post, does a good job of pulling out some of the data from the IG’s report; for example, a list of “big contractors behind” The list contains few surprises. Many of the names of companies were familiar to me, including that of Booz, Allen, where I once labored on a range of projects. There are references to additional fees from scope changes. I am confident, gentle reader, that you are familiar with scope creep. The idea is that the client, in the case of, needed to modify the tasks in the statement of work which underpins the contracts issued to the firms which perform the work. The government method is to rely on contractors for heavy lifting. The government professionals handle oversight, make certain the acquisition guidelines are observed, and plug assorted types of data into various US government back office systems.

The news story repeated the conclusion of the IG’s report that better training was need to make the type of project work better in the future.

My thoughts are that the news story ignored several important factors which in my experience provided the laboratory in which this online commerce experiment evolved.

First, the notion of a person in charge is not one that I encountered too often in my brushes with the US government. Many individuals change jobs, rotating from assignment to assignment, so newcomers are often involved after a train has left the station. In this type of staffing environment, the enthusiasm for digging deep and re-rigging the ship is modest or secondary to other tasks such as working on budgets for the next fiscal year, getting involved in new projects, or keeping up with the meetings which comprise the bulk of a professional’s work time. In short, decisions are not informed by a single individual with a desire to accept responsibility for a project. The ship sails on, moved by the winds of decisions by those with different views of the project. The direction emerges.

Second, the budget mechanisms are darned interesting. Money cannot be spent until the project is approved and the estimated funds are actually transferred to an account which can be used to pay a contractor. The process requires that individuals who may have never worked on a similar project create a team which involves various consultants, White House fellows, newly appointed administrators, procurement specialists with law degrees, or other professionals to figure out what is going to be done, how, what time will be allocated and converted to estimates of cost, and the other arcana of a statement of work. The firms who make a living converting statements of work into proposals to do the actual work. At this point, the disconnect between the group which defined the SOW and the firms bidding on the work becomes the vendor selection process. I will not explore vendor selection, an interesting topic outside the scope of this blog post. Vendors are selected and contracts written. Remember that the estimates, the timelines, and the functionality now have to be converted into the site or the F-35 aircraft or some other deliverable. What happens if the SOW does not match reality? The answer is a non functioning version of The cause, gentle reader, is not training.

Third, the vendors, bless their billable hearts, now have to take the contract which spells out exactly what the particular vendor is to do and then actually do it. What happens if the SOW gets the order of tasks wrong in terms of timing? The vendors do the best they can. Vendors document what they do, submit invoices, and attend meetings. When multiple vendors are involved, the meetings with oversight professionals are not the places to speak in plain English about the craziness of the requirements or the tasks specified in the contract. The vendors do their work to the best of their ability. When the time comes for different components to be hooked together, the parts usually require some tweaking. Think rework. Scope change required. When the go live date arrives, the vendors flip the switches for their part of the project and individuals try to use the system. When these systems do not work, the problem is a severe one. Once again: training is not the problem. The root cause is that the fundamental assumptions about a project were flawed from the git go.

Is there a fix? In the case of, there was. The problem was solved by creating the equivalent of a technical SWAT team, working in a very flexible manner with procurement requirements, and allocating money without the often uninformed assumptions baked into a routine procurement.

Did the fix cost money? Yes, do I know how much? No. My hunch is that there is zero appetite in the US government, at a “real” news service, a watchdog entity, or an in house accountant to figure out the total spent for Why do I know this? The accounting systems in use by most government entities are not designed to roll up direct and indirect costs with a mouse click. Costs are scattered and methods of payment pretty darned crazy.

Net net: Folks can train all day long. If that training focuses on systems and methods which are disconnected from the deliverable, the result is inefficiency, a lack of accountability, and misdirection from the root cause of a problem.

I have been involved in various ways with government work in the US since the early 1970s. One thing remains consistent: The foundational activities are uneven. Will the procurement process change? Forty years ago I used to think that the system would evolve. I was wrong.

Stephen E Arnold, September 21, 2015

Lexmark Chases New Revenue: Printers to DTM

September 11, 2015

I know what a printer is. The machine accepts instructions and, if the paper does not jam, outputs something I can read. Magic.

I find it interesting to contemplate my printers and visualize them as an enterprise content management system. Years ago, my team and I had to work on a project in the late 1990s involving a Xerox DocuTech scanner and printer. The idea was that the scanner would convert a paper document to an image with many digital features. Great idea, but the scanner gizmo was not talking to the printer thing. We got them working and shipped the software, the machines, and an invoice to the client. Happy day. We were paid.

The gap between that vision from a Xerox unit and the reality of the hardware was significant. But many companies have stepped forward to convert knowledge resident systems relying on experienced middle managers to hollowed out outfits trying to rely on software. My recollection is that Fulcrum Technologies nosed into this thorn bush with DOCSFulcrum a decade before the DocuTech was delivered by a big truck to my office. And, not to forget our friends to the East, the French have had a commitment to this approach to information access. Today, one can tap Polyspot or Sinequa for business process centric methods.

The question is, “Which of these outfits is making enough money to beat the dozens of outfits running with the other bulls in digital content processing land?” (My bet is on the completely different animals described in my new study CyberOSINT: Next Generation Information Access.)

Years later I spoke with an outfit called Brainware. The company was a reinvention of an earlier firm, which I think was called SER or something like that. Brainware’s idea was that its system could process text which could be scanned or in a common file format. The index allowed a user to locate text matching a query. Instead of looking for words, Brainware system used trigrams (sequences of three letters) to locate similar content.

Similar to the Xerox idea. The idea is not a new one.

I read two write ups about Lexmark, which used to be part of IBM. Lexmark is just down the dirt road from me in Lexington, Kentucky. Its financial health is a matter of interest for some folks in there here parts.

The first write up was “How Lexmark Evolved into an Enterprise Content Management Contender.” The main idea pivots on my knowing what content management is. I am not sure what this buzzword embraces. I do know that organizations have minimal ability to manage the digital information produced by employees and contractors. I also know that most organizations struggle with what their employees do with social media. Toss in the penchant units of a company have for creating information silos, and most companies look for silver bullets which may solve a specific problem in the firm’s legal department but leave many other content issues flapping in the wind.

According to the write up:

Lexmark is "moving from being a hardware provider to a broader provider of higher-value solutions, which are hardware, software and services," Rooke [a Lexmark senor manager] said.

Easy to say. The firm’s financial reports suggest that Lexmark faces some challenges. Google’s financial chart for the outfit displays declining revenues and profits:


The Brainware, ISYS Search Software, and Kofax units have not been able to provide the revenue boost I expected Lexmark to report. HP and IBM, which have somewhat similar strategies for their content processing units, have also struggled. My thought is that it may be more difficult for companies which once were good at manufacturing fungible devices to generate massive streams of new revenue from fuzzy stuff like software.

The write up does not have a hint of the urgency and difficulty of the Lexmark task. I learned from the article:

Lexmark is its own "first customer" to ensure that its technologies actually deliver on the capabilities and efficiency gains promoted by the company, Moody [Lexmark senior manager] said. To date, the company has been able to digitize and automate incoming data by at least 90 percent, contributing to cost reductions of 25 percent and a savings of $100 million, he reported. Cost savings aside, Lexmark wants to help CIOs better and more efficiently incorporate unstructured data from emails, scanned documents and a variety of other sources into their business processes.

The sentiment is one I encountered years ago. My recollection is that the precursor of Convera explained this approach to me in the 1980s when the angle was presented as Excalibur Technologies.

The words today are as fresh as they were decades ago. The challenge, in my opinion, remains.

I also read “How to Build an Effective Digital Transaction Management Platform.” This article is also eWeek, from the outfit which published “How Lexmark Evolved” piece.

What does this listicle state about Lexmark?

I learned that I need a digital transaction management system. A what? A DTM looks like workflow and information processing. I get it. Digital printing. Instead of paper, a DTM allows a worker to create a Word file or an email. Ah, revolutionary. Then a DTM automates the workflow. I think this is a great idea, but I seem to recall that many companies offer these services. Then I need to integrate my information. There goes the silo even if regulatory or contractual requirements suggest otherwise. Then I can slice and dice documents. My recollection is that firms have been automating document production for a while. Then I can use esignatures which are trustworthy. Okay. Trustworthy. Then I can do customer interaction “anytime, anywhere.” I suppose this is good when one relies on innovative ways to deal with customer questions about printer drivers. And I cannot integrate with “enterprise content management.” Oh, oh. I thought enterprise content management was sort of a persistent, intractable problem. Well, not if I include “process intelligence and visibility.” Er, what about those confidential documents relative to a legal dispute?

The temporal coincidence of a fluffy Lexmark write up and the listicle suggest several things to me:

  1. Lexmark is doing the content marketing that public relations and advertising professionals enjoy selling. I assume that my write up, which you are reading, will be an indication of the effectiveness of this one-two punch.
  2. The financial reports warrant some positive action. I think that closing significant deals and differentiating the Lexmark services from those of OpenText and dozens of other firms would have been higher on the priority list.
  3. Lexmark has made a strategic decision to use the rocket fuel of two ageing Atlas systems (Brainware and ISYS) and one Saturn system (Kofax’s Kapow) to generate billions in new revenue. I am not confident that these systems can get the payload into orbit.

Net net: Lexmark is following a logic path already stomped on by Hewlett Packard and IBM, among others. In today’s economic environment, how many federating, digital business process, content management systems can thrive?

My hunch is that the Lexmark approach may generate revenue. Will that revenue be sufficient to compensate for the decline in printer and ink revenues?

What are Lexmark’s options? Based on these two eWeek write ups, it seems as if marketing is the short term best bet. I am not sure I need another buzzword for well worn concepts. But, hey, I live in rural Kentucky and know zero about the big city views crafted down the road in Lexington, Kentucky.

Stephen E Arnold, September 11, 2015

Advice for Smart SEO Choices

August 11, 2015

We’ve come across a well-penned article about the intersection of language and search engine optimization by The SEO Guy. Self-proclaimed word-aficionado Ben Kemp helps website writers use their words wisely in, “Language, Linguistics, Semantics, & Search.” He begins by discrediting the practice of keyword stuffing, noting that search-ranking algorithms are more sophisticated than some give them credit for. He writes:

“Search engine algorithms assess all the words within the site. These algorithms may be bereft of direct human interpretation but are based on mathematics, knowledge, experience and intelligence. They deliver very accurate relevance analysis. In the context of using related words or variations within your website, it is one good way of reinforcing the primary keyword phrase you wish to rank for, without over-use of exact-match keywords and phrases. By using synonyms, and a range of relevant nouns, verbs and adjectives, you may eliminate excessive repetition and more accurately describe your topic or theme and at the same time, increase the range of word associations your website will rank for.”

Kemp goes on to lament the dumbing down of English-language education around the world, blaming the trend for a dearth of deft wordsmiths online. Besides recommending that his readers open a thesaurus now and then, he also advises them to make sure they spell words correctly, not because algorithms can’t figure out what they meant to say (they can), but because misspelled words look unprofessional. He even supplies a handy list of the most often misspelled words.

The development of more and more refined search algorithms, it seems, presents the opportunity for websites to craft better copy. See the article for more of Kemp’s language, and SEO, guidance.

Cynthia Murrell, August 11, 2015

Sponsored by, publisher of the CyberOSINT monograph


Chrome Restricts Extensions amid Security Threats

June 22, 2015

Despite efforts to maintain an open Internet, malware seems to be pushing online explorers into walled gardens, akin the old AOL setup. The trend is illustrated by a story at PandoDaily, “Security Trumps Ideology as Google Closes Off its Chrome Platform.” Beginning this July, Chrome users will only be able to download extensions for that browser  from the official Chrome Web Store. This change is on the heels of one made in March—apps submitted to Google’s Play Store must now pass a review. Extreme measures to combat an extreme problem with malicious software.

The company tried a middle-ground approach last year, when they imposed the our-store-only policy on all users except those using Chrome’s development build. The makers of malware, though, are adaptable creatures; they found a way to force users into the development channel, then slip in their pernicious extensions. Writer Nathanieo Mott welcomes the changes, given the realities:

“It’s hard to convince people that they should use open platforms that leave them vulnerable to attack. There are good reasons to support those platforms—like limiting the influence tech companies have on the world’s information and avoiding government backdoors—but those pale in comparison to everyday security concerns. Google seems to have realized this. The chaos of openness has been replaced by the order of closed-off systems, not because the company has abandoned its ideals, but because protecting consumers is more important than ideology.”

Better safe than sorry? Perhaps.

Cynthia Murrell, June 22, 2015

Sponsored by, publisher of the CyberOSINT monograph

Free Book from OpenText on Business in the Digital Age

May 27, 2015

This is interesting. OpenText advertises their free, downloadable book in a post titled, “Transform Your Business for a Digital-First World.” Our question is whether OpenText can transform their own business; it seems their financial results have been flat and generally drifting down of late. I suppose this is a do-as-we-say-not-as-we-do situation.

The book may be worth looking into, though, especially since it passes along words of wisdom from leaders within multiple organizations. The description states:

“Digital technology is changing the rules of business with the promise of increased opportunity and innovation. The very nature of business is more fluid, social, global, accelerated, risky, and competitive. By 2020, profitable organizations will use digital channels to discover new customers, enter new markets and tap new streams of revenue. Those that don’t make the shift could fall to the wayside. In Digital: Disrupt or Die, a multi-year blueprint for success in 2020, OpenText CEO Mark Barrenechea and Chairman of the Board Tom Jenkins explore the relationship between products, services and Enterprise Information Management (EIM).”

Launched in 1991, OpenText offers tools for enterprise information management, business process management, and customer experience management. Based in Waterloo, Ontario, the company maintains offices around the world.

Cynthia Murrell, May 27, 2015

Sponsored by, publisher of the CyberOSINT monograph

The Dichotomy of SharePoint Migration

May 7, 2015

SharePoint Online gets good reviews, but only from critics and those who are utilizing SharePoint for the first time. Those who are sitting on huge on-premises installations are dreading the move and biding their time. It is definitely an issue stemming from trying to be all things to all people. Search Content Management covers the issue in their article, “Migrating to SharePoint Online is a Tale of Two Realities.”

The article begins:

“Microsoft is paving the way for a future that is all about cloud computing and mobility, but it may have to drag some SharePoint users there kicking and screaming. SharePoint enables document sharing, editing, version control and other collaboration features by creating a central location in which to share and save files. But SharePoint users aren’t ready — or enthused about — migrating to . . . SharePoint Online. According to a Radicati Group survey, only 23% of respondents have deployed SharePoint Online, compared with 77% that have on-premises SharePoint 2013.”

If you need to keep up with how SharePoint Online may affect your organization’s installation, or the best ways to adapt, keep an eye on Stephen E. Arnold is a longtime leader in search and distills the latest tips, tricks, and news on his dedicated SharePoint feed. SharePoint Online is definitely the future of SharePoint, but it cannot afford to get there at the cost of its past users.

Emily Rae Aldridge, May 7, 2015

Sponsored by, publisher of the CyberOSINT monograph


Visual Data Mapper Quid Raises $39M

April 14, 2015

The article on TechCrunch titled Quid Raises $39M More to Visualize Complex Ideas explains the current direction of Quid. Quid, the business analytics company interested in the work of processing vast amounts of data to build visual maps as well as branding and search, has been developing new paths to funding. The article states,

“When we wrote about the company back in 2010, it was focused on tracking emerging technologies, but it seems to have broadened its scope since then. Quid now says it has signed up 80 clients since launching the current platform at the beginning of last year.The new funding was led by Liberty Interactive Corporation, with participation from ARTIS Ventures, Buchanan Investments, Subtraction Capital, Tiger Partners, Thomas H. Lee Limited Family Partnership II, Quid board member Michael Patsalos-Fox…”

Quid also works with such brands as Hyundai, Samsung and Microsoft, and is considered to be unique in its approach to the big picture of tech trends. The article does not provide much information as to what the money is to be used for, unless it is to do with the changes to the website, which was once called the most pretentious of startup websites for its detailed explanation of its primary and secondary typefaces and array of titular allusions.

Chelsea Kerwin, April 14, 2014

Stephen E Arnold, Publisher of CyberOSINT at

Set Data Free from PDF Tables

April 13, 2015

The PDF file is a wonderful thing. It takes up less space than alternatives, and everyone with a computer should be able to open one. However, it is not so easy to pull data from a table within a PDF document. Now, Computerworld informs us about a “Free Tool to Extract Data from PDFs: Tabula.” Created by journalists with assistance from organizations like Knight-Mozilla OpenNews, the New York Times and La Nación DATA, Tabula plucks data from tables within these files. Reporter Sharon Machlis writes:

“To use, download the software from the project website . It runs locally in your browser and requires a Java Runtime Environment compatible with Java 6 or 7. Import a PDF and then select the area of a table you want to turn into usable data. You’ll have the option of downloading as a comma- or tab-separated file as well as copying it to your clipboard.

“You’ll also be able to look at the data it captures before you save it, which I’d highly recommend. It can be easy to miss a column and especially a row when making a selection.”

See the write-up for a video of Tabula at work on a Windows system. A couple caveats: the tool will not work with scanned images. Also, the creators caution that, as of yet, Tabula  works best with simple table formats. Any developers who wish to get in on the project should navigate to its GitHub page here.

Cynthia Murrell, April 13, 2015

Stephen E Arnold, Publisher of CyberOSINT at

Vilocity 2.0 Released by Nuwave

March 17, 2015

The article on Virtual Strategy Magazine titled NuWave Enhances their Vilocity Analytic Framework with Release of Vilocity 2.0 Update promotes the upgraded framework as a mixture of Oracle Business Intelligence Enterprise Edition and Oracle Endeca Information Discovery. The ability to interface across both of these tools as well as include components from both in a single dashboard makes this a very useful program, with capabilities such as exporting to Microsoft to create slideshows, pre-filter and the ability to choose sections of a page and print across both frameworks. The article explains,

“The voices of our Vilocity customers were vital in the Vilocity 2.0 release and we value their input,” says Rob Castle, NuWave’s Chief Technology Officer… The most notable Vilocity deployment NuWave has done is for the U.S. Army EMDS Program. From deployment and through continuous support NuWave has worked closely with this client to communicate issues and identify tools that could improve Vilocity. The Vilocity 2.0 release is a culmination of NuWave’s desire for their clients to be successful.”

It looks like they have found a way to make Endeca useful. Users of the Vilocity Analytic framework will be able to find answers to the right questions as well as make new discoveries. The consistent look and feel of both systems should aid users in getting used to them, and making the most of their new platform.

Chelsea Kerwin, March 17, 2014

Stephen E Arnold, Publisher of CyberOSINT at

Next Page »