Google and the Right to Be Mostly Removed from an Index

September 14, 2018

Yeah, the deletion thing.

I am able to recall some exciting “deletion” events over my 50-year working career. Let me recount one amusing deletion event. The year is 1980 (give or take a year or two). The topic was the Capital Holding IBM mainframe system running the mission-critical IBM CICS (Customer Information Control System). The CICS system and its many components were designed to make it theoretically impossible to delete a record when a high-priority process was running in memory. Yes, gentle reader, in memory with data not yet written to disc. The technically fascinating Capital Holding computer center and its mainframes are no more, and on that day in 1980 neither was the data which, according to the IBM CICS manual, could not be deleted.

Yeah, well.

I did not work at Capital Holding; I worked at the Louisville Courier Journal database unit, and we supported our electronic products on IBM MVS TSO systems at Bell Labs. Close enough for horseshoes, right? I sat in the meeting for an hour and contributed one comment, “Fiddling with live CICS processes by deleting a record is not a good idea. Find a workaround. I have to go.” I left the wizards of the insurance business to sort out the reality of what happens when you poke around in an IBM in-memory process. By the way, you can kill an AS/400 database process and the data with an ill-advised delete.

At Capital Holding, one of the Job Control Language crew managed to issue a command and trash the database and whatever else was in memory at the time.

Yeah, well.

In retrospect, this was a useful reminder to me that one does not remove things from an index. One finds a way to leave the thing in the index and make sure the thing does not show up in a query. To the outsider, the data are gone. To someone who knows how the “gone” was implemented, the data are still in the index, probably on disc somewhere, and maybe on a tape in an Iron Mountain cave too. But “gone” means that the managers and lawyers in carpetland can demonstrate the datum is indeed gone.
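For readers who want to see the "leave it in, hide it from queries" idea in miniature, here is a rough Python sketch. It is my own toy illustration, not how Google or IBM implements anything. The names `TinyIndex`, `soft_delete`, and `tombstones` are made up for the example.

```python
from collections import defaultdict


class TinyIndex:
    """Toy inverted index with soft deletes (hypothetical, for illustration)."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> set of document ids
        self.docs = {}                    # document id -> original text
        self.tombstones = set()           # ids hidden from every query

    def add(self, doc_id, text):
        self.docs[doc_id] = text
        for term in text.lower().split():
            self.postings[term].add(doc_id)

    def soft_delete(self, doc_id):
        # Nothing is removed from the postings or the stored text.
        # The id is only flagged so that queries skip it.
        self.tombstones.add(doc_id)

    def search(self, term):
        hits = self.postings.get(term.lower(), set())
        return sorted(hits - self.tombstones)


index = TinyIndex()
index.add(1, "policy record alpha")
index.add(2, "policy record beta")
index.soft_delete(2)

visible = index.search("policy")               # what the outsider sees
still_stored = 2 in index.postings["policy"]   # what the insider knows
```

To the person running the query, document 2 is gone. To the person who can read `postings` and `docs`, it is sitting right where it was.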

Yeah, well. Like the internal Google video, gone is relative.

I thought of this when I read “Google Digs In Heels Over Global Expansion of EU’s Right to Be Forgotten.” The write up does not explain that stuff in an index may never really go away. I don’t think the EU cares, and I know that users who want information about people who want certain information never to be displayed don’t care how it is done. The goal is to have the information disappear.

Yeah, well.

Google may have some business, political, social, and economic reasons to stop this deletion demand.

From my rural Kentucky redoubt, I wonder if the Google wants to figure out how to delete information from an index without creating more work, more computational costs, and more headaches when the CICS behavior surfaces somewhere in the sensitive plant that is the global Google computing infrastructure. Of course, one can rebuild the indexes and really make the datum disappear, but rebuilds are interesting. Really expensive too when measured in terms of machine time, lost uptime, etc., etc.
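A back-of-the-envelope Python sketch shows why a rebuild is the expensive way to make a datum truly vanish. This is an assumption-laden toy, not a description of Google's pipeline. The function name `rebuild` and the sample documents are invented for the example.

```python
def rebuild(docs, tombstones):
    """Re-index every surviving document from scratch (hypothetical sketch).

    The work is proportional to the whole corpus, not to the number of
    deleted items, which is where the machine time goes.
    """
    survivors = {}
    postings = {}
    for doc_id, text in docs.items():
        if doc_id in tombstones:
            continue  # the deleted datum never enters the new index
        survivors[doc_id] = text
        for term in text.lower().split():
            postings.setdefault(term, set()).add(doc_id)
    return survivors, postings


docs = {
    1: "policy record alpha",
    2: "policy record beta",
    3: "claims record gamma",
}
survivors, postings = rebuild(docs, tombstones={2})
```

Removing one document meant touching all three. Scale that to a distributed index that changes continuously and the "interesting" part becomes clear.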

The write up does a good job of explaining the nontechnical aspects of the issue.

I am sitting here wondering if Google, when forced to delete lots of stuff from its indexes, is concerned about the specific methodology of removing and removing and removing from a dynamic, distributed index.

IBM asserted that its delete function could not operate when a CICS process was chugging along.

Yeah, well.

Stephen E Arnold, September 14, 2018
