A Xoogler Explains Why Big Data Is Going Nowhere Fast

March 3, 2023

I read the essay “Big Data Is Dead.” One of my essays from the Stone Age of Online used the title “Search Is Dead,” so I am familiar with the trope. In a few words, one can surprise. Dead. Final. Absolute. Well, maybe. On the other hand, the subjects, whether Big Data or Search, are part of the woodwork in the mini-camper of life.

I found this statement interesting:

Modern cloud data platforms all separate storage and compute, which means that customers are not tied to a single form factor. This, more than scale out, is likely the single most important change in data architectures in the last 20 years.

The cloud is the future. I recall seeing price analyses of some companies’ cloud activities; for example, “The Cloud vs. On-Premise Cost: Which One is Cheaper?” In my experience, cloud computing was pitched as better, faster, and cheaper. Toss in the idea that one can get rid of pesky full-time systems personnel, and the cloud is a win.

What the cloud means is exactly what the quoted sentence says: “customers are not tied to a single form factor.” Does this mean that the Big Data rah-rah, combined with the sales pitch for moving to the cloud, will set the stage for more hybrid setups or a return to on-premises computing? Storage could become a combination of on-premises and cloud-based solutions. The driver, in my opinion, will be cost. And one thing the essay about Big Data does not dwell on is the importance of cost in the present economic environment.

The arguments for small data or subsets of Big Data are accurate. My reading of the essay is that some data will become a problem: privacy, security, legal, political, whatever. The essay reads as an argument for “synthetic data.” Google and others want to make statistically valid, fake data the gold standard for certain types of processes. In the “data are a liability” section of the essay, I noted:

Data can suffer from the same type of problem; that is, people forget the precise meaning of specialized fields, or data problems from the past may have faded from memory.

I wonder if this is a murky recasting of Google’s indifference to “old” data and to date and time stamping. The here and now, not the then and past, lies under the surface of the essay. I am surprised the notion of “forward forward” analysis did not warrant a mention. Outfits like Google want to do look-ahead prediction in order to deal with inputs newer than what is in the previous set of values.

You may read the essay and come away with a different interpretation. For me, this is the type of analysis characteristic of a Googler, not a Xoogler. If I am correct, why doesn’t the essay hit the big ideas about cost and synthetic data directly?

Stephen E Arnold, March 3, 2023

