Blockchain Quote to Note: The Value of Big Data as an Efficient Error Reducer
September 6, 2017
I read “Blockchains for Artificial Intelligence: From Decentralized Model Exchanges to Model Audit Trails.” The foundation of the write-up is that blockchain technology can be used to bring more control to data and models. The idea is an interesting one. I spotted a passage tucked into the lower 20 percent of the article, which I judged to be a quote to note. Here’s the passage I highlighted:
as you added more data — not just a bit more data but orders of magnitude more data — and kept the algorithms the same, then the error rates kept going down, by a lot. By the time the datasets were three orders of magnitude larger, error was less than 5%. In many domains, there’s a world of difference between 18% and 5%, because only the latter is good enough for real-world application. Moreover, the best-performing algorithms were the simplest; and the worst algorithm was the fanciest. Boring old perceptrons from the 1950s were beating state-of-the-art techniques.
Bayesian methods date from the 18th century and work well. Despite Laplacian and Markovian bolt-ons, the drift problem bedevils some implementations. The solution? Pump in more training data, and the centuries-old techniques work like a jazzed millennial with a bundle of venture money.
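For readers who want to see the mechanics, here is a minimal sketch of the learning-curve idea (mine, not from the article) using scikit-learn: hold a boring perceptron fixed and feed it training sets spanning three orders of magnitude, then watch the test error move. The synthetic dataset and parameters are placeholders; the 18-percent-to-5-percent drops described in the quote came from web-scale natural language corpora, not toy data like this.

```python
# Learning-curve sketch: fix a simple algorithm (a perceptron) and grow
# the training set by orders of magnitude, tracking held-out error.
# Synthetic data stands in for the large corpora the quote describes.
from sklearn.datasets import make_classification
from sklearn.linear_model import Perceptron
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=110_000, n_features=40,
                           n_informative=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=10_000, random_state=0)

for n in (100, 1_000, 10_000, 100_000):  # three orders of magnitude
    clf = Perceptron(max_iter=1000, tol=1e-3, random_state=0)
    clf.fit(X_train[:n], y_train[:n])          # same algorithm, more data
    error = 1 - accuracy_score(y_test, clf.predict(X_test))
    print(f"n={n:>7,}  test error={error:.3f}")
```

The point of the exercise is not the absolute numbers but the shape of the curve: the algorithm stays the same, only the data volume changes, and the error keeps sliding downward.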
Care to name a large online outfit which may find this an idea worth nudging forward? I don’t think it will be Verizon Oath or Tronc.
Stephen E Arnold, September 6, 2017