Valuable Primer on Data Logs

January 24, 2014

Who knew LinkedIn could be so useful? The site’s Engineering blog supplies a thorough look at logs in “The Log: What Every Software Engineer Should Know About Real-Time Data’s Unifying Abstraction.” Writer and LinkedIn Engineer Jay Kreps aims to fill what he sees as a large gap in the education of most software engineers. The site’s transition last year from a centralized database to a distributed, Hadoop-based system opened his eyes.

Kreps writes:

“One of the most useful things I learned in all this was that many of the things we were building had a very simple concept at their heart: the log. Sometimes called write-ahead logs or commit logs or transaction logs, logs have been around almost as long as computers and are at the heart of many distributed data systems and real-time application architectures. You can’t fully understand databases, NoSQL stores, key value stores, replication, paxos, hadoop, version control, or almost any software system without understanding logs; and yet, most software engineers are not familiar with them. I’d like to change that. In this post, I’ll walk you through everything you need to know about logs, including what is log and how to use logs for data integration, real time processing, and system building.”

He isn’t kidding. The extensive article is really a mini-course that any programmer who hasn’t already mastered logs should look into. Part one, titled “What is a log?”, covers logs in general as well as their place in both databases and distributed systems. Part two discusses data integration, including potential complications, the relationship to a data warehouse, log files, and building a scalable log. Part three turns to real-time stream processing, including data flow graphs, stateful real-time processing, and log compaction. Part four covers system building, delving into the prospect of unbundling and where logs fit into system architecture. At the end, Kreps supplies an extensive list of resources for further study.
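For readers who want a feel for the idea before diving into the article, here is a minimal sketch of the kind of log discussed above: an append-only, ordered sequence of records, each identified by its position. This is an illustration only, not code from Kreps’s article; the names `AppendOnlyLog`, `read_from`, and `replay` are hypothetical, and the log lives in memory rather than on disk.

```python
from dataclasses import dataclass, field
from typing import Any, List, Tuple


@dataclass
class AppendOnlyLog:
    """Hypothetical in-memory log: records are appended, never modified."""
    _records: List[Any] = field(default_factory=list)

    def append(self, record: Any) -> int:
        """Add a record at the end and return its offset (position in the log)."""
        self._records.append(record)
        return len(self._records) - 1

    def read_from(self, offset: int) -> List[Tuple[int, Any]]:
        """Return (offset, record) pairs from the given offset onward, in order."""
        return list(enumerate(self._records[offset:], start=offset))


def replay(log: AppendOnlyLog) -> dict:
    """Rebuild key-value state by applying every logged change in order."""
    state = {}
    for _, (key, value) in log.read_from(0):
        state[key] = value  # later entries overwrite earlier ones
    return state


if __name__ == "__main__":
    log = AppendOnlyLog()
    log.append(("user:1", "alice"))
    log.append(("user:2", "bob"))
    log.append(("user:1", "alicia"))
    print(replay(log))
```

Because every reader that replays the same log in the same order arrives at the same state, a structure like this can serve as a basis for replication and data integration, which is the ground the article covers in depth.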

Cynthia Murrell, January 24, 2014

Sponsored by ArnoldIT.com, developer of Augmentext
