LinkedIn: A Pinot for a Flavor Profile with a Narrow Market
June 13, 2015
LinkedIn is the social network for professionals. The company meets the needs of individuals who want to be hired and companies looking for individuals to fill jobs. I use the system to list articles I have written. If you examine some of the functions of LinkedIn, you may discover that sorting is a bit of a disappointment.
LinkedIn has been working hard to find technical solutions to its data management challenges. One of the company’s approaches has been to create software, make it available as open source, and then publicize the contributions.
A recent example is the article “LinkedIn Fills Another SQL-on-Hadoop Niche.” What is interesting is that the write-up does not make clear what LinkedIn does with this software home brew. I learned:
Pinot was designed to provide the company with a way to ingest “billions of events per day” and serve “thousands of queries per second” with low latency and near-real-time results — and provide analytics in a distributed, fault-tolerant fashion.
On the surface, it seems that Hadoop is used as a basket. Then the basket’s contents are filtered using SQL queries. But for me the most interesting information in the write-up is what the system does not do; for example:
- The SQL-like query language used with Pinot does not have the ability to perform table joins
- The data is (sic) strictly read-only
- Pinot is narrow in focus.
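To make those limits concrete, here is a minimal Python sketch of the kind of question that fits inside them: one table, a filter, and a grouped count, with no join and no writes. This is not Pinot code and it does not use Pinot's query language. The event rows, the field names, and the `count_by` helper are all invented for illustration.

```python
from collections import Counter
from types import MappingProxyType

# Hypothetical event rows, invented for illustration; not LinkedIn data.
# tuple + MappingProxyType gives a read-only view, mimicking "strictly read-only".
EVENTS = tuple(
    MappingProxyType(row)
    for row in (
        {"member": "a", "action": "view", "day": 1},
        {"member": "b", "action": "view", "day": 1},
        {"member": "a", "action": "click", "day": 1},
        {"member": "c", "action": "view", "day": 2},
    )
)


def count_by(rows, key, **filters):
    """Roughly: SELECT key, COUNT(*) FROM rows WHERE <filters> GROUP BY key.

    Works on a single collection of rows only; there is no way to join
    a second table, which mirrors the first limitation in the list above.
    """
    matching = (r for r in rows if all(r[f] == v for f, v in filters.items()))
    return dict(Counter(r[key] for r in matching))


# Count "view" events per day from the single events table.
views_per_day = count_by(EVENTS, "day", action="view")
print(views_per_day)
```

Anything that needs a second table, such as matching events to member profiles, falls outside this shape. That is the sense in which a tool with these limits is narrow.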
Has LinkedIn learned that its internal team needs more time and money to make Pinot a mashup with wider appeal? A commercial company going open source is often a signal that the assumptions of the in-house team have collided with management’s willingness to pay for a sustained coding commitment.
Stephen E Arnold, June 13, 2015