MapReduce: Google’s Database Probe Launched
August 26, 2008
Update 2, August 29, 2008, 1:50 pm Eastern
There’s an interesting and possibly relevant story on CNet here. Matt Asay wrote “Google’s Weird Ways with Open Source Licenses,” which became available on August 29, 2008. The core of the story is in the title. Open source licenses appear to be handled in a Googley way; that is, Google’s way. I sure don’t want to suggest that MapReduce as used by Aster Data and Greenplum is in any way affected by these “weird ways”. I do want to point you to this article and quote one passage that was of interest to me:
As for the MPL, while DiBona doesn’t state it outright, I suspect that Google’s decision to re-up its commitment to Mozilla for three more years probably involved some strained discussions about Google’s weird decision to dump the MPL, one of the industry’s most popular open-source licenses. Regardless, all is well that ends well. Google came to the right decision, however odd the logic.
You can read the Steve Shankland article, which touches upon the great MapReduce technology, here. For something as simple as making code available as open source, there’s a lot of huffing and puffing. I’m watching for signs of smoke now. Wizards, pundits, and Googley types are welcome to add links, correct either of these authors, or opine with limited data via the comments on this addled goose’s Web log. What’s next for open source? The programmable search engine technology. That would be useful here in the hills of Kentucky.
Update 1, August 29, 2008, around 11 am Eastern
My comment about MapReduce triggered some keyboarding by various wizards. Thanks for the inputs. The point of the flurry is that MapReduce doesn’t have anything to do with Google. MapReduce is “in the wild” and anyone can make use of it. Nevertheless, I remain keenly interested in this technology for several reasons:
- MapReduce was the subject of a lecture given at the University of Washington several years ago by Jeffrey Dean and then written up as a paper. You can snag a copy here.
- Google has been careful about the scope of its enterprise ambitions with regard to data management, databases, and data analysis. The company has been sufficiently circumspect as to make the key players in the database and data management market confident that Google’s enterprise ambitions are focused on search, maps, and lightweight cloud applications. Forget the dashboard I wrote about. It’s lightweight too.
- Aster Data is a company that came on my radar because of its “Googley nature”. I have picked up some suggestive comments about the robustness of the Aster Data technology and I learned from Aster Data that it is not interested in search. I believe that statement but I watch this space for interesting developments.
MapReduce, open source or any other variety, intrigues me. Based on my observation of things Google from my remote hideaway in Harrod’s Creek, Kentucky, my hunch is that Google has a tiny bit of interest in how Aster Data and Greenplum use MapReduce, how their customers respond, and what interest the technology generates. In my lingo, Google learns from its environment. That’s why I subtitled my Google Version 2.0 study “the calculating predator”. Watching, learning, waiting: could this be part of the MapReduce or broader Google goodness? I will let you know what I snag in my crawler.
Original Post Below
I wrote about Aster Data several weeks ago. If you are not familiar with the company, you may want to look at my article or navigate to the Aster Data Web site and get up to speed. It is an important company and is in the process of becoming more important.
InfoWorld’s “Database Vendors Add Google’s MapReduce” here reports that Google has cut a deal with Aster Data and Greenplum for Google’s nifty method of combining two separate functions into one instruction, reducing the “time” and computational cycles required to perform a task essential to chopping results from a larger data set. MapReduce is useful for certain operations with petascale data.
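For readers who want to see the shape of the idea, here is a minimal, single-machine sketch of the map-then-reduce pattern, using word counting as the example. It is an illustration only. It is not Google's code, and it is not how Aster Data or Greenplum implement the feature. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, and a real system would spread the work across many machines rather than run it in one loop.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group the emitted values by key so each key is reduced once."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse all values for one key into a single total."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a list of documents."""
    pairs = []
    for document in documents:
        pairs.extend(map_phase(document))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["Google learns", "google waits"]))
```

The appeal for large data sets is that each document can be mapped independently and each key can be reduced independently, so both steps can in principle be farmed out to many machines.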
Has Google entered the enterprise data management market? Not yet. Like Google’s interaction with Salesforce.com, Google is in “learn” mode. MapReduce by itself is not a complete data solution, but it provides some horsepower to Aster Data and Greenplum.
Will Google challenge IBM, Microsoft, and Oracle among others in the DBMS market? Google will watch and learn. Google has some serious data management capabilities in development. MapReduce is a golden oldie at Google.
When Google figures out what it wants to do to cash in on the pain many companies experience when using traditional database management systems, the Google will leapfrog what’s available. For now, Google is no threat to DBMS vendors. In the future, who knows? Probably not even Google, until it gets enough hard data to justify a decision one way or the other.
Stephen Arnold, August 26, 2008