Some Google engineers recently published a paper about a Java library that helps you develop and optimize chains of MapReduce jobs. It is called FlumeJava.
The library is very interesting. To my mind, it clearly simplifies MapReduce development. Instead of hand-writing your jobs and chaining them manually, you define your computations in plain Java on top of a few immutable collection classes, and the library takes responsibility for finding the best execution plan.
When you develop with standard MapReduce jobs, your code ends up split into several pieces. That makes it fragmented and less clear. FlumeJava lets you keep your business logic together in one place.
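To make the idea more concrete, here is a minimal sketch of how an immutable, deferred collection can record operations and only run them later as one combined pass. This is not the FlumeJava API: the class LazyCollection and its methods are hypothetical names I made up for illustration (only the name parallelDo is borrowed from the paper), and it runs in memory rather than on a cluster.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

// Hypothetical illustration only: NOT the FlumeJava API.
class DeferredPipelineSketch {

    /** Immutable collection that records operations instead of running them. */
    static final class LazyCollection<S, T> {
        private final List<S> source;               // original input items
        private final Function<S, List<T>> fused;   // every recorded step, composed into one
        private final int steps;                    // number of recorded operations

        private LazyCollection(List<S> source, Function<S, List<T>> fused, int steps) {
            this.source = source;
            this.fused = fused;
            this.steps = steps;
        }

        /** Wraps input data; no work happens here. */
        static <S> LazyCollection<S, S> of(List<S> items) {
            return new LazyCollection<S, S>(
                new ArrayList<S>(items), item -> Collections.singletonList(item), 0);
        }

        /** Records one more step and returns a NEW collection; this one is unchanged. */
        <R> LazyCollection<S, R> parallelDo(Function<T, List<R>> fn) {
            Function<S, List<R>> next = item -> {
                List<R> out = new ArrayList<>();
                for (T mid : fused.apply(item)) {
                    out.addAll(fn.apply(mid));
                }
                return out;
            };
            return new LazyCollection<S, R>(source, next, steps + 1);
        }

        int plannedSteps() {
            return steps;
        }

        /** Executes the whole chain in a single pass over the input. */
        List<T> run() {
            List<T> out = new ArrayList<>();
            for (S item : source) {
                out.addAll(fused.apply(item));
            }
            return out;
        }
    }

    /** The whole pipeline reads top to bottom, in one place. */
    static List<String> example() {
        LazyCollection<String, String> lines =
            LazyCollection.of(Arrays.asList("to be", "or not"));
        LazyCollection<String, String> words =
            lines.parallelDo(line -> Arrays.asList(line.split(" ")));
        LazyCollection<String, String> upper =
            words.parallelDo(word -> Collections.singletonList(word.toUpperCase()));
        return upper.run(); // nothing is computed before this call
    }

    public static void main(String[] args) {
        System.out.println(example());
    }
}
```

The two parallelDo calls are executed as a single loop over the input, which is a toy version of the kind of fusion the paper describes. A real implementation would build an execution graph and translate it into as few MapReduce jobs as it can.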
Well, let’s see if those amazing Hadoop guys implement something similar for the community.
UPDATE (2010-11-12): The amazing Hadoop guys have started to move. Ted Dunning has created Plume, the "Hadoop FlumeJava". My friend Pere Ferrera is also collaborating on Plume's development.
2 comments:
Check out Plume, an open source clone of FlumeJava. See http://github.com/tdunning/Plume
Thanks Ted!
I already knew about the project because my friend Pere Ferrera is collaborating on Plume's development, but I'm going to update the post.
Regards!