I've just tested MapReduce in C++ against the Java version.
I ran the WordCount example included in the 0.14.1 release on a 1-node Hadoop
cluster (Pentium D with 2GB of RAM).
There were 2 input files (one 4.5MB file + one 36MB file).
I also took the Combiner out of the Java version of WordCount, as no
Combiner was used in the C++ version.
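
For anyone unsure why the Combiner matters for a fair comparison, here is a minimal, language-neutral sketch in Python of what a combiner does for WordCount. It is not Hadoop code: map_words and combine are hypothetical helper names made up for this illustration. The point is only that a combiner sums counts locally, so fewer pairs reach the reduce phase.

```python
from collections import Counter


def map_words(line):
    """Emit one (word, 1) pair per word, as a WordCount mapper would."""
    return [(word, 1) for word in line.split()]


def combine(pairs):
    """Sum the counts per word locally, before anything goes to the reducer."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return sorted(totals.items())


line = "to be or not to be"
raw = map_words(line)      # what the reducer receives without a combiner
combined = combine(raw)    # what the reducer receives with a combiner

print(len(raw), "pairs without a combiner")
print(len(combined), "pairs with a combiner")
```

With the combiner removed from both sides, each version has to shuffle and reduce every raw (word, 1) pair.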
The result: as many of you may have guessed, the Java version won the race big
time. It was about 4 times faster overall.
Here are the more detailed results.
                                Java Version   C++ Version   Ratio (Java : C++)
Total Time Taken                89             364           1 : 4
Longest Time Taken for Map      41             83            1 : 2
Longest Time Taken for Reduce   58             264           1 : 4.5
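
As a quick sanity check on the ratios, the slowdown factors can be recomputed from the raw timings. This is only a sketch: the dictionary keys and the slowdown helper are names made up for illustration, the first number of each pair is taken to be the Java timing (which is what the Java : C++ ratios imply), and the time unit is whatever the figures above were measured in.

```python
# (Java, C++) timings copied from the results above.
timings = {
    "total": (89, 364),
    "longest map": (41, 83),
    "longest reduce": (58, 264),
}


def slowdown(java_time, cpp_time):
    """How many times longer the C++ run took than the Java run."""
    return cpp_time / java_time


factors = {name: round(slowdown(j, c), 1) for name, (j, c) in timings.items()}

for name, factor in factors.items():
    print(f"{name}: C++ took about {factor}x as long as Java")
```

The recomputed factors are close to the rounded ratios quoted above, and they show that the reduce phase accounts for most of the gap.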
Any guesses or ideas on how to improve the performance of C++ MapReduce?
Taeho Kang [tkang.blogspot.com]