Hi Owen,

Can you provide more details of your test? In particular, what was the Java
Map-reduce program that you ran? Was it
src/examples/org/apache/hadoop/examples/Sort.java? Also, I can't find
anything called "RandomTextWriter" in the source tarball. Can you point me
to it? Thanks.

- Doug
On Nov 8, 2007 5:03 PM, Owen O'Malley wrote:

I set up a little benchmark on a 39-node cluster to sort 40 GB of
random text data (generated by RandomTextWriter using key length:
1-10 words and value length: 0-200 words, data uncompressed). The
runtimes in minutes are:

Java: 4:22
C++ (Pipes): 3:50
Streaming: 4:44

I was surprised to find that Pipes outperformed Java, even with the
extra process. I suspect it was because of the buffering between the
input and output of Pipes.

-- Owen

Discussion Overview
group: common-user @
categories: hadoop
posted: Nov 9, '07 at 1:03a
active: Nov 9, '07 at 8:15a
posts: 14
users: 5
website: hadoop.apache.org...
irc: #hadoop
