One more thing about your original numbers.

Are they repeatable?

- milind

----- Original Message -----
From: Owen O'Malley <oom@yahoo-inc.com>
To: hadoop-user@lucene.apache.org <hadoop-user@lucene.apache.org>
Sent: Thu Nov 08 19:10:30 2007
Subject: Re: sort speeds under java, c++, and streaming

On Nov 8, 2007, at 5:14 PM, Milind A Bhandarkar wrote:

> Does pipes deserialize and serialize data for the identity
> mappers, or does it just pass it through? (Streaming converts input to
> text, afaik.)

Pipes serializes the objects to bytes and sends them to the C++
program. The C++ program gets them as C++ strings, which are
effectively byte arrays. Pipes does not do the conversion to Java
strings that streaming does. Therefore, pipes can support arbitrary
Writable objects. Hopefully in the future, we can change the
map/reduce API to provide access to the raw bytes in the mapper and
reducer as an option. In that case, pipes would not need to serialize
at all.

-- Owen
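
To make the "C++ strings are effectively byte arrays" point concrete, here is a minimal sketch. It is not the Pipes API: encodeInt32, decodeInt32, and identityMap are hypothetical helpers invented for illustration. The only outside assumption is that a 32-bit integer is serialized as 4 big-endian bytes, which is what Java's DataOutput.writeInt produces.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical helper (not part of Pipes): pack a 32-bit value into
// 4 big-endian bytes, the layout Java's DataOutput.writeInt uses.
std::string encodeInt32(std::int32_t value) {
    std::uint32_t bits = 0;
    std::memcpy(&bits, &value, sizeof bits);
    std::string out;
    for (int shift = 24; shift >= 0; shift -= 8) {
        out.push_back(static_cast<char>((bits >> shift) & 0xFF));
    }
    return out;
}

// Hypothetical inverse of encodeInt32; requires at least 4 bytes.
std::int32_t decodeInt32(const std::string& bytes) {
    assert(bytes.size() >= 4);
    std::uint32_t bits = 0;
    for (int i = 0; i < 4; ++i) {
        bits = (bits << 8) | static_cast<unsigned char>(bytes[i]);
    }
    std::int32_t value = 0;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}

// Identity map step: the value is forwarded as opaque bytes and is
// never interpreted as text.
std::string identityMap(const std::string& value) {
    return value;
}
```

A std::string tracks its length explicitly, so the zero bytes in an encoded integer survive the round trip: decodeInt32(identityMap(encodeInt32(n))) gives back n for any 32-bit n. Passing the same bytes through a text conversion offers no such guarantee, since arbitrary bytes need not form valid character data.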
