FAQ
Thank you for a great tip - reusing the key/value objects after
output.collect.

I have one more question. Is the map output data stored on the local disk of
the instance, or is it written out to HDFS? Specifically, if a single map
outputs more data than fits on its local disk, does the job fail (or can one
assume that the full space available in HDFS can be used)?

Cheers,
Dev

On Wed, Jul 29, 2009 at 10:06 AM, Jason Venner wrote:

In Hadoop 0.18 and beyond, the key and value do not have to implement
Writable.
As a general rule, the key and value objects passed to the map task will be
the same objects on every call, refilled with a fresh value by the record
reader.
The output.collect method will serialize the key and value during the call
(unless you are using ChainMapper from 0.19+), so you are free to reset the
contents of the key and value objects passed to output.collect after the
call.
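
To make the "same objects" point concrete, here is a minimal sketch in plain
Java. It does not use the Hadoop API: MutableText and ReusingReader are
hypothetical stand-ins for a mutable value type and a record reader that
refills one instance per record. It only illustrates why holding on to the
reference is unsafe while copying the contents is fine.

```java
import java.util.ArrayList;
import java.util.List;

public class ReusedRecordDemo {
    // Hypothetical stand-in for a mutable value type (not a Hadoop class).
    static final class MutableText {
        private String content = "";
        void set(String s) { content = s; }
        @Override public String toString() { return content; }
    }

    // Hypothetical reader that refills the caller's instance for each record.
    static final class ReusingReader {
        private final String[] records;
        private int pos = 0;
        ReusingReader(String... records) { this.records = records; }
        boolean next(MutableText value) {
            if (pos >= records.length) return false;
            value.set(records[pos++]);
            return true;
        }
    }

    // Holds on to the reference: every list element aliases the same object.
    static List<String> keepReferences(String... records) {
        ReusingReader reader = new ReusingReader(records);
        MutableText value = new MutableText();
        List<MutableText> held = new ArrayList<>();
        while (reader.next(value)) {
            held.add(value);
        }
        List<String> out = new ArrayList<>();
        for (MutableText t : held) {
            out.add(t.toString());
        }
        return out;
    }

    // Copies the contents before the reader overwrites them.
    static List<String> copyContents(String... records) {
        ReusingReader reader = new ReusingReader(records);
        MutableText value = new MutableText();
        List<String> out = new ArrayList<>();
        while (reader.next(value)) {
            out.add(value.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(keepReferences("a", "b", "c")); // [c, c, c]
        System.out.println(copyContents("a", "b", "c"));   // [a, b, c]
    }
}
```

If a map or reduce method needs a record beyond the current call, the sketch
suggests copying its contents rather than storing the object itself.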

It is a common practice to have a class field holding an instance of the
output key or value type, which is reused for each transformation, instead
of allocating a new key or value instance in each call to map or reduce.
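
The reuse pattern described above can be sketched roughly as follows. This
is plain Java rather than the Hadoop API: MutableText, SerializingCollector
and WordMapper are hypothetical names, and the collector is assumed to
serialize the key to bytes inside collect, which is the behaviour this thread
attributes to output.collect.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class ReusedOutputDemo {
    // Hypothetical mutable key type (not a Hadoop class).
    static final class MutableText {
        private String content = "";
        void set(String s) { content = s; }
        void write(DataOutputStream out) throws IOException { out.writeUTF(content); }
    }

    // Hypothetical collector: assumed to serialize the key during collect().
    static final class SerializingCollector {
        private final List<byte[]> buffers = new ArrayList<>();

        void collect(MutableText key) {
            try {
                ByteArrayOutputStream bytes = new ByteArrayOutputStream();
                DataOutputStream out = new DataOutputStream(bytes);
                key.write(out);
                out.flush();
                buffers.add(bytes.toByteArray());
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }

        // Decodes the stored bytes back into strings for inspection.
        List<String> decoded() {
            List<String> result = new ArrayList<>();
            try {
                for (byte[] b : buffers) {
                    result.add(new DataInputStream(new ByteArrayInputStream(b)).readUTF());
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            return result;
        }
    }

    // Hypothetical mapper holding one output key as a field and reusing it.
    static final class WordMapper {
        private final MutableText outKey = new MutableText();

        void map(String line, SerializingCollector output) {
            for (String word : line.trim().split("\\s+")) {
                if (word.isEmpty()) continue;
                outKey.set(word);        // overwrite the shared instance
                output.collect(outKey);  // bytes are captured before the next set()
            }
        }
    }

    static List<String> run(String... lines) {
        SerializingCollector output = new SerializingCollector();
        WordMapper mapper = new WordMapper();
        for (String line : lines) {
            mapper.map(line, output);
        }
        return output.decoded();
    }

    public static void main(String[] args) {
        System.out.println(run("to be", "or not")); // [to, be, or, not]
    }
}
```

Because the bytes are captured inside collect, overwriting outKey on the next
iteration does not disturb earlier output. A collector that kept the
reference instead would not give the same guarantee.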
On Tue, Jul 28, 2009 at 11:29 AM, Devajyoti Sarkar wrote:

Thanks.

Dev
On Wed, Jul 29, 2009 at 2:27 AM, Todd Lipcon wrote:

On Tue, Jul 28, 2009 at 11:24 AM, Devajyoti Sarkar <dsarkar@q-kk.com> wrote:

Hi,

In the Hadoop documentation it says that all key-value classes need to
implement Writable to allow serialization and de-serialization of outputs
between mappers and reducers. Is this also necessary for key/value pairs
sent between the RecordReader and the Mapper (as well as the Reducer and the
RecordWriter)? I assume that in each of these two cases, the classes are
instantiated in the same VM. So is it safe to assume that key/value pairs
are sent by reference instead of serialization/deserialization? If so, my
specific application may get a performance boost. Please do let me know if
this is so.

Yes, this is correct. The values that come out of RecordReaders and go into
RecordWriters do not need to implement Writable.

-Todd


--
Pro Hadoop, a book to guide you from beginner to hadoop mastery,
http://www.amazon.com/dp/1430219424?tag=jewlerymall
www.prohadoopbook.com a community for Hadoop Professionals
