Hi,

When Hadoop runs on a cluster, the output of the reducers is
saved in HDFS. Is MapReduce also location-aware about where that
data is saved?

For example, suppose TaskTracker TT1 is running on Machine1 and TT2 is
running on Machine2, and the HDFS replication factor is 3. Reduce task RT1
is running on TT1. When the reducer saves its output to HDFS, do 2 replicas
of the output go to TT1 and the third one to TT2? Is this what
happens?

Thanks,

--
Pedro

  • Mahadev Konar at Feb 8, 2011 at 10:10 pm
    Hi Pedro,
    You can read about the HDFS replica placement policy at:

    http://hadoop.apache.org/common/docs/r0.20.2/hdfs_design.html

    thanks
    mahadev
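
To make the scenario in the question concrete, here is a toy simulation of replica placement. It is not Hadoop code: the function `choose_replica_targets` and the node-to-rack dictionary are hypothetical names made up for illustration. It assumes the default policy as commonly described for this Hadoop generation (first replica on the writer's own DataNode, second on a node in a different rack, third on another node in the same rack as the second) and the rule that one DataNode never holds two replicas of the same block. The linked design document is the authority on the exact wording.

```python
import random


def choose_replica_targets(writer, datanodes, replication, rng=None):
    """Toy model of replica placement (hypothetical helper, not a Hadoop API).

    datanodes maps a node name to its rack name; writer is the node the
    client (here, the reduce task) runs on. Returns the chosen nodes.
    """
    rng = rng or random.Random(0)
    chosen = []

    def pick(candidates):
        # A node can hold at most one replica of a block.
        free = sorted(n for n in candidates if n not in chosen)
        if free:
            chosen.append(rng.choice(free))
            return True
        return False

    # 1st replica: the writer's own node when it is also a DataNode.
    if replication >= 1:
        if writer in datanodes:
            chosen.append(writer)
        else:
            pick(datanodes)

    # 2nd replica: prefer a different rack, else any remaining node.
    if replication >= 2 and chosen:
        first_rack = datanodes[chosen[0]]
        if not pick(n for n in datanodes if datanodes[n] != first_rack):
            pick(datanodes)

    # 3rd replica: prefer the same rack as the 2nd, else any remaining node.
    if replication >= 3 and len(chosen) == 2:
        second_rack = datanodes[chosen[1]]
        if not pick(n for n in datanodes if datanodes[n] == second_rack):
            pick(datanodes)

    # Any further replicas: any remaining node.
    while len(chosen) < replication and pick(datanodes):
        pass
    return chosen


# The scenario from the question: two machines, replication factor 3.
cluster = {"Machine1": "/default-rack", "Machine2": "/default-rack"}
print(choose_replica_targets("Machine1", cluster, 3))
# ['Machine1', 'Machine2']
```

Under these assumptions, the two-machine cluster ends up with one replica on each machine, not two on Machine1. The third replica has nowhere to go, so the block would stay under-replicated until another DataNode joins.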

Discussion Overview
group: mapreduce-user @
categories: hadoop
posted: Feb 4, '11 at 3:07p
active: Feb 8, '11 at 10:10p
posts: 2
users: 2
website: hadoop.apache.org...
irc: #hadoop
