Hello everyone,

Has anyone tried running map-reduce across multiple data centers? The use
case I have in mind is one where the dataset is geographically distributed
across multiple data centers and it may not be cost effective to move the
data to a single site (e.g. due to limited network bandwidth between sites).
How is such a scenario handled today?

As I understand it, there is a feature request filed against HDFS to allow
it to be distributed across data centers (e.g. for disaster recovery). For
details, please refer to the following link:
https://issues.apache.org/jira/browse/HDFS-1432

Can anyone share thoughts on the pros and cons of this approach?

Thanks
Hrishikesh

  • Deepika Khera at Nov 3, 2010 at 1:05 am
    I am using hadoop v0.20.2.

    Is there any configuration I could set so that the job tracker cleans up the job configuration XML files in the hadoop logs directory? Or do we need to delete them manually every time?

    Deepika

Discussion Overview
group: mapreduce-user @
categories: hadoop
posted: Nov 2, '10 at 11:26p
active: Nov 3, '10 at 1:05a
posts: 2
users: 2
website: hadoop.apache.org...
irc: #hadoop

2 users in discussion

Deepika Khera: 1 post Hrishikesh Gadre: 1 post
