[MapReduce-user] is HDFS RAID "data locality" efficient?
Aug 9, 2012 at 10:56 am
Thanks a lot, everybody, for your replies.
Your ideas about using different storage policies for cold and hot data are
very interesting.
: OK, so under Apache Hadoop, how do you specify when and where a directory will be created on HDFS? For example, if I want to create a /coldData directory in HDFS to store my older data sets, how does that get assigned specifically to a RAIDed HDFS (or even to specific machines)? I know I can do this in MapR's distribution, but I am not aware of this feature being available in the Apache-based releases. Is it part of the latest feature set? Thx -Mike
Aug 8, '12 at 4:46p
Aug 9, '12 at 10:56a
14 users in discussion
Sourygna Luangsay (2)
Michael Segel (2)
Vinicius Melo (1)
Gabriel Armelin (1)
Mayuran Yogarajah (1)
Steve Loughran (1)
Gaurav Sharma (1)
Avram Aelony (1)
D'Souza, Clive V (1)
Ajit Ratnaparkhi (1)
Miles Trebilco (1)
Arun Prakash (1)