[MapReduce-user] is HDFS RAID "data locality" efficient?
Aug 8, 2012 at 5:44 pm
On Wed, Aug 8, 2012 at 2:39 PM, Arun Prakash wrote:
Best Regards,
Arun Prakash C.K
I tend to think the only way this is 'really' going to be fixed is to send one email to all users explaining what happened with the consolidation, with instructions on how to subscribe to the new list, then nuke the whole list and start anew. It might be an extreme response, but I think the number of failed attempts shows that extreme is what's actually needed.
-- Even the Magic 8 ball has an opinion on email clients: Outlook not so good.
Agreed with Steve. That is the most important use of HDFS RAID: you consume less disk space with the same reliability and availability guarantees, at the cost of processing performance. Most of the data in HDFS is cold data. Without HDFS RAID you end up maintaining 3 replicas of data that is hardly ever going to be processed again, but you can't remove this data or move it to a separate archive, because if it is needed, processing should be possible as soon as possible.
-Ajit
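To put rough numbers on the disk space point above, here is a small back-of-the-envelope sketch. The layouts are assumptions that are not stated in this thread: an XOR scheme with 10 data blocks plus 1 parity block per stripe, both kept at replication 2, and a Reed-Solomon scheme with 10 data blocks plus 4 parity blocks at replication 1. The function name `storage_overhead` and the 100 TB figure are purely illustrative.

```python
def storage_overhead(data_blocks, parity_blocks, data_replication, parity_replication):
    """Return raw bytes stored per logical byte for one stripe layout."""
    stored = data_blocks * data_replication + parity_blocks * parity_replication
    return stored / data_blocks


# Plain HDFS: no parity, every block replicated 3 times.
plain = storage_overhead(data_blocks=1, parity_blocks=0,
                         data_replication=3, parity_replication=0)

# Assumed XOR layout: 10 data + 1 parity block, everything at replication 2.
xor = storage_overhead(data_blocks=10, parity_blocks=1,
                       data_replication=2, parity_replication=2)

# Assumed Reed-Solomon layout: 10 data + 4 parity blocks, replication 1.
rs = storage_overhead(data_blocks=10, parity_blocks=4,
                      data_replication=1, parity_replication=1)

cold_data_tb = 100  # hypothetical amount of cold data, in TB
for name, factor in [("3x replication", plain), ("XOR", xor), ("Reed-Solomon", rs)]:
    print(f"{name}: {factor:.1f}x overhead, {cold_data_tb * factor:.0f} TB raw")
```

Under those assumptions the raw footprint drops from 3.0x to 2.2x or 1.4x. The price is the one already mentioned: fewer replicas means fewer nodes where a map task can read a block locally, and a lost block has to be rebuilt from the rest of its stripe.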