Is there any way to run a particular job in Hadoop on a subset of the nodes?
My problem is that I don't want to use all the nodes to run some job.
I am trying to make a job completion time vs. number of nodes graph for a particular job.
One way to do this is to remove datanodes and then see how much time the job takes.
Just out of curiosity, I want to know whether there is any other way to do
this without removing datanodes.
I am afraid that if I remove datanodes, I could lose some data blocks that reside
on those machines, as I have some files with replication = 1.
Grokbase › Groups › Hadoop › common-user › September 2011